Need to get persistent storage working!
Been experimenting with CPU loads during GPU72_TF runs.
Need to get scalable persistent storage working! Just look at all those hours of a hungry CPU going without! :sad: (:wink:) |
A note for those running the mprime script
If you guys are running the mprime script with a custom worktodo.txt with small exponents (say, n < 100000) for running ECM, the browser will lag quite badly when in the tab running the exponent. To improve this, when doing the setup of the file prime.txt, echo the following into prime.txt:
[CODE]OutputIterations=n[/CODE]where n is a number larger than 10000, which is the default. This is not necessary at the current wavefront of ECM, or for any other jobs (except possibly the small Fermat numbers or small PRP checks for new Mersenne cofactors). |
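For example, assuming prime.txt sits in the mprime working directory, a line like this (run during setup, before starting mprime) raises the reporting interval; the value 100000 is just an illustrative choice:

```python
# Append a larger OutputIterations to mprime's prime.txt so progress
# lines are printed less often (100000 is only an example value;
# adjust the path to wherever your mprime working directory is).
with open("prime.txt", "a") as f:
    f.write("OutputIterations=100000\n")
```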
Inter-section race condition...
I just thought I'd share a potential thing to consider when writing Notebooks with multiple Sections, intended to be quickly run in order.
As part of my SSHd payload, the very end of the Perl script launches apt-get in the background ("$Cmd &") to bring in useful tools for the console (vi, emacs, mtr, nmon, etc.). As I'm an advocate of "eat your own dog food", for each instance I get I've been running my Tunnel Section, and then immediately after, my GPU72_TF Section. I've been *sometimes* noticing "weirdness" over the last day or so when launching these quickly. It turns out my second Section runs apt itself, but fails ungracefully when it's told another apt job is already running. Somewhat amusingly, this started happening when I added emacs to the list of packages to install. I don't use it myself, but others do. The problem is it's huge: it widened the temporal window for the race on apt's lock considerably. Something to consider if you're doing any concurrent work in these things. |
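One way to sidestep the race is to have the later Section wait for the dpkg lock to clear before invoking apt itself. A minimal sketch, assuming fuser is available on the image (the function names here are mine, not from the actual Sections):

```python
import subprocess
import time

def apt_locked():
    # fuser exits 0 when some process still holds the dpkg lock
    return subprocess.run(
        ["fuser", "/var/lib/dpkg/lock-frontend"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ).returncode == 0

def wait_for_lock(locked, poll=5, timeout=600):
    """Poll the locked() predicate until it reports free, or time out."""
    waited = 0
    while locked():
        if waited >= timeout:
            raise TimeoutError("apt lock still held after %d s" % timeout)
        time.sleep(poll)
        waited += poll
    return waited

# Usage in the second Section, before its own apt work:
#   wait_for_lock(apt_locked)
#   subprocess.run(["apt-get", "install", "-y", "mtr", "nmon"], check=True)
```

Newer apt versions also accept `-o DPkg::Lock::Timeout=seconds`, which waits on the lock natively; worth checking whether the image's apt supports it.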
[QUOTE=Dylan14;528431]If you guys are running the mprime script with a custom worktodo.txt with small exponents (say, n < 100000) for running ECM, the browser will lag quite badly when in the tab running the exponent. To improve this, when doing the setup of the file prime.txt, echo the following into prime.txt:
[CODE]OutputIterations=n[/CODE]where n is a number larger than 10000, which is the default. This is not necessary at the current wavefront of ECM, or for any other jobs (except possibly the small Fermat numbers or small PRP checks for new Mersenne cofactors).[/QUOTE] What I do is add these lines to prime.txt: [CODE]ScaleOutputFrequency=1
OutputIterations=99999000[/CODE] This causes no progress output other than the beginning and ending messages from each curve being run. |
I'm not sure how Kaggle actually works. What is the difference between a Notebook and a Script? When I choose Python for both, the same lines from a Notebook do not work in a Script.
I'm not actually using Python, just Linux commands with ! in front, but those do not work in a Script. When you "Commit" a notebook, the "Interactive" version keeps running unless you choose "Run" and "Power Off"; on the right side you can see how many Interactive and Committed kernels you have running. You can have 2 GPU + 10 CPU Committed as well as 1 GPU + 10 CPU Interactive. I think it uses up your GPU quota if you forget to Power off the Interactive GPU notebook... |
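A likely explanation: `!command` lines are IPython shell-escape magic, which the Notebook frontend supports but a plain Python Script does not. In a Script you would call the shell explicitly, e.g. via subprocess (a sketch, using a harmless stand-in command):

```python
import subprocess

# Notebook cell:       !uname -a
# Script equivalent: invoke the command explicitly and capture its output
result = subprocess.run(["echo", "hello from a Script"],
                        capture_output=True, text=True)
print(result.stdout, end="")
```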
[QUOTE=ATH;528435]When I choose Python for both the same lines from Notebook does not work in Script.[/QUOTE]
[QUOTE]Patient: "It hurts when I do this." Doctor: "Don't do that."[/QUOTE] A more serious answer... I've never tried using Kaggle's Scripts. Nor its "R" language option. It would be useful to understand what selecting those options during an instance spin-up results in. |
kaggle update
I just completed my first ever PRP exponent at the current wavefront on Kaggle. Really happy with the performance despite the 30-hour weekly quota (slightly disappointing, but at least it's free). Besides the time it takes to update the dataset and execute the kernel, it's quite easy to run after the initial setup. The exponent itself took slightly longer than 30 hours to finish, which equates to one exponent per week; that isn't half bad.
Relating to the account ban, there's still no response from Kaggle's support team. I emailed and requested numerous times over the entire week with no response at all. Oh well, I guess the chance of getting my account back is pretty slim. |
There was a comment that TF sucks using Tesla K80, but do we have comparative figures to show that LL has a much better production throughput?
|
[QUOTE=bayanne;528469]There was a comment that TF sucks using Tesla K80, but do we have comparative figures to show that LL has a much better production throughput?[/QUOTE]The Tesla K80 has higher DP performance in relation to its SP performance, than the common consumer cards. Therefore LL or PRP or P-1 performance is better relative to its TF performance.
The Tesla K80 is a 2-gpu-unit card.
[URL]https://www.mersenne.ca/mfaktc.php[/URL] Tesla K80: 766.7 GHz-days/day at TF (that's both units)
[URL]https://www.mersenne.ca/cudalucas.php[/URL] Tesla K80 at 85M: 112.6 GHz-days/day at LL (again, both units)
TF/LL =~ 6.81
In Colab runs, I see the K80 running near 400 GHz-days/day (on the one gpu; without tuning). It's not bad at TF. Relative to its TF speed, it is faster than most at other things. It also has quite a lot of gpu ram, which is useful for P-1 and unnecessary for TF.
Compare to:
GTX 1080: TF 1015.8, LL 63.2, ratio 16.07
RX 480: TF 535.2, LL 40, ratio 13.38
RTX 2080: TF 2703, LL 72.3, ratio 37.39
Radeon VII: TF 1113.6, LL 274.8, ratio 4.05
Tesla C2075: TF 282.2, LL 21.7, ratio 13.0
For more, see [URL]https://www.mersenneforum.org/showpost.php?p=490612&postcount=3[/URL] |
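The ratios follow directly from the quoted throughput figures; a quick arithmetic check, using only the numbers above from mersenne.ca:

```python
# (TF, LL) throughput in GHz-days/day, as quoted from mersenne.ca
cards = {
    "Tesla K80":   (766.7, 112.6),
    "GTX 1080":    (1015.8, 63.2),
    "RX 480":      (535.2, 40.0),
    "RTX 2080":    (2703.0, 72.3),
    "Radeon VII":  (1113.6, 274.8),
    "Tesla C2075": (282.2, 21.7),
}

ratios = {name: round(tf / ll, 2) for name, (tf, ll) in cards.items()}
for name, r in ratios.items():
    print(f"{name}: TF/LL = {r}")
# The lower the ratio, the better the card is at LL/PRP/P-1
# relative to its TF speed.
```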
[QUOTE=bayanne;528469]There was a comment that TF sucks using Tesla K80, but do we have comparative figures to show that LL has a much better production throughput?[/QUOTE]
I'm getting 2.82 msec/iter on a 50.8M LL double check, so roughly 40 hours to complete that. |
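That estimate checks out arithmetically: an LL test on M(p) takes p - 2 squarings, so at 2.82 ms per iteration a 50.8M exponent needs about:

```python
# LL on M(p) takes p - 2 squarings; per-iteration time as quoted above
p = 50_800_000
ms_per_iter = 2.82

hours = (p - 2) * ms_per_iter / 1000 / 3600
print(f"{hours:.1f} hours")   # ~39.8 hours, i.e. roughly 40
```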
[QUOTE=bayanne;528469]There was a comment that TF sucks using Tesla K80, but do we have comparative figures to show that LL has a much better production throughput?[/QUOTE]
OK, is there a set of instructions for getting LL running on Colab or Kaggle? Chris's instructions for TF make it a doddle to get it up and running ... |
| All times are UTC. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.