[QUOTE=James Heinrich;251318]Is there an easy way to find where the LL and DC wavefronts are?[/QUOTE]
Does [url]http://mersenne.org/primenet/[/url] help you? :smile: |
[QUOTE=KingKurly;251330]Does [url]http://mersenne.org/primenet/[/url] help you? :smile:[/QUOTE]It does indeed, thank you kindly.
|
Running low on LL candidates?
It seems the wavefront for TF is currently in the M80mil, while the wavefront for both P-1 and LL is about M53mil. Would it make sense to get the fine fellas over at the GPU farm to help with some M54-M70mil and run the TF to one (or two) bit levels higher than needed to offset the P-1 work (at least, on a temporary basis)?
Yes, I know there is another TF level after P-1 but what about having them run an extra one or two levels? It takes me nearly three days to complete a P-1. It seems it takes the GPU program a matter of hours to complete the next bit level. Would completing an extra bit level be comparable to a good P-1? I admit, I don't know the math behind all of this. |
[QUOTE=RichD;251732]Would completing an extra bit level be comparable to a good P-1?[/QUOTE]No. But an extra 4-5 levels might be.
A good P-1 is normally somewhere in the 5-7% chance-of-finding-a-factor range. Assuming that M53xxxxxx is normally TF'd to 2^68, if you take it an extra 4 levels to 2^72 you'll have a [url=http://mersenne-aries.sili.net/credit.php?worktype=TF&exponent=53000000&frombits=68&tobits=72]5.75% chance[/url] of finding a factor for about 17GHz-days worth of work. An [url=http://mersenne-aries.sili.net/prob.php?prob=5.754&exponent=53000000]equivalent P-1[/url] requires only 2.7GHz-days of effort. If you do 5 extra levels of TF, to 2^73, you get [url=http://mersenne-aries.sili.net/credit.php?worktype=TF&exponent=53000000&frombits=68&tobits=73]7.14% chance of factor[/url] for 35GHz-days effort, compared to the [url=http://mersenne-aries.sili.net/prob.php?prob=7.14&exponent=53000000]equivalent P-1[/url] which takes only 5.6GHz-days of effort. So if you're talking about doing more TF vs P-1 [i]on the same CPU[/i], it's clear that P-1 is the winner (6x+ faster for same probability). However, it is entirely possible that many systems have GPUs that can pump out TF >6x faster than the CPU, so despite the apparent inefficiency, it could still be faster to get the same probability of factor doing TF than P-1. Not to mention that TF requires almost no RAM, whereas P-1 has extensive RAM requirements. |
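The probabilities and effort ratios in the post above can be roughly reproduced in a few lines. This is only a sketch: it assumes the common GIMPS rule of thumb that the chance of a factor between 2^b and 2^(b+1) is about 1/b, which may not be exactly what the linked calculator uses. The function name `tf_factor_chance` is made up for illustration, and the GHz-days figures are copied from the post rather than computed.

```python
def tf_factor_chance(from_bits, to_bits):
    """Approximate chance of finding a factor when trial factoring
    from 2^from_bits to 2^to_bits, assuming ~1/b per bit level."""
    return sum(1.0 / b for b in range(from_bits, to_bits))


# GHz-days figures quoted in the post for M53000000 (not computed here).
cases = [
    # (from_bits, to_bits, TF effort, equivalent P-1 effort)
    (68, 72, 17.0, 2.7),
    (68, 73, 35.0, 5.6),
]

for from_bits, to_bits, tf_ghzd, pm1_ghzd in cases:
    chance = tf_factor_chance(from_bits, to_bits)
    ratio = tf_ghzd / pm1_ghzd
    print(f"2^{from_bits} -> 2^{to_bits}: ~{chance:.2%} chance, "
          f"TF costs {ratio:.1f}x the equivalent P-1")
```

Under that assumption the estimates land very close to the quoted 5.75% and 7.14%, and both effort ratios come out a little above 6, in line with the "6x+" figure.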
[QUOTE=RichD;251732]Would completing an extra bit level be comparable to a good P-1?[/QUOTE]
GPU trial factorers are very welcome in the 50-60M range. If you polish off say 3 bits of TF, then P-1 will probably choose lower bounds, saving P-1 time. Plus, you'll find factors that some P-1'ers would have missed and even more factors that LL-testers-with-minimal-memory would have found. This will save LL testing time. The calculations are horrific with weird feedback loops, so one cannot quantify any gains or say with any certainty what the optimal use of GPUs should be. However, I can say with certainty, the more GPUs, P-1ers, and LLers working below 60M the sooner we will finish this range and the sooner we will find the next Mersenne prime! |
[QUOTE=Prime95;251742]GPU trial factorers are very welcome in the 50-60M range.[/QUOTE]Would it be possible to have a manual assignment type of "GPU TF", which you (George) can tweak to be whatever is most useful at the moment. For now, that would be handing out exponents that are normally in the P-1 queue, for TF to 2 bits higher than normal (for example). Would simplify the choice for GPU-TF'ers to know where to be most useful. And with those exponents officially "assigned", they won't be accidentally handed out for P-1 at the lower bounds (or worse, LL) before the TF results come back.
|
[QUOTE=Prime95;251742] and the sooner we will find the next Mersenne prime![/QUOTE]
I presume you refer to the 47+1.293th prime expected before 80M. David |
[QUOTE=James Heinrich;251751]Would it be possible to have a manual assignment type of "GPU TF", which you (George) can tweak to be whatever is most useful at the moment. For now, that would be handing out exponents that are normally in the P-1 queue, for TF to 2 bits higher than normal (for example). Would simplify the choice for GPU-TF'ers to know where to be most useful. And with those exponents officially "assigned", they won't be accidentally handed out for P-1 at the lower bounds (or worse, LL) before the TF results come back.[/QUOTE]
+1 on James's suggestion. One or two bits above the usual limit. E.g. an exponent which should usually be TFed to 2^67 before P-1 and to 2^68 after P-1 ==> TF it to 2^{69,70} before P-1. I would like to see assignments of this type include multiple bit levels (up to the last bit level) at once! Oliver |
[QUOTE=TheJudger;251932]+1 on James's suggestion. One or two bits above the usual limit.[/QUOTE]... once analysis, taking into account the relative speeds of GPU doing TF versus CPU doing L-L, has shown the benefit of doing so.
OTOH, once GPUs are doing L-L, the GPU:TF/GPU:LL tradeoff point changes again, perhaps back to near the CPU:TF/CPU:LL tradeoff. Then there's the GPU:TF/CPU:P-1/GPU:LL tradeoff and the GPU:TF/GPU:P-1/CPU:LL tradeoff and the GPU:TF/GPU:P-1/GPU:LL tradeoff, plus the GPU:TF/CPU:P-1/GPU:TF/GPU:LL tradeoff and the GPU:TF/CPU:P-1/CPU:TF/GPU:LL tradeoff and the ... Hmmm... |
On my GTX 460, M52100981 got from 69 to 72 in a bit less than 6 hours.
[quote][FONT=Courier New][SIZE=2]firejuggler Manual testing 52100981 NF Feb 10 2011 10:59AM 0.0 9.1794 no factor for M52100981 from 2^71 to 2^72 [mfaktc 0.14-Win barrett79_mul32]
firejuggler Manual testing 52100981 NF Feb 10 2011 10:59AM 0.0 4.5897 no factor for M52100981 from 2^70 to 2^71 [mfaktc 0.14-Win barrett79_mul32]
firejuggler Manual testing 52100981 NF Feb 10 2011 10:59AM 0.0 2.2948 no factor for M52100981 from 2^69 to 2^70 [mfaktc 0.14-Win barrett79_mul32][/SIZE][/FONT][/quote] |
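To put those three results into throughput terms, the credited GHz-days can be summed and divided by the elapsed time. This is a rough sketch: the credit values are copied from the result lines above, and "a bit less than 6 hours" is assumed to be exactly 6 hours, so the real rate would be slightly higher.

```python
# GHz-days credited per bit level, copied from the three result lines above.
credits = {
    (69, 70): 2.2948,
    (70, 71): 4.5897,
    (71, 72): 9.1794,
}

total_ghz_days = sum(credits.values())

# Assumption: "a bit less than 6 hours" is treated as exactly 6 hours,
# so this slightly underestimates the real throughput.
elapsed_hours = 6.0
throughput = total_ghz_days / (elapsed_hours / 24.0)

print(f"total credit: {total_ghz_days:.2f} GHz-days")
print(f"throughput:   {throughput:.0f} GHz-days/day")

# Each bit level is credited about twice the work of the one below it.
levels = sorted(credits)
for lower, upper in zip(levels, levels[1:]):
    step = credits[upper] / credits[lower]
    print(f"{upper[0]}->{upper[1]} vs {lower[0]}->{lower[1]}: {step:.2f}x")
```

The doubling per bit level is why taking an exponent several levels higher gets expensive quickly, even on a GPU.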
[QUOTE=cheesehead;252041]... once analysis, taking into account the relative speeds of GPU doing TF versus CPU doing L-L, has shown the benefit of doing so.[/QUOTE][QUOTE=firejuggler;252043]On my GTX 460, M52100981 got from 69 to 72 in a bit less than 6 hours.[/QUOTE]Which is [url=http://mersenne-aries.sili.net/credit.php?worktype=TF&exponent=52100981&frombits=69&tobits=72]16GHz-days of work[/url], which means roughly 64GHz-days/day throughput.
For comparison, if you go for the top-end and get an [url=http://mersenne-aries.sili.net/throughput.php?cpu1=Intel%28R%29+Core%28TM%29+i7-2600K+CPU+%40+3.40GHz|256|8192&mhz1=4500]i7-2600K overclocked to 4.5GHz[/url], you'll get roughly 30GHz-days/day of LL/P-1 work across 4 cores, or 25GHz-days/day of TF work (above 2^66). That is a "good" video card (GTX460) vs a "very good" CPU (SB i7-2600K OC).
Comparing a "good" CPU vs a "medium" GPU, let's compare my [url=http://mersenne-aries.sili.net/throughput.php?cpu1=Intel%28R%29+Core%28TM%29+i7+CPU+920+%40+2.67GHz|256|8192&mhz1=3100]i7-920 @ 3.1GHz[/url] vs 8800GT:
CPU: 14GHz-days/day across 4 cores LL/P-1; 15GHz-days/day for TF
GPU: my current assignment of [url=http://mersenne-aries.sili.net/credit.php?worktype=TF&exponent=332222641&frombits=76&tobits=77]M332222641 from 2^76-77[/url] will give 46GHz-days in 71 hours, so around 15.5GHz-days/day.
Remember, of course, that GPU-TF (using mfaktc) still chews up 1-2 cores of CPU time as well (faster card, more CPU required). Nevertheless, on my lower-end GPU it's still a 4x TF performance increase (one CPU core TF = 3.8GHzd/d; one CPU core + GPU TF = 15.5GHzd/d). On the high-end comparison, two CPU core TF = 12.5GHzd/d; two CPU core + GPU TF = 64GHzd/d for a 5x throughput increase, despite using an extra CPU core. (You could argue those cores are better at FFT, and might only be a 4x increase).
For a final comparison, let's compare a lower-end CPU with a higher-end GPU, and a top-end CPU with a lower-end GPU.
a) Let's first use [i]firejuggler[/i]'s GTX460 again with an [url=http://mersenne-aries.sili.net/throughput.php?cpu1=Intel%28R%29+Core%28TM%29+i3+CPU+530+%40+2.93GHz|256|4096&mhz1=3000]i3-530 @ stock (3GHz)[/url]: 6GHzd/d in LL/P-1; 7GHzd/d in TF. Both cores will be used feeding the GPU, but you get 64GHzd/d TF throughput for it, so a 9x-10x increase.
b) i7-2600K + 8800GT: you trade 7.3GHzd/d per CPU core of LL work for 15.5GHzd/d GPU-TF work. On a system level that's 30GHzd/d vs 22+15 so only a 1.25x overall throughput increase.
So from my limited sample of examined benchmarks, it can vary widely, from break-even (fast CPU, slow GPU) to 10x throughput increase (fast GPU, slow CPU). |
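The speedups quoted in the comparison above follow from dividing the with-GPU throughput by the CPU-only throughput. A small sketch using only the figures from the post (the scenario labels and the `speedup` helper are mine, and the last row mixes LL and TF GHz-days the same way the post does):

```python
# (label, CPU-only GHz-days/day, with-GPU GHz-days/day), figures from the post.
scenarios = [
    ("i7-920: 1 core TF vs 1 core + 8800GT", 3.8, 15.5),
    ("i7-2600K: 2 cores TF vs 2 cores + GTX460", 12.5, 64.0),
    ("i3-530: both cores TF vs both cores + GTX460", 7.0, 64.0),
    # Whole-system view: 30 LL vs roughly 22 LL + 15 GPU-TF, as in the post.
    ("i7-2600K: all LL vs LL + 8800GT", 30.0, 22.0 + 15.0),
]


def speedup(cpu_only, with_gpu):
    """Throughput ratio of the GPU-assisted setup to the CPU-only setup."""
    return with_gpu / cpu_only


for label, cpu_only, with_gpu in scenarios:
    print(f"{label}: {speedup(cpu_only, with_gpu):.2f}x")
```

The ratios come out at roughly 4x, 5x, 9x and a little over 1.2x, which is the "break-even to 10x" spread described in the post.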