mersenneforum.org  

Old 2011-02-04, 22:01   #419
KingKurly
 
 
Sep 2010
Annapolis, MD, USA

306₈ Posts

Quote:
Originally Posted by James Heinrich View Post
Is there an easy way to find where the LL and DC wavefronts are?
Does http://mersenne.org/primenet/ help you?
Old 2011-02-04, 22:55   #420
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

10B5₁₆ Posts

Quote:
Originally Posted by KingKurly View Post
It does indeed, thank you kindly.
Old 2011-02-08, 00:06   #421
RichD
 
 
Sep 2008
Kansas

59×67 Posts
Running low on LL candidates?

It seems the wavefront for TF is currently in the M80mil range, while the wavefront for both P-1 and LL is at about M53mil. Would it make sense to get the fine fellas over at the GPU farm to help with some M54-M70mil exponents and run the TF one (or two) bit levels higher than needed to offset the P-1 work (at least on a temporary basis)?

Yes, I know there is another TF level after P-1 but what about having them run an extra one or two levels?

It takes me nearly three days to complete a P-1. It seems it takes the GPU program a matter of hours to complete the next bit level. Would completing an extra bit level be comparable to a good P-1?

I admit, I don't know the math behind all of this.
Old 2011-02-08, 00:59   #422
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

1000010110101₂ Posts

Quote:
Originally Posted by RichD View Post
Would completing an extra bit level be comparable to a good P-1?
No. But an extra 4-5 levels might be.

A good P-1 is normally somewhere in the 5-7% chance-of-finding-a-factor range. Assuming that M53xxxxxx is normally TF'd to 2^68, if you take it an extra 4 levels to 2^72 you'll have a 5.75% chance of finding a factor for about 17GHz-days worth of work. An equivalent P-1 requires only 2.7GHz-days of effort. If you do 5 extra levels of TF, to 2^73, you get 7.14% chance of factor for 35GHz-days effort, compared to the equivalent P-1 which takes only 5.6GHz-days of effort.
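
The percentages above are consistent with the usual GIMPS rule of thumb that the chance of a factor between 2^x and 2^(x+1) is about 1/x. A minimal sketch, assuming that heuristic (the function name is made up for illustration, not taken from any GIMPS tool):

```python
def tf_factor_chance(from_bits: int, to_bits: int) -> float:
    """Approximate chance that TF from 2^from_bits to 2^to_bits finds a factor.

    Assumes the rule of thumb that a single bit level, 2^x to 2^(x+1),
    has roughly a 1/x chance of containing a factor, and simply adds
    those chances up over the bit levels covered.
    """
    return sum(1.0 / x for x in range(from_bits, to_bits))


# Extra TF on an exponent already done to 2^68
for top in (72, 73):
    print(f"2^68 -> 2^{top}: {tf_factor_chance(68, top):.2%}")
```

Under that assumption the two cases come out at about 5.76% and 7.15%, close to the figures quoted.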

So if you're talking about doing more TF vs P-1 on the same CPU, it's clear that P-1 is the winner (6x+ faster for same probability). However, it is entirely possible that many systems have GPUs that can pump out TF >6x faster than the CPU, so despite the apparent inefficiency, it could still be faster to get the same probability of factor doing TF than P-1. Not to mention that TF requires almost no RAM, whereas P-1 has extensive RAM requirements.
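
The "6x+" figure is just the ratio of the GHz-days costs quoted in this post. A rough sketch of the break-even check (the names are hypothetical; the cost figures are the ones given above for M53xxxxxx):

```python
# GHz-days needed for the same chance of a factor,
# keyed by the TF bit level reached (starting from 2^68).
TF_COST = {72: 17.0, 73: 35.0}
PM1_COST = {72: 2.7, 73: 5.6}


def breakeven_tf_speedup(top_bits: int) -> float:
    """How many times faster TF must run than P-1 to tie on wall-clock time."""
    return TF_COST[top_bits] / PM1_COST[top_bits]


for bits in sorted(TF_COST):
    print(f"TF to 2^{bits}: break-even at {breakeven_tf_speedup(bits):.2f}x")
```

So, on these figures, a GPU whose TF throughput is more than roughly 6.3 times the CPU's P-1 throughput comes out ahead on time.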
Old 2011-02-08, 01:43   #423
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

8279₁₀ Posts

Quote:
Originally Posted by RichD View Post
Would completing an extra bit level be comparable to a good P-1?
GPU trial factorers are very welcome in the 50-60M range. If you polish off say 3 bits of TF, then P-1 will probably choose lower bounds, saving P-1 time. Plus, you'll find factors that some P-1'ers would have missed and even more factors that LL-testers-with-minimal-memory would have found. This will save LL testing time.

The calculations are horrific with weird feedback loops, so one cannot quantify any gains or say with any certainty what the optimal use of GPUs should be. However, I can say with certainty, the more GPUs, P-1ers, and LLers working below 60M the sooner we will finish this range and the sooner we will find the next Mersenne prime!
Old 2011-02-08, 02:35   #424
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

1000010110101₂ Posts

Quote:
Originally Posted by Prime95 View Post
GPU trial factorers are very welcome in the 50-60M range.
Would it be possible to have a manual assignment type of "GPU TF", which you (George) can tweak to be whatever is most useful at the moment? For now, that would be handing out exponents that are normally in the P-1 queue, for TF to 2 bits higher than normal (for example). It would simplify the choice for GPU-TF'ers to know where to be most useful. And with those exponents officially "assigned", they won't be accidentally handed out for P-1 at the lower bounds (or worse, LL) before the TF results come back.

Last fiddled with by James Heinrich on 2011-02-08 at 02:36
Old 2011-02-09, 04:56   #425
davieddy
 
 
"Lucan"
Dec 2006
England

1100101001010₂ Posts

Quote:
Originally Posted by Prime95 View Post
and the sooner we will find the next Mersenne prime!
I presume you refer to the 47+1.293th prime expected before 80M.

David

Last fiddled with by davieddy on 2011-02-09 at 05:00
Old 2011-02-09, 14:14   #426
TheJudger
 
 
"Oliver"
Mar 2005
Germany

5×223 Posts

Quote:
Originally Posted by James Heinrich View Post
Would it be possible to have a manual assignment type of "GPU TF", which you (George) can tweak to be whatever is most useful at the moment? For now, that would be handing out exponents that are normally in the P-1 queue, for TF to 2 bits higher than normal (for example). It would simplify the choice for GPU-TF'ers to know where to be most useful. And with those exponents officially "assigned", they won't be accidentally handed out for P-1 at the lower bounds (or worse, LL) before the TF results come back.
+1 on James's suggestion. One or two bits above the usual limit.
E.g. an exponent which would usually be TFed to 2^67 before P-1 and to 2^68 after P-1 ==> TF it to 2^69 or 2^70 before P-1. I would like to see assignments of this type cover multiple bit levels (up to the last bit level) at once!

Oliver
Old 2011-02-10, 11:19   #427
cheesehead
 
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

2²·3·641 Posts

Quote:
Originally Posted by TheJudger View Post
+1 on James's suggestion. One or two bits above the usual limit.
... once analysis, taking into account the relative speeds of GPU doing TF versus CPU doing L-L, has shown the benefit of doing so.

OTOH, once GPUs are doing L-L, the GPU:TF/GPU:LL tradeoff point changes again, perhaps back to near the CPU:TF/CPU:LL tradeoff.

Then there's the GPU:TF/CPU:P-1/GPU:LL tradeoff and the GPU:TF/GPU:P-1/CPU:LL tradeoff and the GPU:TF/GPU:P-1/GPU:LL tradeoff, plus the GPU:TF/CPU:P-1/GPU:TF/GPU:LL tradeoff and the GPU:TF/CPU:P-1/CPU:TF/GPU:LL tradeoff and the ...

Hmmm...

Last fiddled with by cheesehead on 2011-02-10 at 12:10 Reason: Once GPUs do both TF and L-L ... and P-1, ...
Old 2011-02-10, 11:33   #428
firejuggler
 
 
"Vincent"
Apr 2010
Over the rainbow

101101101000₂ Posts

On my GTX 460, M52100981 got from 69 to 72 in a bit less than 6 hours.
Quote:
firejuggler Manual testing 52100981 NF Feb 10 2011 10:59AM 0.0 9.1794 no factor for M52100981 from 2^71 to 2^72 [mfaktc 0.14-Win barrett79_mul32]

firejuggler Manual testing 52100981 NF Feb 10 2011 10:59AM 0.0 4.5897 no factor for M52100981 from 2^70 to 2^71 [mfaktc 0.14-Win barrett79_mul32]

firejuggler Manual testing 52100981 NF Feb 10 2011 10:59AM 0.0 2.2948 no factor for M52100981 from 2^69 to 2^70 [mfaktc 0.14-Win barrett79_mul32]


Last fiddled with by firejuggler on 2011-02-10 at 11:40
Old 2011-02-10, 12:47   #429
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

7×13×47 Posts

Quote:
Originally Posted by cheesehead View Post
... once analysis, taking into account the relative speeds of GPU doing TF versus CPU doing L-L, has shown the benefit of doing so.
Quote:
Originally Posted by firejuggler View Post
On my GTX 460, M52100981 got from 69 to 72 in a bit less than 6 hours.
Which is 16GHz-days of work, which means roughly 64GHz-days/day throughput.
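
For anyone who wants to check the arithmetic, here is a quick sketch using the three GHz-days credits from firejuggler's results and treating "a bit less than 6 hours" as exactly 6 hours (that rounding is my assumption):

```python
# GHz-days credited per bit level for M52100981 (from the quoted results)
credits = {
    "2^69 to 2^70": 2.2948,
    "2^70 to 2^71": 4.5897,
    "2^71 to 2^72": 9.1794,
}

total_ghz_days = sum(credits.values())
elapsed_hours = 6.0  # assumed; the run took slightly less than this

# Scale the credit earned in the elapsed time up to a full 24-hour day
throughput_per_day = total_ghz_days * 24.0 / elapsed_hours

print(f"total credit: {total_ghz_days:.2f} GHz-days")
print(f"throughput:   {throughput_per_day:.1f} GHz-days/day")
```

Each bit level costs about twice the one before it, so the last level accounts for more than half of the total.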

For comparison, if you go for the top-end and get an i7-2600K overclocked to 4.5GHz, you'll get roughly 30GHz-days/day of LL/P-1 work across 4 cores, or 25GHz-days/day of TF work (above 2^66).

That is a "good" video card (GTX460) vs a "very good" CPU (SB i7-2600K OC).

Comparing a "good" CPU vs a "medium" GPU, let's compare my i7-920 @ 3.1GHz vs 8800GT:

CPU: 14GHz-days/day across 4 cores LL/P-1; 15GHz-days/day for TF
GPU: my current assignment of M332222641 from 2^76 to 2^77 will give 46GHz-days in 71 hours, so around 15.5GHz-days/day.

Remember, of course, that GPU-TF (using mfaktc) still chews up 1-2 cores of CPU time as well (faster card, more CPU required). Nevertheless, on my lower-end GPU it's still a 4x TF performance increase (one CPU core TF = 3.8GHzd/d; one CPU core + GPU TF = 15.5GHzd/d). On the high-end comparison, two CPU core TF = 12.5GHzd/d; two CPU core + GPU TF = 64GHzd/d for a 5x throughput increase, despite using an extra CPU core. (You could argue those cores are better used for FFT work, so it might only be a 4x increase.)
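
The two TF speedups are plain ratios of the throughput figures above. A small sketch (the helper name is just for illustration):

```python
def tf_speedup(gpu_assisted: float, cpu_only: float) -> float:
    """Ratio of GPU-assisted TF throughput to CPU-only TF throughput.

    Both arguments are in GHz-days/day and must cover the same
    number of CPU cores for the comparison to be fair.
    """
    return gpu_assisted / cpu_only


# i7-920 + 8800GT: one core feeding the GPU vs. one core doing TF alone
low_end = tf_speedup(15.5, 3.8)

# i7-2600K + GTX460: two cores feeding the GPU vs. two cores doing TF alone
high_end = tf_speedup(64.0, 12.5)

print(f"low-end GPU:  {low_end:.1f}x")
print(f"high-end GPU: {high_end:.1f}x")
```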

For a final comparison, let's compare a lower-end CPU with a higher-end GPU, and a top-end CPU with a lower-end GPU.
a) Let's first use firejuggler's GTX460 again with an i3-530 @ stock (3GHz): 6GHzd/d in LL/P-1; 7GHzd/d in TF. Both cores will be used feeding the GPU, but you get 64GHzd/d TF throughput for it, so a 9x-10x increase.
b) i7-2600K + 8800GT: you trade 7.3GHzd/d per CPU core of LL work for 15.5GHzd/d GPU-TF work. On a system level that's 30GHzd/d vs 22+15 so only a 1.25x overall throughput increase.
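
Spelled out as a sketch, using the unrounded figures from a) and b) and treating LL/P-1 and TF GHz-days as directly comparable (which is an assumption, not a given):

```python
# a) i3-530 + GTX460: both cores go to feeding the GPU
gpu_tf = 64.0                  # GHzd/d of GPU TF
i3_ll, i3_tf = 6.0, 7.0        # GHzd/d the CPU alone would manage
gain_vs_cpu_tf = gpu_tf / i3_tf
gain_vs_cpu_ll = gpu_tf / i3_ll

# b) i7-2600K + 8800GT: give up one of four cores to feed the GPU
i7_ll_total = 30.0             # GHzd/d LL/P-1 across 4 cores
i7_ll_core_lost = 7.3          # GHzd/d of LL given up
slow_gpu_tf = 15.5             # GHzd/d of GPU TF gained
system_gain = (i7_ll_total - i7_ll_core_lost + slow_gpu_tf) / i7_ll_total

print(f"a) {gain_vs_cpu_tf:.1f}x to {gain_vs_cpu_ll:.1f}x")
print(f"b) {system_gain:.2f}x")
```

With the unrounded numbers, b) comes out nearer 1.27x than 1.25x, which doesn't change the conclusion.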

So from my limited sample of examined benchmarks, it can vary widely, from break-even (fast CPU, slow GPU) to 10x throughput increase (fast GPU, slow CPU).
All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
