#331
Mar 2003
Melbourne
10038 Posts
Historian - I've never seen someone reach so far to prove so little.
Please do not dismiss the good work msft has done; I think he has done a fantastic job. I'm sorry, but there appear to be a number of people here who just can't accept that for LL tests with power-of-two FFTs, GPUs are unbeaten on time to result (latency), results per unit time (throughput), and results per cost (both upfront and ongoing). -- Craig
#332
Jan 2008
France
1125₈ Posts
I fully agree with Nucleon. msft, don't listen to Historian; he belongs in the 20th century. Keep up the good work!
Last fiddled with by ldesnogu on 2010-09-23 at 09:10
#333
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
10267₈ Posts
Quote:
There will always be people with better hardware who find more primes than others; they are willing to pay more, upfront and over time, for that. Adding GPUs to the mix just creates a different sort of step up between budgets.
Last fiddled with by TimSorbet on 2010-09-23 at 11:58
#334
P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts
Quote:
To support all k values, you'll need to write C or CUDA code that does the modular reduction at the same time as the carry propagation. This requires FFTs twice the size of those used for Mersenne numbers, with the upper half of the FFT data zeroed. Thus, you can expect the LLR test time for a 12,500,000-bit number to be just a tad slower than the LL test time for a 25,000,000-bit number.
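The identity behind that generic modular reduction can be sketched in plain integer arithmetic (a toy illustration only, not Prime95's FFT code; the `fold` helper and the small parameters are invented for this example): for N = k·2^n − 1 we have k·2^n ≡ 1 (mod N), so a full-length product can be split at bit n and its high half folded back down.

```python
def fold(x: int, k: int, n: int) -> int:
    """Return a value congruent to k*x modulo N = k*2**n - 1.

    Since k*2**n ≡ 1 (mod N), write x = hi*2**n + lo; then
    k*x = hi*(k*2**n) + k*lo ≡ hi + k*lo (mod N).
    """
    hi, lo = x >> n, x & ((1 << n) - 1)
    return hi + k * lo

# Toy check with small, made-up parameters.
k, n = 45, 20
N = k * (1 << n) - 1
a, b = 123456789, 987654321
assert fold(a * b, k, n) % N == (k * a * b) % N
```

Note that the fold needs both halves of the full double-length product, which an FFT only provides if the transform is long enough to hold it; that is why the transform must be twice the size used for a Mersenne number of comparable bit length, where the wraparound trick avoids the doubling.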
#335
"Michael Kwok"
Mar 2006
10010011101₂ Posts
Quote:
More than four years ago, the only machine I had was a (single-core) Pentium 4. I found a top-5000 prime within a few months, and another one several months later. People started using Core 2 Duos, and then came Core 2 Quads, Phenom IIs, and Core i7s. Despite this, both primes are still on the top-5000 list today, and I expect them to stay there at least until the end of the year.

The difference between a Core i7 and a Pentium 4 is far greater than the difference between a Core i7 and a GPU. If the primes I found back then are still on the top-5000 list today, I don't see why any primes found on my high-end CPU today would disappear from that list anytime soon.

Like I said before, the additional computing power would be so little that it would hardly be worth the effort to develop an LLR GPU client. As Prime95 said, "you can expect the LLR test time for a 12,500,000 bit number to be just a tad slower than the LL test time for a 25,000,000 bit number", so a GPU wouldn't even be able to match a high-end quad-core with all cores running. As for beating 6-core processors? Forget it.

I don't have a CUDA-capable GPU, and I wouldn't get one even if they were sold at the 99-cent store.
#336
A Sunny Moo
Aug 2007
USA
14232₈ Posts
Quote:
And besides, this only kicks in for k > 50,000 or so. Most of the k*2^n-1 testing being done at this time is below that, so even if a CUDA LLR program only supported k < 50,000, it would still be immensely useful.
#338
Jun 2010
2³×3×11 Posts
Quote:
http://www.mersenneforum.org/showpos...83&postcount=3
Quote:
are getting slammed on: http://www.mersenneforum.org/showpos...0&postcount=21
Quote:
What's the rush? It's not like there's a lack of GPU work anyway: there's ppsieve, tpsieve, LL testing for Mersenne numbers, and a trial-division program used in Operation Billion Digits.
#339
P90 years forever!
Aug 2002
Yeehaw, FL
17×487 Posts
Quote:
From a project admin's point of view, he'd rather have GPUs do sieving than primality testing, since it seems a GPU will greatly exceed (as opposed to modestly exceed) the throughput of an i7. In any event, we are all better off with GPUs doing useful work rather than sitting idle!
#340
Jan 2008
France
255₁₆ Posts
Quote:
That being said, I think msft and TheJudger deserve respect for what they are doing. So when I read Vincent's (aka Diep) post, which seemed to imply that no amateur work has shown GPU code faster than finely tuned CPU code, it made me angry. I admit I was slightly over-reacting.

Anyway, what Historian wrote about msft in this thread is not acceptable.
#341
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
2E16₁₆ Posts
Quote:
Some crypto applications use only integer and logical operations on small word sizes and are embarrassingly parallel. Examples include direct key search on simple block ciphers or LFSR-based stream ciphers, together with similar computations to build Hellman tables or rainbow tables. These typically run very quickly on a GPU.

Paul
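A rough sketch of why such key searches parallelize so well (the 16-bit "cipher" here is a made-up mixing function for illustration, not any real algorithm): every slice of the keyspace can be tested with no communication at all, which is exactly the structure that maps onto GPU threads.

```python
MASK = 0xFFFF

def toy_encrypt(key: int, pt: int) -> int:
    # Placeholder mixing function standing in for a real block cipher.
    x = (pt ^ key) & MASK
    x = ((x << 3) | (x >> 13)) & MASK          # 16-bit rotate left by 3
    return x ^ ((key * 0x9E37) & MASK)

def search_chunk(lo: int, hi: int, pt: int, ct: int) -> list[int]:
    """Test one independent slice of the keyspace (one GPU thread's work)."""
    return [k for k in range(lo, hi) if toy_encrypt(k, pt) == ct]

# Known-plaintext attack: recover the key from one (pt, ct) pair.
pt, secret = 0x1234, 0xBEEF
ct = toy_encrypt(secret, pt)
hits: list[int] = []
for lo in range(0, 1 << 16, 1 << 12):          # 16 independent chunks
    hits += search_chunk(lo, lo + (1 << 12), pt, ct)
assert secret in hits
```

Since the chunks share nothing, the only serial step is merging candidate hits at the end, so speedup scales essentially with the number of processing elements.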