mersenneforum.org  

2008-10-01, 23:52   #1
stars10250
Jul 2008
San Francisco, CA
3×67 Posts

100M madness

Is it really all that crazy to contemplate this? I've been reading posts that say it would take too long with current technology to try for a 100M digit monster. Help me out here:

My current E8500 can LL a 12M digit number in about 28 days (actually it can do two LLs in that time, but just consider one for now). If we assume the run time is linear in the number of digits (is it?), then we're looking at about 233 days (~8 months) for a 100M digit number. That's a long time, but...

Suppose we take things up a notch and run on a new Nehalem processor in a few months, where we're hoping to get true quad performance. One could run the 100M number on a single core, run three other more reasonable LLs on the other cores, and still outperform today's best Penryn quad, which people report delivers about three cores' worth of performance. Nehalem may even shorten the 8 month estimate. Some benchmarks put it at 30% faster than Penryn, but I have no idea what we'll actually see on Prime95. If the 30% gain applies to Prime95, 8 months would shrink to just under 6 months and one could do two 100M numbers in a year on a single core!

All this boils down to linearity, so is it?

2008-10-02, 01:04   #2
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2·41·89 Posts

Quote:
Originally Posted by stars10250
Is it linear?

No. The iteration speed is almost linear, but you need to run more iterations.
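To see where the extra iterations come from: the Lucas-Lehmer test on 2^p - 1 performs p - 2 modular squarings, so a bigger exponent means both a slower squaring and more of them. Below is a minimal sketch using Python's built-in big integers. The function name `lucas_lehmer` is made up for illustration, and this is not how Prime95 does it (Prime95 uses FFT-based multiplication); it only shows the iteration count.

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test for the Mersenne number 2**p - 1, p an odd prime.

    Returns (is_prime, iterations). Illustrative only: real clients replace
    the big-integer squaring below with an FFT-based multiplication.
    """
    m = (1 << p) - 1          # the Mersenne number being tested
    s = 4                     # standard starting value of the sequence
    iterations = 0
    for _ in range(p - 2):    # exactly p - 2 squarings
        s = (s * s - 2) % m
        iterations += 1
    return s == 0, iterations
```

For example, `lucas_lehmer(7)` returns `(True, 5)`: 2^7 - 1 = 127 is prime, and the test took 7 - 2 = 5 squarings.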

Also, Nehalem worries me. The 256KB L2 cache is *real* small.

2008-10-02, 01:33   #3
jinydu
Dec 2003
Hopefully Near M48
2·3·293 Posts

It's definitely much harder than linear. Assuming that iteration time is proportional to FFT length, the smallest 100M digit exponent would take over 61.6 times longer than 2^43112609 - 1.
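A rough way to check that figure, as a sketch only. The helper names are made up for illustration, and `FFT_RATIO = 8` is an assumption: the post does not give the FFT lengths, and 8 is simply the value that reproduces the quoted number. The script finds the smallest prime exponent p for which 2^p - 1 has at least 100,000,000 digits, then multiplies the ratio of iteration counts by the assumed ratio of per-iteration cost.

```python
import math

def mersenne_digits(p):
    """Decimal digits of 2**p - 1 (same as for 2**p, which is never a power of 10)."""
    return math.floor(p * math.log10(2)) + 1

def is_prime(n):
    """Plain trial division; fast enough for exponents around 3e8."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True

def smallest_prime_exponent(min_digits):
    """Smallest prime p such that 2**p - 1 has at least min_digits digits."""
    p = math.ceil((min_digits - 1) / math.log10(2))
    while not (is_prime(p) and mersenne_digits(p) >= min_digits):
        p += 1
    return p

P_RECORD = 43_112_609   # exponent quoted in the post
FFT_RATIO = 8           # ASSUMPTION: per-iteration cost ratio, not stated in the post

p100m = smallest_prime_exponent(100_000_000)
iteration_ratio = (p100m - 2) / (P_RECORD - 2)   # Lucas-Lehmer runs p - 2 iterations
total_ratio = iteration_ratio * FFT_RATIO

print(p100m, round(iteration_ratio, 3), round(total_ratio, 1))
```

Under that assumption the iteration count alone grows by a factor of about 7.7, and the total comes out just over 61.6, matching the estimate above.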

Take a look at this (http://www.mersenneforum.org/showthr...ies#post141900). Also see (http://www.mersenneforum.org/showthread.php?t=10660), post #6.

Last fiddled with by jinydu on 2008-10-02 at 01:35

2008-10-02, 03:24   #4
stars10250
Jul 2008
San Francisco, CA
3·67 Posts

Ok, I'm an idiot. All I had to do was use the benchmark page to estimate the time for my processor to test a 100M digit number. For my E8500, the estimate is 4 years, 150 days...a little long for my patience level and certainly not a linear extrapolation from my 12M digit numbers.

Regarding Nehalem, will the large L3 cache help offset the small L2? Presumably the QuickPath interconnect will get closer to all-core performance. I was really hoping to get one, move up the producers list, and crush you guys. Don't burst my bubble!

2008-10-02, 03:32   #5
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
176A₁₆ Posts

Quote:
Originally Posted by stars10250
Presumably the QuickPath interconnect will get closer to all-core performance.

I was wondering this also. Is Windows smart enough to run on NUMA systems and make intelligent decisions about memory allocation and thread deployment?

2008-10-02, 06:27   #6
Andi47
Oct 2004
Austria
9B2₁₆ Posts

Quote:
Originally Posted by Prime95
Also, Nehalem worries me. The 256KB L2 cache is *real* small.

Can P95 version 25.6 use the L3 cache? The Nehalem processors will have 4 to 12 MB (24 MB for servers) of shared L3, depending on the processor type.

2008-10-02, 10:37   #7
ldesnogu
Jan 2008
France
2·271 Posts

Quote:
Originally Posted by Andi47
Can P95 version 25.6 use the L3 cache? The Nehalem processors will have 4 to 12 MB (24 MB for servers) of shared L3, depending on the processor type.

The L3 cache latency (the number of cycles to get data from it to the CPU) is much higher than the L2 cache latency, so this might be an issue unless prefetch instructions can hide the data movement between the various cache levels.

BTW that makes me wonder: are there processors that have different data prefetch instructions that target different memory hierarchy levels?

2008-10-02, 11:47   #8
davieddy
"Lucan"
Dec 2006
England
2·3·13·83 Posts

Quote:
Originally Posted by Prime95
No. The iteration speed is almost linear, but you need to run more iterations.

Or more accurately "the time per iteration is almost linear". Furthermore, the probability of finding a prime is inversely proportional to the exponent.

"Madness" is apt ATM

2008-10-02, 15:21   #9
philmoore
"Phil"
Sep 2002
Tracktown, U.S.A.
2×13×43 Posts

Quote:
Originally Posted by Prime95
The iteration speed is almost linear, but you need to run more iterations.

And since the number of iterations is equal to the exponent (minus 2), the total time to do a Lucas-Lehmer test is approximately proportional to the square of the exponent. Double the exponent, quadruple the run-time.
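Applying that rule to the numbers from the first post gives a quick sanity check. This is a back-of-the-envelope sketch, not a benchmark: `scaled_days` is a made-up helper, and it assumes the digit count is proportional to the exponent (it is, to a very good approximation) and that the 28-day figure for a 12M digit number is the baseline.

```python
def scaled_days(base_days, base_digits, target_digits, power):
    """Scale a known run time by (size ratio) ** power.

    power=1 is the linear guess from the first post;
    power=2 is the "square of the exponent" rule.
    """
    return base_days * (target_digits / base_digits) ** power

linear = scaled_days(28, 12e6, 100e6, power=1)      # about 233 days
quadratic = scaled_days(28, 12e6, 100e6, power=2)   # about 1944 days

print(f"linear:    {linear:7.0f} days ({linear / 365.25:.1f} years)")
print(f"quadratic: {quadratic:7.0f} days ({quadratic / 365.25:.1f} years)")
```

The quadratic estimate lands at a bit over five years, in the same ballpark as the 4 years, 150 days that the benchmark page gave for the E8500, while the linear guess is off by roughly a factor of eight.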
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.