2020-04-03, 06:43
#12
phillipsjk

Nov 2019

1C₁₆ Posts

Update

It appears that the problem was indeed the exponents getting too large for the L2 caches on my dual L5420 system. The 105XXXxxx exponents had an expected run-time of about 16 months.

I tried an 84XXXxxx exponent (about 5MB FFT, small enough to fit in the 6MB cache), and the expected run-time is only about 4 months (double the 2 month prediction). That is with the sister cores using all of the memory bandwidth with stage 2 P-1 factoring.

My Xeon X3430 system, with 8MB of L3 cache, seems to process 105XXXxxx exponents slightly faster than the software predicts (possibly due to under-clocking effects not being linear).

Last fiddled with by phillipsjk on 2020-04-03 at 06:46

#13
axn

Jun 2003

3·1,531 Posts

Quote:
 Originally Posted by phillipsjk I tried an 84XXXxxx exponent (about 5MB FFT, small enough to fit in the 6MB cache), and the expected run-time is only about 4 months (double the 2 month prediction).
A 5M FFT takes up 40MB which wouldn't fit in your 6MB cache.
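
The arithmetic behind that figure can be sketched as follows. This is a minimal sketch, assuming each FFT element is stored as one 8-byte double-precision value and ignoring any auxiliary tables the program keeps; the function names are hypothetical and are not part of mprime.

```python
BYTES_PER_DOUBLE = 8  # assumption: one double-precision value per FFT element


def fft_working_set_bytes(fft_len_k: int) -> int:
    """Approximate data size of an FFT whose length is given in K (1024) elements."""
    return fft_len_k * 1024 * BYTES_PER_DOUBLE


def fits_in_cache(fft_len_k: int, cache_mib: int) -> bool:
    """True if the FFT data alone would fit in a cache of cache_mib MiB."""
    return fft_working_set_bytes(fft_len_k) <= cache_mib * 1024 * 1024


# A "5M" FFT is 5 * 1024 K elements.
size_mib = fft_working_set_bytes(5 * 1024) / 2**20
print(f"5M FFT: about {size_mib:.0f} MiB of data")
print("fits in a 6 MiB cache:", fits_in_cache(5 * 1024, 6))
```

Under these assumptions the FFT size counts elements rather than bytes, so the data is eight times larger than the "5M" label suggests.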

#14
phillipsjk

Nov 2019

2²·7 Posts

Quote:
 Originally Posted by axn A 5M FFT takes up 40MB which wouldn't fit in your 6MB cache.

What does the FFT size refer to then?

All I know is that I observed a drastic drop in performance.

Edit2: That does explain why work that "supposedly" fits in the cache was slowed down by what the other two cores were doing.

2020-04-04, 02:34
#15
axn

Jun 2003

3·1,531 Posts

Yes, running multiple workers on that CPU will thrash the cache. You're probably better off running a configuration of two workers, each using 4 threads (i.e. one worker per CPU).

Couple of things.

1) Ark says that the L5420 is a 12MB part. https://ark.intel.com/content/www/us...3-mhz-fsb.html . Is this what mprime detects as well?

2) Those CPUs are really old. It is probably not worth keeping them running. But if you're going to keep them running, double checks (with a much smaller FFT) would probably be the most efficient.

2020-04-04, 03:25
#16
phillipsjk

Nov 2019

2²·7 Posts

Code:
[Main thread Apr 3 00:18] Mersenne number primality test program version 29.8
[Main thread Apr 3 00:18] Optimizing for CPU architecture: Core 2, L2 cache size: 4x6 MB

The processor is basically 2 Core 2 Duos stuck together.

The machine is my primary data-store, and runs 24/7 anyway. I suspect a newer machine won't have a lower power draw; it will just be more efficient when I do have work for it. The current non-mprime workload barely loads 2 cores.

Those CPUs are actually an upgrade from the previous ones installed in the board. I now have twice as many cores, higher clock speed, and lower power draw (~50W, instead of 100W per CPU).

Last fiddled with by phillipsjk on 2020-04-04 at 03:29
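
One way to reconcile the "4x6 MB" in the log with the 12MB figure from Ark is sketched below. It assumes the log line counts the L2 caches of the whole two-socket machine, which the thread does not confirm; the helper name `parse_cache_line` is hypothetical and is not part of mprime.

```python
import re


def parse_cache_line(text: str) -> tuple[int, int]:
    """Return (cache_count, size_mb) from text like 'L2 cache size: 4x6 MB'."""
    match = re.search(r"L2 cache size:\s*(\d+)x(\d+)\s*MB", text)
    if match is None:
        raise ValueError("no L2 cache size found")
    return int(match.group(1)), int(match.group(2))


line = "Optimizing for CPU architecture: Core 2, L2 cache size: 4x6 MB"
count, size_mb = parse_cache_line(line)

sockets = 2  # "dual L5420 system", as stated earlier in the thread
total_mb = count * size_mb          # all L2 caches in the machine
per_socket_mb = total_mb // sockets  # assumption: caches split evenly per CPU

print(f"{count} caches x {size_mb} MB = {total_mb} MB total, {per_socket_mb} MB per CPU")
```

On that reading, each CPU carries two separate 6 MB caches rather than one 12 MB cache, so a single worker would see at most 6 MB of L2 at a time.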

