mersenneforum.org  

Old 2019-02-10, 15:50   #342
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7357₁₀ Posts

Quote:
Originally Posted by ET_ View Post
Is version 29.5.10 able to recover from such a situation?
What should I do with the *.bad savefiles?
There is nothing new for you in build 10.

You can try to exit prime95, rename one of the .bad files to .bu, and restart prime95. There is the tiniest chance that the save files were good and the program or OS was in a funky state. I've seen it happen on the rarest of occasions.
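
For anyone who would rather script the rename step, here is a minimal Python sketch. The file name used in the docstring (`p79207379.bad1`) and the helper name `restore_bad_savefile` are assumptions for illustration only; check what is actually in the prime95 working directory, and exit prime95 before running anything like this.

```python
from pathlib import Path


def restore_bad_savefile(bad_file: Path) -> Path:
    """Rename a quarantined save file (e.g. p79207379.bad1) to <name>.bu.

    The naming pattern is an assumption; adjust it to match the files
    prime95 actually wrote.
    """
    if not bad_file.suffix.startswith(".bad"):
        raise ValueError(f"{bad_file.name} does not look like a .bad save file")
    target = bad_file.with_suffix(".bu")
    if target.exists():
        # Never clobber an existing backup; move it aside by hand first.
        raise FileExistsError(f"{target.name} already exists")
    bad_file.rename(target)
    return target
```

The sketch refuses to overwrite an existing .bu file, so a backup that is still good is not lost by accident.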

Otherwise, delete the save files. Run a torture test. Restart from scratch. Increasing the frequency of Jacobi checks might be prudent until you are sure the system is stable.
Old 2019-02-10, 15:57   #343
mackerel
 
Feb 2016
UK

13×31 Posts

Just tried using 29.5b10 for some benching.

First, an observation: it doesn't seem to remember the number of workers in the benching settings window, where other settings are remembered. On 8086K and 7800X systems, it would always go back to 1, 6 when I returned to the window, having previously entered only 6 in there. On my R5 2600 system, it reverts to 1, 2, 6, even though it is 6 core 12 thread like the Intel CPUs. I used a fresh download of P95 so I don't think it should be affected by any past testing with older versions.

I also tried using it to see how different instructions affect performance by editing local.txt according to undoc.txt. In short, I see 4 performance levels, let's call them AVX512F, FMA3, AVX, SSE/SSE2/SSE4. Is AVX2, distinct from FMA3, used at all? Disabling AVX2 in isolation doesn't seem to make any difference.
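
A rough sketch of how such an edit to local.txt could be scripted, in Python. Two assumptions are baked in: that the option names follow the `CpuSupportsAVX2=0` pattern (as I recall it from undoc.txt; verify against the copy shipped with your build), and that global options must sit above the first `[Section]` header. The helper name `set_global_option` is made up for this example.

```python
def set_global_option(text: str, key: str, value: str) -> str:
    """Return config text with key=value set in the global (top) part."""
    lines = text.splitlines()
    # Assumption: global options live above the first "[Section]" header.
    end = next(
        (i for i, line in enumerate(lines) if line.lstrip().startswith("[")),
        len(lines),
    )
    for i in range(end):
        if lines[i].split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"  # replace the existing setting
            break
    else:
        lines.insert(end, f"{key}={value}")  # key not present yet, so add it
    return "\n".join(lines) + "\n"
```

Typical use would be to read local.txt while prime95 is not running, call `set_global_option(text, "CpuSupportsAVX2", "0")`, write the result back, and then rerun the benchmark.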

I think in older versions the output would show which instruction or FFT type (not size) was in use. I don't see it any more, has it been removed?
Old 2019-02-10, 17:33   #344
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7×1,051 Posts

Quote:
Originally Posted by mackerel View Post
First, an observation: it doesn't seem to remember the number of workers in the benching settings window, where other settings are remembered. On 8086K and 7800X systems, it would always go back to 1, 6 when I returned to the window, having previously entered only 6 in there. On my R5 2600 system, it reverts to 1, 2, 6, even though it is 6 core 12 thread like the Intel CPUs.
Yeah, it defaults to the worker counts a newbie should try in order to see which number of workers gives the best throughput. I'm not sure I like that feature either.

12 is not one of the defaults because using hyperthreading is often a poor choice.

Quote:
Is AVX2, distinct from FMA3, used at all? Disabling AVX2 in isolation doesn't seem to make any difference.
The integer instructions introduced by AVX2 are used in trial factoring. You are correct the AVX2 flag makes no difference in the FFTs.

Quote:
I think in older versions the output would show which instruction or FFT type (not size) was in use. I don't see it any more, has it been removed?
Look in results.bench.txt. I don't think the FFT type has ever been displayed on screen.
Old 2019-02-10, 17:49   #345
PhilF
 
Feb 2005
Colorado

1116₈ Posts

Quote:
Originally Posted by Prime95 View Post
Look in results.bench.txt. I don't think the FFT type has ever been displayed on screen.
On my i7-4790, running Prime95 V29.4 build 8, the screen output shows it is using "FMA3 FFT length 4800K". It's running a PRP test, not an LL test.

Last fiddled with by PhilF on 2019-02-10 at 17:51
Old 2019-02-10, 20:41   #346
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7·1,051 Posts

Quote:
Originally Posted by PhilF View Post
On my i7-4790, running Prime95 V29.4 build 8, the screen output shows it is using "FMA3 FFT length 4800K". It's running a PRP test, not an LL test.
Clarification: I don't think it has ever shown the "FMA3" on screen during a benchmark.
Old 2019-02-11, 04:59   #347
simon389
 
Aug 2013

127₈ Posts

79207379 PRP DC on build 10 on the still “broken” system completed successfully!

Next up is 79210559.
Old 2019-02-12, 02:38   #348
simon389
 
Aug 2013

57₁₆ Posts

I'm still having major hardware troubles with the 29.5b10 release. I was able to get the system stable with AVX512 tests in AIDA64 (all I needed to do was keep the RAM below 3600 MHz), but when I moved over to testing with Prime95, I ran into major problems.

29.4 still works great on all 3 9800X machines, but 29.5b10 cannot complete a stable LL double-check, no matter what settings I use.

I've set the CPU anywhere from 3.5 GHz to 4.1 GHz, and every setting produces hardware errors in LL DC. I've lowered the RAM from its rated 3600 MHz to the motherboard default of 2000 MHz (and also slowed the CAS latency and other RAM timings, even at 2000 MHz). As a last resort I tried a severely underclocked 3.6 GHz with increased CPU voltage (1.15 V) and 2000 MHz RAM with slower CL19 timings. All of this was on an 850 W PSU, with MHz/voltages monitored in CPU-Z and temps monitored. This failed too.

So I'm at a loss for what to try next, and feel this 29.5 release is still going to be problematic for some AVX512 CPUs, unless I'm missing some setting to try.

Feel free to suggest I move to another thread, because perhaps my systems are just unique.
Old 2019-02-12, 05:53   #349
simon389
 
Aug 2013

57₁₆ Posts

So this is peculiar. I am three hours into a blend torture test and everything looks great (perhaps temps get a bit high at times - 85C). But if I try to run an LL DC, it very quickly errors out with "Invalid FFT, restarting from save file". So the AVX512 torture test seems to be working, but the actual LL DC seems to be erroring. Gives me hope that maybe the problem is the beta code and not my hardware.

Last fiddled with by simon389 on 2019-02-12 at 05:57
Old 2019-02-12, 06:42   #350
GP2
 
Sep 2003

3²·7·41 Posts

Quote:
Originally Posted by simon389 View Post
Gives me hope that maybe the problem is the beta code and not my hardware.
But others are using the same code and getting matching LL double-checks and PRP double-checks.

In my case I'm using the Linux version so it's not quite the same, but surely many have used the Windows versions of the various 29.5 builds.

Maybe you could try running the Linux 64 version under WSL on the very same machines, and see what happens.

Or double-check prime.txt and local.txt to see if there's some unusual setting in there that no one else uses.
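
One way to spot an unusual setting is to diff the suspect prime.txt or local.txt against the same file from a known-good install. The sketch below is only an illustration in Python: it assumes the files are plain `key=value` lines with optional `[Section]` headers, and the function names `parse_settings` and `unusual_settings` are invented for this example.

```python
def parse_settings(text: str) -> dict:
    """Parse key=value lines, prefixing keys with their [Section] header."""
    settings = {}
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("["):
            section = line  # remember which section the next keys belong to
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[f"{section}{key.strip()}"] = value.strip()
    return settings


def unusual_settings(suspect: str, reference: str) -> dict:
    """Map each differing key to (suspect value, reference value)."""
    a, b = parse_settings(suspect), parse_settings(reference)
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(a.keys() | b.keys())
        if a.get(k) != b.get(k)
    }
```

A value of `None` in the result means the key exists in only one of the two files. Expect harmless differences as well, such as machine-specific identifiers.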

Last fiddled with by GP2 on 2019-02-12 at 07:04
Old 2019-02-12, 07:38   #351
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

11103₈ Posts

Quote:
Originally Posted by simon389 View Post
So this is peculiar. I am three hours into a blend torture test and everything looks great (perhaps temps get a bit high at times - 85C). But if I try to run an LL DC, it very quickly errors out with "Invalid FFT, restarting from save file". So the AVX512 torture test seems to be working, but the actual LL DC seems to be erroring. Gives me hope that maybe the problem is the beta code and not my hardware.
Have you tried moving the save file out of the folder and starting a test from scratch now that the hardware is more stable? Perhaps your save file is part of the problem? (Says the guy who doesn't know quite how the Gerbicz error check works, but has paid enough attention to have read that your machine somehow had an error despite the check)
Old 2019-02-12, 17:31   #352
simon389
 
Aug 2013

3·29 Posts

Quote:
Originally Posted by VBCurtis View Post
Have you tried moving the save file out of the folder and starting a test from scratch now that the hardware is more stable? Perhaps your save file is part of the problem? (Says the guy who doesn't know quite how the Gerbicz error check works, but has paid enough attention to have read that your machine somehow had an error despite the check)

Yes, I've tried starting from scratch many times. My Prime95 stress test (blend) is now going on 11 hours with no errors, but within 5 minutes of starting an LL DC it still errors out with "Invalid FFT". So strange...


Quote:
Originally Posted by GP2 View Post
But others are using the same code and getting matching LL double-checks and PRP double-checks.

In my case I'm using the Linux version so it's not quite the same, but surely many have used the Windows versions of the various 29.5 builds.

Maybe you could try running the Linux 64 version under WSL on the very same machines, and see what happens.

Or double-check prime.txt and local.txt to see if there's some unusual setting in there that no one else uses.

I've always used brand new installations of build 10, so I'm unsure how prime.txt and local.txt could have bad settings. I will install Linux, run Prime95 via Linux, and see if that helps.

Last fiddled with by simon389 on 2019-02-12 at 17:34

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.