mersenneforum.org

SkylakeX teasers (aka prime95 29.5) (https://www.mersenneforum.org/showthread.php?t=23723)

tServo 2019-01-30 16:26

[QUOTE=simon389;507158]I have four quad-channel AVX512 machines dedicated to Prime95 and all of them work fine in 29.4 but have random hardware errors on 29.5.

I have tried both 7820X and 9800X CPUs
I have tried two different kinds of quad-channel 3600 MHz RAM
I have tried both EVGA X299 Micro motherboards
I have invested in better coolers and kept temps below 70C
I have tried every build of 29.5 from 5-9
(Maybe a 400W platinum rated PSU isn’t enough?)

Hardware errors like 0.49 > 0.4 on all of them.

I’m rolling back to 29.4 until this hopefully gets sorted out someday. Kind of bummed because the optimizations really did make a big difference.[/QUOTE]

Yes, I would suspect the PSU. The 2 processors named can consume around 140-160 watts alone. Add the rest of the system and it could be close.

An easy test would be to run 29.4 with something consuming lots of power on your GPU (what is it, BTW?). I suggest mfaktc for Nvidia boards and mfakto for AMD. They are quick and easy to get running. Another suggestion would be a benchmark test for your GPU. Some high-end games have them built in.

This test is nice because you don't have to tear your system apart.

Another thing to try might be a benchmark or RAM test.

Keep us posted & good luck.

ET_ 2019-01-30 17:21

I have a 9800X on an Asus X299 PRO with RAM @3600, plus a WD hard disk (1TB), an Asus GT 710 and a Zotac RTX 2060, and decided to go for a CoolerMaster 850W gold PSU to be sure...

simon389 2019-01-30 19:56

It’s a 400W plat PSU for each CPU. So 4 PSUs. 29.4 at full load pulls 300W so I’m doubtful 29.5 means an additional 100W in load.

[QUOTE=tServo;507172]Yes, I would suspect the PSU. The 2 processors named can consume around 140-160 watts alone. Add the rest of the system and it could be close.

An easy test would be to run 29.4 with something consuming lots of power on your GPU (what is it, BTW?). I suggest mfaktc for Nvidia boards and mfakto for AMD. They are quick and easy to get running. Another suggestion would be a benchmark test for your GPU. Some high-end games have them built in.

This test is nice because you don't have to tear your system apart.

Another thing to try might be a benchmark or RAM test.

Keep us posted & good luck.[/QUOTE]

simon389 2019-01-30 19:58

[QUOTE=ATH;507171]Have you tried any double checks in 29.4 to test if they are producing good results?

Did you watch the CPU temperature when running 29.5? Was that 70C with 29.5?[/QUOTE]

I ran so many double checks that I’m almost positive about my conclusion. But I haven’t re-added the bad double checks from 29.5 to 29.4 to see if they correct themselves. Fun idea.

The CPU temp at load on 29.5 was around 70C.

Prime95 2019-01-30 21:27

Some versions of 29.5 had overly aggressive FFT crossovers. This could lead to large roundoff errors.

I suggest starting again with version 29.5 build 9. There are no known issues with that build.
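For readers unfamiliar with the roundoff check behind messages like "0.49 > 0.4", here is a toy illustration. It is not Prime95's actual code, only a minimal sketch that assumes a plain recursive FFT and a simple split of the number into fixed-size words. The function names `fft` and `square_with_roundoff` are made up for this example. It squares a number with a floating-point FFT and reports how far the results land from the nearest integer; packing more bits into each word (a more aggressive choice for a given number size) tends to push that error up.

```python
import cmath
import random


def fft(a, invert=False):
    """Recursive radix-2 FFT. len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def square_with_roundoff(x, bits_per_word, n_words):
    """Square x via a floating-point FFT.

    n_words must be a power of two and x must fit in
    bits_per_word * n_words bits.
    Returns (result, worst), where worst is the largest distance between
    a convolution output and the integer it was rounded to (at most 0.5).
    """
    mask = (1 << bits_per_word) - 1
    words = [(x >> (bits_per_word * i)) & mask for i in range(n_words)]
    # Zero-pad so the cyclic convolution equals the plain product.
    padded = [complex(w) for w in words] + [0j] * n_words
    spectrum = fft(padded)
    conv = fft([v * v for v in spectrum], invert=True)
    size = len(padded)
    result, worst = 0, 0.0
    for i, c in enumerate(conv):
        value = c.real / size          # inverse FFT scaling
        nearest = round(value)
        worst = max(worst, abs(value - nearest))
        result += nearest << (bits_per_word * i)  # shifts handle the carries
    return result, worst


if __name__ == "__main__":
    random.seed(1)
    n_words = 64
    for bits in (16, 20, 23):  # more bits per word = more aggressive
        x = random.getrandbits(bits * n_words)
        result, worst = square_with_roundoff(x, bits, n_words)
        print(f"{bits} bits/word: max roundoff {worst:.6f}, "
              f"exact: {result == x * x}")
```

A program that treats anything above 0.4 as suspect is leaving a safety margin below 0.5, the point where rounding can pick the wrong integer.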

simon389 2019-01-30 21:57

[QUOTE=Prime95;507210]Some versions of 29.5 had overly aggressive FFT crossovers. This could lead to large roundoff errors.

I suggest starting again with version 29.5 build 9. There are no known issues with that build.[/QUOTE]

I’m not sure this is for me, but my testing included build 9. I’ll upgrade to an 850W PSU on one machine to see if that does the trick, but I find it unlikely considering I already have a 400W PSU and 29.4 only pulls 300W. Otherwise it seems to me that the AVX512 optimizations are not stable.

Mysticial 2019-01-30 22:23

[QUOTE=simon389;507217]I’m not sure this is for me, but my testing included build 9. I’ll upgrade to an 850W PSU on one machine to see if that does the trick, but I find it unlikely considering I already have a 400W PSU and 29.4 only pulls 300W. Otherwise it seems to me that the AVX512 optimizations are not stable.[/QUOTE]

How sure are you that your machine is actually AVX512-stable? There's a lot more to these chips than just power draw. Each workload type (non-AVX, AVX, AVX512) has different frequencies, voltages, impedances, etc.

If you've never run AVX512 before on it and it's now failing with Prime95 AVX512, then it doesn't rule out the hardware.

Even if you're not overclocked, you can still be AVX512 unstable. A lot of the mobo vendors messed up their BIOS settings to the point that they will crash under AVX512.

simon389 2019-01-31 05:53

[QUOTE=Mysticial;507220]How sure are you that your machine is actually AVX512-stable? There's a lot more to these chips than just power draw. Each workload type (non-AVX, AVX, AVX512) has different frequencies, voltages, impedances, etc.

If you've never run AVX512 before on it and it's now failing with Prime95 AVX512, then it doesn't rule out the hardware.

Even if you're not overclocked, you can still be AVX512 unstable. A lot of the mobo vendors messed up their BIOS settings to the point that they will crash under AVX512.[/QUOTE]

My mobo is not stable even at default settings. I am unsure how to do what you suggest. Is there another AVX512 stability test I could try?

tServo 2019-01-31 14:54

[QUOTE=simon389;507236]My mobo is not stable even at default settings. I am unsure how to do what you suggest. Is there another AVX512 stability test I could try?[/QUOTE]

AIDA64 has stress tests that will test AVX512 (along with everything else). You have to select 'stress FPU' to do that, I believe.

They have a free download (I don't know how it differs from the pay version) at [url]www.aida64.com[/url]

AIDA64 is considered to be an excellent utility with a good reputation.

PhilF 2019-02-01 02:56

[QUOTE=simon389;507217]I’m not sure this is for me, but my testing included build 9. I’ll upgrade to an 850W PSU on one machine to see if that does the trick, but I find it unlikely considering I already have a 400W PSU and 29.4 only pulls 300W. Otherwise it seems to me that the AVX512 optimizations are not stable.[/QUOTE]

Actually, you can't run a 400W power supply at 400W reliably.

How close you can get depends on the power supply's quality and how much the manufacturer fudged the rated specs. I can easily see a 400W unit being reliable at 300W while an additional 20 or 30 watts causes the machine to become flaky. A newer version of Prime95 could indeed draw 20 or 30 watts more when using AVX512.
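To put rough numbers on that, here is a back-of-the-envelope sketch using the wattages quoted in this thread. The 80% "comfortable load" threshold is purely an assumption for illustration, not a spec for any particular power supply, and the function names are made up for this example.

```python
# Rough PSU headroom check using the wattages mentioned in this thread.
# ASSUMPTION: SAFE_FRACTION is a hypothetical rule of thumb, not a measured
# or vendor-specified limit for any particular power supply.
SAFE_FRACTION = 0.80


def load_fraction(draw_watts, rated_watts):
    """Return the fraction of the PSU's rated output that is being drawn."""
    return draw_watts / rated_watts


def is_marginal(draw_watts, rated_watts, safe_fraction=SAFE_FRACTION):
    """True when the draw exceeds the assumed safe fraction of the rating."""
    return load_fraction(draw_watts, rated_watts) > safe_fraction


if __name__ == "__main__":
    rated = 400      # PSU rating quoted in the thread
    baseline = 300   # reported full-load draw under 29.4
    for extra in (0, 20, 30):  # possible extra AVX512 draw, in watts
        draw = baseline + extra
        print(f"{draw}W of {rated}W = {load_fraction(draw, rated):.0%}, "
              f"marginal: {is_marginal(draw, rated)}")
```

Where the real limit sits depends on the unit, so measuring the draw at the wall under 29.5 would say more than any assumed threshold.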

GP2 2019-02-01 04:37

[QUOTE=simon389;507053][url]https://www.mersenne.org/report_exponent/?exp_lo=51794089&full=1[/url][/QUOTE]

This exponent ([M]51794089[/M]) is currently assigned to someone else (probably an anonymous churner), but [M]47526449[/M] and [M]49834937[/M] are also mismatching results from your machines.

I am running a triple check on them now with 29.5 build 9 on an 8-core Skylake AVX-512 cloud machine (c5.4xlarge), and the log files are showing nothing but iteration counts so far.

Chances are it's your hardware setup not handling AVX-512, rather than the 29.5 code per se.

