mersenneforum.org  

Old 2020-02-21, 15:13   #474
franz
 
Feb 2020

2×3 Posts

Hi,

It's been suggested to me that I share here my experience running Prime95 (v29.8 build 6) on an AMD Threadripper 3970X.

The post is quite long so instead of copy/pasting it here, let me give a link to Level1Techs forum where you will find all the details:

https://forum.level1techs.com/t/amd-...tive-by-design

The conversation is also happening over at Hacker News, with some informative posts:

https://news.ycombinator.com/item?id=22382946

Hopefully you don't mind my linking to them...

Thanks for the great work on Prime95. Over the past few weeks I've been peeking at the code (I'm a C/C++ developer myself, though in the field of rendering), and I'm quite in awe of the level of optimization in this program.
Old 2020-02-22, 13:56   #475
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

22A7₁₆ Posts

Hi there and welcome to the forum.
The "instant" part in your fail report, together with the fact that 8k AVX2 passes, points a lot towards a software bug. I am sure George didn't have a system with so many cores at hand when he wrote the benchmarks.

Hopefully he sees the thread soon, or you can PM him on this forum (user Prime95) and point him to the thread. He is generally a busy person, but he will take the time to look into it as soon as possible.

Last fiddled with by LaurV on 2020-02-22 at 13:58
Old 2020-02-22, 15:32   #476
axn
 
Jun 2003

5·23·41 Posts

Quote:
Originally Posted by LaurV
The "instant" part in your fail report, together with the fact that 8k AVX2 passes, points a lot towards a software bug.
Doubtful, but that's just my personal opinion.
Old 2020-02-23, 08:44   #477
franz
 
Feb 2020

110₂ Posts

I also doubt it.

We are currently investigating together with AMD what's really going on. The current thinking is that this may be an issue with how the VRMs (part of the power delivery network on the motherboard) are programmed to react to low-to-high and high-to-low load transitions. We will keep you posted here.

That said, speaking for myself, I would love to hear from George on this topic, which he surely knows a lot about.
Old 2020-02-24, 03:34   #478
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

15742₈ Posts

Quote:
Originally Posted by franz
That said, speaking for myself, I would love to hear from George on this topic, which he surely knows a lot about.
I know less than you think.

While I'm not saying it is impossible for this to be a prime95 bug, each torture test worker is running the same code. That the 16K FFT works on Intel chips, when fewer workers are running, and on some Zen systems, strongly indicates the problem is not in prime95.

Prime95 on my first Haswell systems had a similar problem. The voltage changes were not handled well, causing a crash. I worked around the issue by disabling "C states" so that the chip didn't try to drop the voltage nearly as much.
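
The post does not say how C-states were disabled; a firmware (BIOS) option is the usual route. For anyone wanting to repeat the experiment at runtime, here is a rough sketch that assumes a Linux system and root privileges: the kernel's PM QoS device node /dev/cpu_dma_latency keeps CPUs out of deep idle states for as long as a file descriptor on it stays open. The function names are made up for illustration, and this only limits idle states; it does not change how the VRMs respond to load transitions.

```python
import os
import struct

# Linux PM QoS device node; assumed to exist and to be writable (normally root only).
CPU_DMA_LATENCY = "/dev/cpu_dma_latency"


def request_max_wakeup_latency(max_latency_us=0, path=CPU_DMA_LATENCY):
    """Ask the kernel to avoid idle states whose exit latency exceeds max_latency_us.

    The request stays active only while the returned file descriptor is open.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # The kernel expects a native-endian signed 32-bit value in microseconds.
        os.write(fd, struct.pack("=i", max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd


def release_latency_request(fd):
    """Drop the request so the kernel may use deep C-states again."""
    os.close(fd)


# Typical use (hypothetical):
#   fd = request_max_wakeup_latency(0)
#   ... run the torture test in another terminal ...
#   release_latency_request(fd)
```

A value of 0 is the most aggressive request; a larger value still allows the shallow idle states whose exit latency fits within it.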

It is puzzling that the crash only happens with the 16K FFT, but I think the AMD engineers are better placed to figure out why.
Old 2020-02-24, 07:26   #479
franz
 
Feb 2020

6₈ Posts

Thanks for chiming in, George.

Indeed, Prime95 with 16K FFTs passes even on my 3970X if I use fewer worker threads:
it fails instantly with 64, after a second with 62, after a few seconds with 58, and it's stable for at least 15 minutes with only 32, for instance.

Interesting to hear about C-states. Not sure it's a realistic option on the 3970X since the TDP is so high that one will certainly want to save some electricity at idle.
Old 2020-02-24, 08:44   #480
franz
 
Feb 2020

2×3 Posts

Out of curiosity, I tried disabling C-states but that didn't solve the issue, although P95/FFT16K seems to only fail after 1-3 seconds instead of within a fraction of a second.
Old 2020-02-25, 04:39   #481
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

1BE2₁₆ Posts

Quote:
Originally Posted by JuanTutors
I'm getting the following error (see image) when I run ECM2 on M332XXXXXX..
The GMP library is running out of memory in mpz_gcdext. Catching allocation errors in GMP is non-trivial.
Old 2020-03-02, 14:31   #482
HLB
 
Dec 2017

22 Posts

I apologize if this is not the right place for my question, but why does my 9900K only pass with 1.38 V in version 29.8 build 5, while it passes with 1.36 V in version 29.8 build 6? Is there any technical explanation, or a fix in the new build?
Old 2020-03-02, 18:58   #483
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·43·83 Posts

Quote:
Originally Posted by HLB
I apologize if this is not the right place for my question, but why does my 9900K only pass with 1.38 V in version 29.8 build 5, while it passes with 1.36 V in version 29.8 build 6? Is there any technical explanation, or a fix in the new build?
No changes in prime95 to explain that.
Old 2020-03-04, 18:02   #484
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2×4,643 Posts

George... Thanks a lot for the new ExitWhenOutOfWork functionality. It is working out perfectly for my use-case.

One minor change request, please... When mprime exits gracefully, could it please delete its PID file?

Thanks.

Edit: I just observed an mprime instance finish its assignment, and exit cleanly. It removed its PID file. Perhaps it's left around when SIGINT'ed? Or perhaps I'm insane (non-zero probability)...
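
Whether a PID file survives a SIGINT depends on how the signal is handled, which is left open above. As a minimal sketch of one common pattern (not mprime's actual code, and written in Python rather than C for brevity): register the cleanup with atexit, then turn SIGINT and SIGTERM into a normal interpreter exit so the cleanup still runs. The name `install_pid_file` and the path in the usage comment are hypothetical.

```python
import atexit
import os
import signal
import sys


def install_pid_file(path):
    """Write the current PID to path and arrange for the file to be removed on exit.

    Returns the cleanup function so callers can also invoke it explicitly.
    Must be called from the main thread, because it installs signal handlers.
    """
    with open(path, "w") as f:
        f.write(f"{os.getpid()}\n")

    def remove():
        # Safe to call more than once: a missing file is not an error.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass

    # Runs on normal interpreter exit, including exits triggered by sys.exit().
    atexit.register(remove)

    def on_signal(signum, frame):
        # Raise SystemExit instead of dying abruptly, so atexit handlers run.
        sys.exit(128 + signum)

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, on_signal)

    return remove


# Typical use (hypothetical path):
#   install_pid_file("/tmp/worker.pid")
#   ... do the work; Ctrl-C now removes the PID file on the way out ...
```

A process killed with SIGKILL cannot clean up at all, so a stale PID file is still possible and any supervisor should check that the recorded PID is alive.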

Last fiddled with by chalsall on 2020-03-04 at 18:33

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.