Hi,
It's been suggested to me that I share here my experience running Prime95 (v29.8 build 6) on an AMD Threadripper 3970X. The post is quite long so instead of copy/pasting it here, let me give a link to Level1Techs forum where you will find all the details: [url]https://forum.level1techs.com/t/amd-threadripper-3970x-under-heavy-avx2-load-defective-by-design[/url] The conversation is also happening over at Hacker News, with some informative posts: [url]https://news.ycombinator.com/item?id=22382946[/url] Hopefully you don't mind linking... Thanks for the great work on Prime95: Over the past weeks I've been peeking at the code (being a C/C++ developer myself, but in the field of rendering) and I'm quite in awe at the level of optimization of this program. |
Hi there and welcome to the forum.
The "instant" part in your fail report, together with the fact that 8k AVX2 passes, points a lot towards a software bug. I am sure George didn't have a system with so many cores at hand when he wrote the benchmarks. Hopefully he sees the thread soon, or you can PM him on this forum (user Prime95) and point him to the thread. He is generally a busy person, but he will take the time to look into it as soon as possible. |
[QUOTE=LaurV;538142]The "instant" part in your fail report, together with the fact that 8k AVX2 passes, points a lot towards a software bug.[/QUOTE]
Doubtful, but that's just my personal opinion. |
I also doubt it.
We are currently investigating together with AMD what's really going on. The current thinking is that this may be an issue with how the VRMs (part of the power delivery network on the motherboard) are programmed to react to low-to-high and high-to-low load transitions. We will keep you posted here. That said, speaking for myself, I would love to hear from George on this topic that he surely knows a lot about. |
[QUOTE=franz;538174]
That said, speaking for myself, I would love to hear from George on this topic that he surely knows a lot about.[/QUOTE] I know less than you think. While I'm not saying it is impossible for this to be a prime95 bug, each torture test worker is running the same code. That the 16K FFT works on Intel chips, when fewer workers are running, and on some Zen systems, strongly indicates the problem is not in prime95. Prime95 on my first Haswell systems had a similar problem. The voltage changes were not handled well, causing a crash. I worked around the issue by disabling "C states" so that the chip didn't try to drop the voltage nearly as much. It is puzzling that the crash only happens with the 16K FFT, but I think the AMD engineers are better placed to figure out why. |
Thanks for chiming in George.
Indeed, Prime95 with 16K FFTs passes even on my 3970X if I use fewer worker threads: it fails instantly with 64, after a second with 62, after a few seconds with 58, and it's stable for at least 15 minutes with only 32, for instance. Interesting to hear about C-states. Not sure it's a realistic option on the 3970X since the TDP is so high that one will certainly want to save some electricity at idle. |
Out of curiosity, I tried disabling C-states but that didn't solve the issue, although P95/FFT16K seems to only fail after 1-3 seconds instead of within a fraction of a second.
|
[QUOTE=JuanTutors;537349]I'm getting the following error (see image) when I run ECM2 on M332XXXXXX..[/QUOTE]
The GMP library is running out of memory in mpz_gcdext. Catching allocation errors in GMP is non-trivial. |
I apologize if this is not the right place for my question, but why does my 9900K only pass at 1.38 V in version 29.8 build 5, while in version 29.8 build 6 it passes at 1.36 V? Is there a technical explanation, or a fix in the new build?
|
[QUOTE=HLB;538729]I apologize if this is not the right place for my question, but why does my 9900K only pass at 1.38 V in version 29.8 build 5, while in version 29.8 build 6 it passes at 1.36 V? Is there a technical explanation, or a fix in the new build?[/QUOTE]
No changes in prime95 to explain that. |
George... Thanks a lot for the new ExitWhenOutOfWork functionality. It is working out perfectly for my use-case.
One minor change request, please... When mprime exits gracefully, could it please delete its PID file? Thanks. Edit: I just observed an mprime instance finish its assignment, and exit cleanly. It removed its PID file. Perhaps it's left around when SIGINT'ed? Or perhaps I'm insane (non-zero probability)... |
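For readers wondering how a PID file can survive a SIGINT but not a clean exit, here is a minimal Python sketch of the general pattern. It is not mprime's actual code, and the names `create_pid_file` and `exit_on_signals` are hypothetical. The idea is that cleanup registered for normal exit does not run when a process is killed by a signal's default action, unless the signal is first converted into a normal exit.

```python
import atexit
import os
import signal
import sys


def create_pid_file(path):
    """Write the current PID to `path` and register its removal at exit.

    Hypothetical helper for illustration; returns the cleanup function
    so callers can also remove the file explicitly.
    """
    with open(path, "w") as f:
        f.write(f"{os.getpid()}\n")

    def remove_pid_file():
        # Tolerate the file already being gone so cleanup is idempotent.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass

    # atexit hooks run on normal interpreter exit, including sys.exit().
    atexit.register(remove_pid_file)
    return remove_pid_file


def exit_on_signals(signums=(signal.SIGINT, signal.SIGTERM)):
    """Convert the given signals into a normal exit so atexit hooks run.

    Must be called from the main thread. A process terminated by a
    signal's default action never reaches its atexit hooks, which is
    one way a PID file can be left behind.
    """
    def handler(signum, frame):
        # Exit status 128 + signal number follows the common shell convention.
        sys.exit(128 + signum)

    for signum in signums:
        signal.signal(signum, handler)
```

A program would call `create_pid_file("/path/to/app.pid")` and then `exit_on_signals()` early in startup. A process killed with SIGKILL still cannot clean up, so a stale PID file remains possible in that case.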