mersenneforum.org  

Old 2017-05-10, 22:11   #1
dkemppai
 
May 2017
Michigan, USA

7 Posts
Prime95, V28.10 Build1 x64, Lockup Stopping workers

Hi,

Hate to post here, and I hope this is the right forum. Google seems to be failing me...

Prime95 runs without errors until I "Stop" all workers. I just finished a 44-hour stress test with Small FFTs, showing no failures in any of the 8 workers. As soon as I hit the "Stop" button to stop all threads, the system locked up hard. No BSOD, no warnings, just a complete lockup. I've had this happen a number of times now.

Running Win10 64-bit, 16 GB RAM, FX-9590, mild OC at 4800 MHz, Turbo turned off. LLC adjusted to keep core voltage at the same level as at no load. Temps were 20C below the thermal limit (49C, thermal limit 70C).

During the run, I was able to open web pages, format drives, run system monitors, and use the PC as normal. No hint of anything unusual.

I've tried to google this, and am having trouble finding info on lockups when stopping the workers. If any of you have any ideas what may be going on, please post some links or give me a few hints. I'm inclined to believe the build is stable. However, I'm not 100% sure it isn't hardware related.

If I can get this to repeat reliably, I may put a good scope on the board to check core voltage, etc. to see what the transients are on load step changes.

Also, I'm very sure Prime95 is the best stress test out there! :) So, It could very well be hardware related, I'm just not sure where to look... :)

Thanks,
Dan
Old 2017-05-11, 02:36   #2
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

B25₁₆ Posts

If you run the CPU at stock clocks, does it still happen?
Old 2017-05-11, 03:05   #3
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

3·2,383 Posts

I had something similar happen on an overclocked Haswell system I built. I surmised that the CPU had problems backing down from the 0.1V voltage increase from prime95's use of AVX instructions and/or the voltage drop due to power-saving kicking in.

IIRC, either the problem went away when I disabled C states or I just ignored the problem since I simply run prime95 24/7.
Old 2017-05-11, 12:38   #4
dkemppai
 
May 2017
Michigan, USA

7 Posts
Stability

Hi Mark,

So I believe it is stable at stock SPEEDS, but let me qualify. It is not stable at all stock settings, as the vdroop is a bit too much with default settings for the board/chip. A small amount of LLC was needed. Most of the time stopping all workers causes no problems. However, it happens once in a while even at stock speeds (4.7 GHz).

I should clarify: the lockup doesn't happen all the time at 4.8 GHz, just once in a while when stopping workers. Other than when stopping workers, it hasn't locked up.


Prime95,

Yeah, I'm guessing the VRM transients may be the root of the problem. Just 'feels' like a load dump issue on the VRM. I may put a scope on the VRM output to see what the transients are like. This system is a Frankenstein, and the FX-9590 labeled chips from AMD are a bit dubious anyway...

Might pull the clock back to 4.5 or 4.3 and try it there. If I can get it unconditionally stable I'll use it as a number crunching box or similar...


Thanks!
Dan
Old 2017-05-11, 17:50   #5
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

3²×317 Posts

Have you tried a different power supply, too?
Old 2017-05-11, 19:10   #6
dkemppai
 
May 2017
Michigan, USA

7₁₆ Posts

Quote:
Originally Posted by Mark Rose
Have you tried a different power supply, too?

Yes.

The stock supply in the case was 350W. That dropped to 11.6V under load at the CPU VRM supply pins. Plopped a 750W supply in there to see if it helped. It did not. That drops to 11.95V under load at the CPU VRM supply pins.

No high-end video card (just a Quadro K620), one SSD, and two 1Gb platter drives are all that's in the system.


Dan
Old 2017-05-11, 19:48   #7
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

3²×317 Posts

11.6 V is in spec, but not great.

That being said, I don't know what else to suggest.
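
For anyone wanting to sanity-check the "in spec" remark, here is a minimal sketch. It assumes the usual ATX tolerance of ±5% on the +12 V rail (that figure is not stated in this thread), and `rail_status` is a made-up helper name used only for illustration. The two readings are the ones Dan measured at the CPU VRM supply pins.

```python
# Assumption: the ATX design guides allow +/-5% on the +12 V rail.
# This tolerance is not quoted in the thread; adjust it if your spec differs.
NOMINAL_V = 12.0
TOLERANCE = 0.05


def rail_status(measured_v, nominal_v=NOMINAL_V, tolerance=TOLERANCE):
    """Return (in_spec, droop_percent) for a measured rail voltage."""
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    droop_percent = (nominal_v - measured_v) / nominal_v * 100
    return low <= measured_v <= high, droop_percent


# Readings reported in the thread, taken under load.
for label, volts in [("350 W supply", 11.6), ("750 W supply", 11.95)]:
    ok, droop = rail_status(volts)
    print(f"{label}: {volts:.2f} V, droop {droop:.1f}%, in spec: {ok}")
```

Under that assumption the lower limit is about 11.4 V, so 11.6 V passes but with little margin. Note this only looks at the steady-state level under load; it says nothing about the fast transients on Vcore that a scope would show.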
Old 2017-05-12, 12:18   #8
dkemppai
 
May 2017
Michigan, USA

7₈ Posts

Quote:
Originally Posted by Mark Rose
11.6 V is in spec, but not great.

That being said, I don't know what else to suggest.

Yeah, it's a little weird. I didn't really want to bother everyone on the forum, but wanted to make sure it wasn't a common P95 issue. At this point it sounds like it's not Prime95. Therefore it's probably a safe assumption it's a hardware issue...

I'm leaning towards a VRM issue on the main board. The in-laws were in town the last few days, so maybe this evening I'll throw my scope on the Vcore rail to see how it responds to transients. If I can find anything definitive, I'll be sure to post it.

Thanks!
Dan
Old 2017-05-12, 20:39   #9
VictordeHolland
 
"Victor de Hollander"
Aug 2011
the Netherlands

498₁₆ Posts

An AMD FX-9590 is a Frankenstein monster: at stock it is 220W and 4.7GHz. Many motherboards don't have the proper voltage regulation to cope with that kind of power...
Did you try disabling C-states in the BIOS?

Edit:
This thread belongs in the Q&A forum.

Last fiddled with by VictordeHolland on 2017-05-12 at 20:40
Old 2017-05-12, 21:05   #10
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

10010001010111₂ Posts

Quote:
Originally Posted by dkemppai
So maybe this evening I'll throw my scope on the Vcore rail to see how it responds to transients. If I can find anything definitive, I'll be sure to post it.

You have a scope? Cool.

I haven't played with one of those for many years. But a partner of mine had two, one of which was a multi-probe digital scope with a trigger and memory. It helped us figure out a subtle problem with an analogue circuit he had designed.
Old 2017-05-13, 03:25   #11
dkemppai
 
May 2017
Michigan, USA

7 Posts

Quote:
Originally Posted by VictordeHolland
An AMD FX-9590 is a Frankenstein monster: at stock it is 220W and 4.7GHz. Many motherboards don't have the proper voltage regulation to cope with that kind of power...
Did you try disabling C-states in the BIOS?

Edit:
This thread belongs in the Q&A forum.

All of the power saving features have been disabled in the BIOS.

Just tried to scope the Vcore voltage, and one of the inductors for the VRM had a bad solder joint on it! It looks like it never flowed, or flowed poorly and melted later. More testing to be done now that it's reflowed.

If this thread belongs elsewhere, and you have the ability, please move it to its correct location.

Thanks,
Dan