Issue with Broadwell-E and mprime?
Hi,
I stress test in Linux using mprime and have been overclocking an i7-6950X (the new 10-core Extreme CPU). I started encountering an issue that I never had before: while stress testing, mprime kills itself after about 15 minutes. No errors are posted in results.txt or anywhere else. All load just drops on the CPUs and the system stays up with zero errors, zero temperature issues, etc. Currently using mprime v27.9. Is there any known bug with the new Broadwell-E CPUs? I have not experienced this issue on anything ranging from 4.0GHz to 4.5GHz, but while I am pushing much higher this keeps happening. |
[QUOTE=Akujik;439273]Is there any known bug with the new Broadwell-E CPUs? I have not experienced this issue on anything ranging from 4.0GHz to 4.5GHz, but while I am pushing much higher this keeps happening.[/QUOTE]
There is your answer. Keep it clocked to 4.5 GHz or under. It's not stable higher. |
[QUOTE=Mark Rose;439280]There is your answer. Keep it clocked to 4.5 GHz or under. It's not stable higher.[/QUOTE]
How is that an answer? I have it stable at 4.7. I just cannot get cache at default or higher - these issues will always happen. Note, I said these issues happen over 4.5 and haven't before. I did not say I didn't have a system stable over those speeds. And I don't see how instability is proven by the program just stopping rather than reporting an error. To have my answer, it needs to actually answer my questions: the segmentation fault, and why the program will sometimes just stop rather than reporting errors. The segmentation fault seems to have been a bug before, as there was an official reply talking about fixing it, but my version shouldn't have this bug according to that post, and it was not always related to instability issues. |
[QUOTE=Akujik;439286]How is that an answer?
I have it stable at 4.7. I just cannot get cache at default or higher - these issues will always happen. - Note, I said these issues happen over 4.5 and haven't before.. I did not say I didn't have a system stable over those speeds. And I don't see how instability is proven by the program just stopping rather than reporting an error. To have my answer it needs to actually answer my questions - Segmentation Fault and why the program will sometimes just stop rather than reporting errors. With segmentation fault it seems like this was a bug before as there was an official reply talking about fixing the bug regarding segmentation fault before.. but my version shouldn't have this bug according to that post, and not always having to do with instability issues.[/QUOTE]
If you run mprime in GDB / LLDB then you should actually get useful information when it crashes, such as what it was doing. Most likely Mark is right - your overclock reached a point of instability where a calculation was wrong or a bit got flipped, and the program reached what should have been an impossible state. mprime isn't exactly high-security, defensively programmed software; it is optimized for performance. As a result it likely makes assumptions about the output of an operation based on the input (trivial example - adding two 8-bit numbers and storing the result in a 32-bit word should always leave the upper 16 bits as zero - but if an OC becomes unstable those bits might not be 0, and a later operation depending on that state generates a seg fault.) |
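To make the trivial example above concrete, here is a rough Python sketch. It is not taken from mprime; `add_bytes`, `lookup`, `TABLE` and the `flip_bit` parameter are all made up for illustration. It simulates a single flipped bit in the 32-bit result and shows how code that trusts "the upper bits are always zero" then indexes far outside its table. Python raises an `IndexError` here; unchecked native code would instead read some arbitrary address, which is one way a seg fault can appear with no error logged.

```python
def add_bytes(a, b, flip_bit=None):
    """Add two 8-bit values into a 32-bit word.

    flip_bit is a hypothetical knob that flips one bit of the result
    to mimic a hardware fault; real hardware gives no such warning.
    """
    result = ((a & 0xFF) + (b & 0xFF)) & 0xFFFFFFFF
    if flip_bit is not None:
        result ^= 1 << flip_bit
    return result


# The sum of two bytes is at most 510, so 512 entries "always" suffice.
TABLE = list(range(512))


def lookup(a, b, flip_bit=None):
    """Use the sum as an index, trusting that it fits in the table."""
    return TABLE[add_bytes(a, b, flip_bit)]


print(lookup(200, 100))           # healthy hardware: index 300 is in range
try:
    lookup(200, 100, flip_bit=20)  # one flipped bit: index is now over a million
except IndexError as exc:
    print("impossible state reached:", exc)
```

The point is only that the check the program skipped for speed is the one that would have turned the fault into a clean error message.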
Less technically: If a program crashes under certain overclocking conditions but does not crash without the overclock, the problem is with the hardware, not the software.
mprime is not the only program I've had just disappear (silently crash) when my OC is too aggressive. It may be the only program that you've tested that does so, but all the same it is a sign you're too close to the edge for real stability. There are uses (e.g. gaming) where this state is acceptable, but for scientific computation you should back off the OC settings. |
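If you want to confirm that a "silent" disappearance was really a crash, one option is to launch the program from a small wrapper and look at how it exited. The sketch below is only an illustration: `describe_exit` and `run_and_report` are made-up helper names, and the `./mprime` path in the comment is an assumption about where your binary lives. It relies on the documented behaviour of Python's `subprocess` on POSIX systems, where a negative return code means the child was killed by that signal number.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable message."""
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = str(-returncode)
        return f"killed by signal {name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


def run_and_report(argv):
    """Run argv to completion and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)


# Hypothetical usage (path is an assumption, adjust to your install):
#     print(run_and_report(["./mprime"]))

# Self-contained demo: a child process that raises SIGSEGV on itself.
demo_child = [sys.executable, "-c",
              "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
print(run_and_report(demo_child))
```

A result such as "killed by signal SIGSEGV" tells you the process did not simply decide to stop, even though nothing was written to results.txt.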
[QUOTE=Akujik;439286]I have it stable at 4.7.[/QUOTE]
Clearly, you don't. |
[QUOTE=Akujik;439286]How is that an answer?
I have it stable at 4.7. [/QUOTE]
So... you can rev your RPM to 11,000 while standing still, but not while driving? Maybe you should not have skipped those physics classes in high school? Just maybe... |
@OP: The P95 distribution comes with a file called "stress.txt". Read it.
[QUOTE="stress.txt", last FAQ]
Q) A forum member said "Don't bother with prime95, it always pukes on me, and my system is stable!. What do you make of that?" or "We had a server at work that ran for 2 MONTHS straight, without a reboot. I installed Prime95 on it and ran it - a couple minutes later I get an error. You are going to tell me that the server wasn't stable?"
A) These users obviously do not subscribe to the 100% rock solid school of thought. [COLOR=Red][B]THEIR MACHINES DO HAVE HARDWARE PROBLEMS.[/B][/COLOR] But since they are not presently running any programs that reveal the hardware problem, the machines are quite stable. As long as these machines never run a program that uncovers the hardware problem, then the machines will continue to be stable.
[/QUOTE] |
[QUOTE=Akujik;439273]Currently using mprime v27.9.[/QUOTE]
Try upgrading to v28.9 (the latest) and see if that works for you. It works slightly differently internally (with potentially multiple helper threads for each worker thread), so who knows, maybe there's a slight possibility that the problem won't be triggered. |
[QUOTE=chalsall;439292]Clearly, you don't.[/QUOTE]
30 hours on mprime with 47/25 isn't good? I clearly don't have it stable with a higher cache, though, since that is what I am explaining; I wanted to know about those 2 specific issues/failures. You are right about that. |
[QUOTE=GP2;439313]Try upgrading to v28.9 (the latest) and see if that works for you.
It works slightly differently internally (with potentially multiple helper threads for each worker thread), so who knows, maybe there's a slight possibility that the problem won't be triggered.[/QUOTE]
Thanks, will give it a try. |