mersenneforum.org  

Old 2010-06-11, 13:08   #1
esqrkim
 
Mar 2010
California, USA

11010₂ Posts
Sudden sumout error

I had successfully completed 6 LL tests on a quad-core processor when the computer suddenly crashed with a sumout error, about 75% of the way through an LL test on one of the cores. For several months I had been running LL tests 24/7 on all four cores without a single problem, until last night. Now the computer has trouble loading Windows. What could be the problem?

Thanks for your help.
Old 2010-06-11, 14:15   #2
Rhyled
 
May 2010

63₁₀ Posts
Temperatures?

SUMOUT errors are usually hardware failures, from what I've seen. Since your system has been stable for months, I'd look for something that has changed: perhaps dust buildup in the heatsink or a fan failure?

If you don't have another option, and Windows can still run, try RealTemp to check your CPU temperatures and make sure they're not skyrocketing. http://www.techpowerup.com/realtemp/
Old 2010-06-11, 20:12   #3
sdbardwick
 
Aug 2002
North San Diego County

1252₈ Posts

IME, the most common causes of sudden Prime95 failure after an extended period of faultless operation are:
0. Unstable overclock of CPU or RAM.
1. Heat due to clogged/stopped CPU HSF.
2. Memory failure. Use Memtest86+ to verify.
3. Video driver bugs (primarily NVIDIA); this has become much less likely than in years past, and is less likely than the first three.
4. Mainboard failure.
5. CPU failure.

Last fiddled with by sdbardwick on 2010-06-11 at 20:14 Reason: added 0 and 5
Old 2010-06-12, 22:46   #4
esqrkim
 
Mar 2010
California, USA

2·13 Posts

Well, I turned on the computer this morning and ran the torture test for about 8 hours without overclocking. There was no problem.

The core that gave the sumout error had completed about 75% of the iterations for M50315939. The strange thing is that it lost all previous results for M50315939 and has started from zero. Is this normal when a sumout error occurs?

I am thinking that there might have been some software conflict. I allowed the system to install some Windows updates the night before. I don't think it ever prompted me for a restart. Has anyone experienced a sumout error that was related to software issues?
Old 2010-06-12, 23:00   #5
cheesehead
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

7692₁₀ Posts

Quote:
Originally Posted by esqrkim
The core that gave the sumout error had completed about 75% of the iterations for M50315939. The strange thing is that it lost all previous results for M50315939 and has started from zero. Is this normal when a sumout error occurs?
No, that's unusual. There should have been a save (checkpoint) file, from which it could have continued.

Quote:
I am thinking that there might have been some software conflict. I allowed the system to install some Windows updates the night before. I don't think it ever prompted me for a restart. Has anyone experienced a sumout error that was related to software issues?
Yes, I have.
Old 2010-06-13, 01:29   #6
Rhyled
 
May 2010

3²·7 Posts
Curiouser and curiouser...

When a SUMOUT error occurs, Prime95 is supposed to return to the previous checkpoint file. That should only cost you 30 minutes or so of lost effort.

Here's a bizarre thought. Check your anti-virus log and see if it quarantined your checkpoint file. There should be 3 versions of your backup files (mine are listed as p3J15893, p3J15893.bu and p3J15893.bu2) in your Prime95 folder. I guess it's theoretically possible that the semi-random contents of those backup files matched one of your virus signatures and got yanked.
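
A quick way to see which of those save files are actually present is a small script like the one below. This is only a sketch: the folder path is a hypothetical install location, the function name is made up for illustration, and the file stem is simply the example name quoted above (yours will differ).

```python
from datetime import datetime
from pathlib import Path


def find_save_files(folder, stem):
    """Return (name, size_in_bytes, last_modified) for each save file found.

    Checks the main save file plus the .bu and .bu2 backups described above.
    """
    found = []
    for suffix in ("", ".bu", ".bu2"):
        path = Path(folder) / (stem + suffix)
        if path.is_file():
            info = path.stat()
            found.append(
                (path.name, info.st_size, datetime.fromtimestamp(info.st_mtime))
            )
    return found


if __name__ == "__main__":
    # Assumed folder; the stem is the example from this post, not a real assignment.
    for name, size, mtime in find_save_files(r"C:\Prime95", "p3J15893"):
        print(f"{name}: {size} bytes, last written {mtime:%Y-%m-%d %H:%M}")
```

If nothing is listed, or the timestamps are much older than the save interval, the files were probably never written or were removed afterwards.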

I really need to get my avatar uploaded.
Old 2010-06-13, 03:44   #7
cheesehead
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

2²×3×641 Posts

Quote:
Originally Posted by esqrkim
The strange thing is that it lost all previous results for M50315939 and has started from zero. Is this normal when a sumout error occurs?
Other things to check: How often is your save file written (the default is every 30 minutes)? Now that you've started again, do save files exist for the current run? Is the folder to which the save files are written write-protected, so that the program does not have the authority to write there?
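
The write-protection question can be tested directly by trying to create a scratch file in the folder. The snippet below is a rough sketch, not anything Prime95 itself does; the function name is made up, and you would pass it whatever folder your save files are supposed to go to.

```python
import os
import tempfile


def can_write(folder):
    """Try to create and then delete a scratch file in `folder`.

    Returns True if both steps succeed, False on any OS-level error
    (missing folder, read-only folder, insufficient permissions).
    """
    try:
        fd, path = tempfile.mkstemp(dir=folder, prefix="writetest_")
        os.close(fd)
        os.remove(path)
        return True
    except OSError:
        return False
```

Run it under the same user account that runs Prime95, since permissions can differ between accounts.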
Old 2010-06-13, 17:18   #8
esqrkim
 
Mar 2010
California, USA

2·13 Posts

Well, Windows prompted me to update my computer again. I performed the update and restarted the computer. To make a long story short, Prime95 crashed again and the computer rebooted itself. I ran the torture test again to see what would happen, since it had run for many hours before. This time all four cores had roundoff errors. I tried to restore the computer to a point prior to the Windows update. The restore operation was successful, but apparently some files got messed up. My next step is to reformat the HD, reinstall WinXP, and see if all the hardware is functioning correctly. Then I'll run the torture test again.

Thanks for all your input.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.