mersenneforum.org


outlnder 2003-02-11 02:19

Should we continue to crunch after an error occurs??
 
I noticed that, after 5 days of work on a 33M exponent, I had 2 errors within 3 hours of each other. The system had never reported errors before, on this exponent or any others.

Should I continue with the test or throw it out and start again?

Prime95 2003-02-11 02:33

Tell us what the error message was. ILLEGAL SUMOUT is benign. Some ROUND OFF > 0.40 errors are harmless if you are near an FFT limit and the error is reproducible.

If not one of these cases, I'd toss the 5 days of work.
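
For readers unfamiliar with the ROUND OFF message, here is a rough sketch of the idea behind it. It is not Prime95's actual code. FFT-based multiplication produces floating-point values that should each land very close to an integer; the distance to the nearest integer is the round-off error, and a value of 0.5 means the result could have been rounded either way. The function name and the sample values are hypothetical; the 0.40 threshold is the one shown in the error message.

```python
def max_roundoff_error(values):
    """Return the largest distance from any value to its nearest integer."""
    return max(abs(v - round(v)) for v in values)

# Threshold taken from the "ROUND OFF ... > 0.40" message.
THRESHOLD = 0.40

# Hypothetical sample data, for illustration only.
clean = [3.0001, 7.9998, 12.0003]   # every value is clearly near an integer
suspect = [3.0001, 7.5, 12.0003]    # 7.5 is exactly halfway between 7 and 8

for name, vals in (("clean", clean), ("suspect", suspect)):
    err = max_roundoff_error(vals)
    status = "ERROR" if err > THRESHOLD else "ok"
    print(f"{name}: max round-off {err:.4f} -> {status}")
```

An error of 0.5 is the worst case: there is no way to tell which integer was intended, so the iteration cannot be trusted.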

outlnder 2003-02-11 07:06

I guess it is a bit longer than 5 days. I started the exponent on Jan. 25th.

This is a P4 doing the crunching. Here is the statement from the results file.

[quote][Sun Feb 09 05:36:18 2003]
Iteration: 13894912/33322867, ERROR: ROUND OFF (0.5) >0.40
Continuing from last save file.
[Sun Feb 09 06:20:16 2003]
Iteration: 13905664/33322867, ERROR: ROUND OFF (0.5) >0.40
Continuing from last save file. [/quote]
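
A minimal sketch for pulling the error lines out of a results file and seeing how far along the test was when each one occurred. The regular expression is an assumption based only on the two lines quoted above, and the helper name `parse_errors` is hypothetical; other Prime95 versions or error types may format these lines differently.

```python
import re

# The lines quoted above, copied verbatim.
LOG = """\
[Sun Feb 09 05:36:18 2003]
Iteration: 13894912/33322867, ERROR: ROUND OFF (0.5) >0.40
Continuing from last save file.
[Sun Feb 09 06:20:16 2003]
Iteration: 13905664/33322867, ERROR: ROUND OFF (0.5) >0.40
Continuing from last save file.
"""

# Assumed line format: "Iteration: <current>/<total>, ERROR: <message>"
PATTERN = re.compile(r"Iteration: (\d+)/(\d+), ERROR: (.+)")

def parse_errors(text):
    """Return a list of (iteration, total, message) tuples for each error line."""
    errors = []
    for line in text.splitlines():
        m = PATTERN.search(line)
        if m:
            errors.append((int(m.group(1)), int(m.group(2)), m.group(3).strip()))
    return errors

errors = parse_errors(LOG)
for iteration, total, message in errors:
    print(f"{100 * iteration / total:.1f}% done: {message}")
```

Both errors fall a little past 40% of the way through the test, 10752 iterations apart.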

eepiccolo 2003-02-11 13:06

I recently got an ILLEGAL SUMOUT error after my computer crashed. Does it make sense that I would get an ILLEGAL SUMOUT error right after my computer crashed then re-booted?

Prime95 2003-02-11 13:57

[quote="outlnder"]
Iteration: 13894912/33322867, ERROR: ROUND OFF (0.5) >0.40
Continuing from last save file.
[Sun Feb 09 06:20:16 2003]
Iteration: 13905664/33322867, ERROR: ROUND OFF (0.5) >0.40
Continuing from last save file. [/quote]

A serious error. I'd say the chances are at best 50% that your result is OK. Prime95 does not catch every error. It's your call as to whether you should finish it off - you are close to 50% complete.

By the way, have you found the cause of the error? If not, you may as well finish this exponent until you can figure out what is causing the problem.

Prime95 2003-02-11 13:59

[quote="eepiccolo"]I recently got an ILLEGAL SUMOUT error after my computer crashed. Does it make sense that I would get an ILLEGAL SUMOUT error right after my computer crashed then re-booted?[/quote]

You could have a driver that isn't saving the FPU state properly during the reboot. ILLEGAL SUMOUTs are fairly innocuous, so I wouldn't worry about it.

cheesehead 2003-02-11 18:36

[quote="eepiccolo"]I recently got an ILLEGAL SUMOUT error after my computer crashed. Does it make sense that I would get an ILLEGAL SUMOUT error right after my computer crashed then re-booted?[/quote]
Yeah -- when a graphics driver used to lock up W98 hard, I got those right after reboot. Never caused a bad L-L residue though.

outlnder 2003-02-11 21:40

[quote]By the way, have you found the cause of the error? If not, you may as well finish this exponent until you can figure out what is causing the problem.[/quote]

This is a little difficult to find. This machine has been running well over a month and has never given an error before. And it has been running continuously since the 2 errors and has not given another.

All I can figure is it got bad power for a little while.

If anyone has any theories, I would be glad to entertain them.

apocalypse 2003-02-12 04:19

Other errors on startup
 
I've noticed that the past 2 or 3 times I rebooted my current system (AthlonXP, WindowsXP), I got a couple of SUM(INPUTS) != SUM(OUTPUTS) errors within the first few iterations, and then nothing thereafter. Is this likely to be a benign problem, or should I toss my work and start over?

Thanks for the help.

Hades_au 2003-02-12 10:13

I just experienced a blackout and a machine went down. In rebooting I've stuffed the overclock and started receiving errors, so I restarted the exponent.

It was 93% complete :(

