mersenneforum.org  

2011-04-16, 11:14   #1
pjaj
Oct 2010
3×5 Posts
Prime95 roundoff errors

I just restarted an old primality test after completing another, and Prime95 is throwing up round off errors:

"Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4"

The other 3 cores of my i7 are running other tests without errors.

Should I be concerned?
Should I do anything about it?

2011-04-16, 13:37   #2
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1101100001000₂ Posts

It is not "throwing up round off errors". It is telling you that you have had one during the course of the LL test. This could be quite normal. Look in results.txt for the actual error message. If the roundoff error was barely over 0.4 (like 0.40625 or .4375, not 0.4997) and the error was reproducible then this is a non-issue.
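
To make those numbers concrete: the big multiplications in an LL test are done in floating point, and every result is supposed to land on an integer. The roundoff error is how far the worst value strayed from its nearest integer, so it can never exceed 0.5. Below is a minimal Python sketch of that measure. It is not Prime95's actual code; the function names and sample values are made up for illustration, and only the 0.4 threshold comes from the message quoted above.

```python
def max_roundoff(values):
    """Largest distance between any value and its nearest integer."""
    return max(abs(v - round(v)) for v in values)


def is_suspicious(values, threshold=0.4):
    """True when the worst roundoff exceeds the threshold (0.4 in the message above)."""
    return max_roundoff(values) > threshold


# Hypothetical floating-point results that should all have been integers.
clean = [12.0, 7.03125, 41.96875]
barely_over = [12.0, 7.40625, 41.96875]

print(max_roundoff(clean))        # 0.03125
print(is_suspicious(clean))       # False
print(max_roundoff(barely_over))  # 0.40625, the "barely over 0.4" case
print(is_suspicious(barely_over)) # True
```

Values like 0.40625 (13/32) and 0.4375 (7/16) are exact binary fractions, which is why they print so cleanly here.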

2011-04-16, 16:30   #3
pjaj
Oct 2010
3·5 Posts

Quote:
Originally Posted by Prime95
It is not "throwing up round off errors". It is telling you that you have had one during the course of the LL test. This could be quite normal. Look in results.txt for the actual error message. If the roundoff error was barely over 0.4 (like 0.40625 or .4375, not 0.4997) and the error was reproducible then this is a non-issue.
A figure of speech.
Actually, it's generating the same error report for each successive set of 10,000 iterations. It's not one isolated incident.
results.txt only contains a single error report:
"Iteration: 25227368/48995293, ERROR: ROUND OFF (0.5) > 0.40"
but the screen for that worker reports a new one every 7-8 minutes or so.


[Apr 16 17:16] Waiting 10 seconds to stagger worker starts.
[Apr 16 17:16] Worker starting
[Apr 16 17:16] Setting affinity to run worker on logical CPUs 4,5
[Apr 16 17:16] Resuming primality test of M48995293 using Core2 type-3 FFT length 2560K, Pass1=640, Pass2=4K
[Apr 16 17:16] Iteration: 25336590 / 48995293 [51.71%].
[Apr 16 17:16] Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4.
[Apr 16 17:16] Confidence in final result is fair.
[Apr 16 17:18] Iteration: 25340000 / 48995293 [51.71%]. Per iteration time: 0.039 sec.
[Apr 16 17:18] Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4.
[Apr 16 17:18] Confidence in final result is fair.
[Apr 16 17:25] Iteration: 25350000 / 48995293 [51.73%]. Per iteration time: 0.040 sec.
[Apr 16 17:25] Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4.
[Apr 16 17:25] Confidence in final result is fair.

2011-04-16, 17:46   #4
S34960zz
Feb 2011
2²·13 Posts

See also: http://www.mersenneforum.org/showpos...3&postcount=71
Quote:
Originally Posted by Prime95
This is normal -- a result of a new "feature" in v26. In v25, one would get a SUM(INPUTS) error or ROUNDOFF > 0.4 error and it would scroll off the screen unnoticed. You had to go to the effort of looking in results.txt to see that you had a problem.

In v26, every time prime95 does its regular screen update it prints out a summary of the total number of errors that occurred during the test. See undoc.txt for options on controlling this new feature.

So, one of your workers had a SUM(INPUTS) error sometime during the test. Since it only happened once, there is a fair chance that your LL result will be OK.
See also: undoc.txt
Code:
You can control how the "count of errors during this test" message
is output with every screen update.  These messages only appear if
possible hardware errors occur during a test.  In prime.txt set:
    ErrorCountMessages=0, 1, 2, or 3
Value 0 means no messages, value 1 means a very short message, value 2
means a longer message on a separate line, value 3 means a very long message
possibly on multiple lines.  Default value is 3.
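
As a sketch of what changing that setting could look like, the Python below sets the option in the text of a prime.txt-style file. It assumes the file is plain Key=Value lines, which is an assumption based on the excerpt above rather than a documented format. `set_option` and the sample contents are hypothetical; only the ErrorCountMessages name and its values come from undoc.txt.

```python
def set_option(text, key, value):
    """Return text with key=value set, replacing an existing line or appending one."""
    new_line = f"{key}={value}"
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Compare only the part before the first "=" so values are ignored.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        # No existing line for this key, so add one at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical file contents; SomeOtherOption is a placeholder, not a real setting.
sample = "SomeOtherOption=1\nErrorCountMessages=3\n"
print(set_option(sample, "ErrorCountMessages", 1), end="")
```

With value 1 the excerpt says the message becomes very short; with 0 it disappears from the screen, though the entry in results.txt is a separate matter.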

Last fiddled with by S34960zz on 2011-04-16 at 17:48

2011-04-16, 17:54   #5
moebius
Jul 2009
Germany
3²×29 Posts

And what about this mysterious phenomenon? During a first-time LL test of M42818549 (with Mprime26.5-linux64) running on one worker (core), twelve similar ROUND OFF errors occurred. The other three LL tests, running on the other three cores, show no errors.

[Tue Mar 8 07:31:28 2011]
Iteration: 18872759/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 07:55:14 2011]
Iteration: 18903381/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 08:09:58 2011]
Iteration: 18916265/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 08:58:00 2011]
Iteration: 18989396/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
Iteration: 18977799/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 10:29:45 2011]
Iteration: 19103776/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 11:19:09 2011]
Iteration: 19171522/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 21:32:58 2011]
Iteration: 19959306/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 21:39:15 2011]
Iteration: 19956922/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
Iteration: 19953354/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 21:50:22 2011]
Iteration: 19963829/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 23:32:26 2011]
Iteration: 20067548/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.

...which led to a suspect LL result.
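
A quick way to summarize a results.txt excerpt like the one above is to pull out the iteration numbers, count them, and see where they go backwards. This Python sketch assumes the error lines keep exactly the format shown in this post; the function name and the regular expression are hypothetical helpers, not part of mprime.

```python
import re

# Matches lines such as:
#   Iteration: 18872759/42818549, ERROR: ROUND OFF (0.5) > 0.40
ERROR_LINE = re.compile(
    r"Iteration: (\d+)/(\d+), ERROR: ROUND OFF \(([\d.]+)\) > ([\d.]+)"
)


def summarize(results_text):
    """Count ROUND OFF errors and note where the iteration number went backwards."""
    iterations = [int(m.group(1)) for m in ERROR_LINE.finditer(results_text)]
    rollbacks = sum(1 for a, b in zip(iterations, iterations[1:]) if b < a)
    return {"errors": len(iterations), "rollbacks": rollbacks}


excerpt = """\
[Tue Mar 8 08:58:00 2011]
Iteration: 18989396/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
Iteration: 18977799/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
[Tue Mar 8 10:29:45 2011]
Iteration: 19103776/42818549, ERROR: ROUND OFF (0.5) > 0.40
Continuing from last save file.
"""

print(summarize(excerpt))  # {'errors': 3, 'rollbacks': 1}
```

An error at a lower iteration than the previous one is consistent with the "Continuing from last save file" lines: the worker restarted from an earlier point and hit trouble again before getting back to where it had been.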

2011-04-16, 21:01   #6
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2³·5·173 Posts

That core looks pretty flaky to me. Any better luck on the next exponent?

2011-04-16, 23:05   #7
pjaj
Oct 2010
3·5 Posts

So, just to clarify: what has happened is that there was only one error (so far), but the Prime95 worker keeps a cumulative count and reports it every time it logs a new batch of 10,000 iterations on the screen. If there are subsequent errors, they will show up as separate entries in results.txt, as in moebius's post, and the worker would report

"Possible hardware errors have occurred during the test! N ROUNDOFF > 0.4"

if N errors occurred?
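
One way to check this reading against the worker output posted earlier in the thread: if the count is cumulative, the number in front of ROUNDOFF should stay at 1 across screen updates until a new error occurs. A small Python sketch, assuming the message keeps the exact wording shown above (the helper name is hypothetical):

```python
import re

# The count N is the number printed just before "ROUNDOFF > 0.4".
COUNT_LINE = re.compile(
    r"Possible hardware errors have occurred during the test! (\d+) ROUNDOFF > 0\.4"
)


def reported_counts(screen_text):
    """Return the ROUNDOFF count printed at each screen update, in order."""
    return [int(m.group(1)) for m in COUNT_LINE.finditer(screen_text)]


screen = """\
[Apr 16 17:16] Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4.
[Apr 16 17:18] Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4.
[Apr 16 17:25] Possible hardware errors have occurred during the test! 1 ROUNDOFF > 0.4.
"""

counts = reported_counts(screen)
print(counts)            # [1, 1, 1]
print(len(set(counts)))  # 1, i.e. the same single error repeated at every update
```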

2011-04-17, 02:02   #8
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
6920₁₀ Posts

You understand correctly.

2011-04-17, 16:56   #9
Christenson
Dec 2010
Monticello
5·359 Posts

Is it worth manually assigning either of these exponents to one of my machines, knocking out some ECM progress?

2011-04-22, 01:12   #10
Rhyled
May 2010
3²×7 Posts
More stressful than Prime95

You might want to run the latest IntelBurnTest. It's even tougher on the processor than Prime95 and identifies calculation errors in an hour or so, especially with the memory setting cranked up close to maximum. I.e., if it passes this stress test, you won't have a hardware issue with running LL tests.

http://www.softpedia.com/get/System/...BurnTest.shtml

2011-07-19, 11:52   #11
SeeD419
Jul 2011
Omaha, NE
23 Posts

Dammit, I've been running this test for so long, and it figures: this POS laptop locks up, and now Prime95 spews the error message on every line just to remind me about it.

It first occurred at 70%.

My question is: if there was an error in the calculation, why doesn't Prime95 have some sort of 'save point' and recalculate from the last known good numbers? What am I supposed to do about it? I figured Prime95 would have some sort of 'checkpoint' it could revert to.

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.