mersenneforum.org  

Old 2011-02-26, 00:44   #1
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2302₁₀ Posts
error rate and mitigation

As we know, each LL test has a small chance of returning an incorrect residue. From what I've seen, the error rate is in the neighborhood of 0.01 to 0.05. Suppose that the average rate is around 0.01 for a 35M exponent (per this thread).

Because the result of each iteration depends on that of the previous one, a single error will make the entire test incorrect. Thus, every single iteration needs to be correct in order for the final residue to be correct. Assuming a 0.99 chance of a good LL test over roughly n = 35,000,000 iterations, each iteration would have to be correct with probability p = 0.99^(1/n) ≈ 0.99999999971.

It would be reasonable to assume that each iteration has an equal chance of generating the correct residue. In terms of statistics, the 0.01 probability of a bad LL test is then equal to

\sum_{i=1}^{n} {n\choose i}q^i(1-q)^{n-i} = 1 - (1-q)^n = 1 - p^n

where n is the number of iterations and q = 1 - p is the per-iteration error probability; the sum over all possible counts of bad iterations simplifies to one minus the probability that every iteration is good.

All else being equal, the error rate will increase with the exponent: at the same per-iteration error rate, a 350M exponent would have only about a 0.9 chance of completing correctly. In actuality, it would probably be even lower, because each iteration at the larger FFT size has a higher chance of going bad.
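
To sanity-check those numbers, here is a quick Python sketch (approximating the iteration counts as 35,000,000 and 350,000,000, since an LL test of M_p takes p - 2 squarings):

Code:
n_35m = 35_000_000      # ~iterations for a 35M exponent
p_good_test = 0.99      # assumed probability of a good LL test

# per-iteration success probability p such that p**n_35m = 0.99
p_iter = p_good_test ** (1 / n_35m)
print(f"per-iteration success: {p_iter:.11f}")           # ~0.99999999971

# ten times the iterations at the same per-iteration error rate
n_350m = 350_000_000
print(f"350M-test success:     {p_iter ** n_350m:.3f}")  # ~0.904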

That having been said, what kind of factors can lead to bad LL tests? From what I gather, the most common causes are:
  • Incorrectly configured hardware
  • Bad memory
  • Overheating
  • Overclocking

However, does anyone know if the following can affect the error rate?
  • Frequent restarts
  • OS freezing
  • Frequent slowdowns caused by other applications
  • Using old hardware

As for ways to mitigate such errors, how effective are Prime95's built-in error-checking features? In other words, can I expect a much lower error rate if I enable roundoff and/or SUM(INPUTS) error-checking? Do both of these features add security, or are they redundant? That is, would it be a good idea to enable both?
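
For context, my understanding of the roundoff check is that it verifies the floating-point FFT outputs land close to integers, and flags an error when the deviation grows too large. Here is a toy illustration of the idea in Python (not Prime95's actual code; the base, threshold, and function name are all made up):

Code:
import numpy as np

def square_with_roundoff_check(digits, base=10_000, threshold=0.4):
    # Square a little-endian base-`base` digit array via floating-point
    # FFT convolution, recording how far the raw results drift from
    # integers: the quantity a roundoff check monitors.
    n = len(digits)
    f = np.fft.rfft(np.asarray(digits, dtype=float), 2 * n)
    conv = np.fft.irfft(f * f, 2 * n)     # coefficients of the square
    err = float(np.max(np.abs(conv - np.round(conv))))
    if err > threshold:
        raise ArithmeticError(f"roundoff {err:.3f} exceeds {threshold}")
    result, carry = [], 0                 # propagate carries
    for c in np.round(conv).astype(np.int64):
        carry, digit = divmod(int(c) + carry, base)
        result.append(digit)
    while carry:
        carry, digit = divmod(carry, base)
        result.append(digit)
    while len(result) > 1 and result[-1] == 0:
        result.pop()                      # drop leading zeros
    return result, err

The bigger the FFT, the closer those raw outputs creep toward the 0.5 failure point, which is presumably part of why larger exponents are more error-prone to begin with.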

I've also heard that ECC memory can help reduce the risk of errors. Does anyone have any experience with it?
Old 2011-02-26, 05:53   #2
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

100011111110₂ Posts

I forgot to mention the following factors:
  • Algorithm choice - for example, the Schönhage–Strassen algorithm and Fürer's algorithm have approximately the same time complexity. Would using one or the other increase the chance of getting an incorrect result?
  • Frequent hibernation (for notebooks)
Old 2011-02-28, 18:54   #3
lycorn
 
Sep 2002
Oeiras, Portugal

1,399 Posts

Quote:
Originally Posted by ixfd64
In other words, can I expect a much lower error rate if I enable roundoff and/or SUM(INPUTS) error-checking? Do both of these features add security, or are they redundant? That is, would it be a good idea to enable both?
You may wish to read the following thread:

http://www.mersenneforum.org/showthr...t=13476&page=1
Old 2011-04-11, 22:06   #4
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2·1,151 Posts

I still have a few unanswered questions, though. For example, would frequent reboots have an impact on the error rate? Would a different algorithm be more or less likely to generate an incorrect result?
Old 2011-04-12, 02:14   #5
Christenson
 
Dec 2010
Monticello

5·359 Posts

Quote:
Originally Posted by ixfd64
I still have a few unanswered questions, though. For example, would frequent reboots have an impact on the error rate? Would a different algorithm be more or less likely to generate an incorrect result?
Frequent reboots would have an impact on the error rate if they were themselves caused by hardware errors, or if the OS were buggy enough to corrupt program data. Otherwise, these programs go back to the last saved checkpoint (which they may have written as part of the shutdown process) and restart from there.
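
The pattern is roughly this (a bare-bones sketch in Python; real save files also carry checksums and other state, and the file name here is invented):

Code:
import os, pickle

CHECKPOINT = "ll_checkpoint.pkl"    # hypothetical file name

def ll_test(p, checkpoint_every=10_000):
    # Lucas-Lehmer test of M_p = 2**p - 1, resuming from the last
    # checkpoint if one exists.
    m = (1 << p) - 1
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as fh:
            i, s = pickle.load(fh)      # pick up where we left off
    else:
        i, s = 0, 4                     # standard LL starting value
    while i < p - 2:
        s = (s * s - 2) % m
        i += 1
        if i % checkpoint_every == 0:
            with open(CHECKPOINT, "wb") as fh:
                pickle.dump((i, s), fh) # a crash now loses at most
                                        # checkpoint_every iterations
    if os.path.exists(CHECKPOINT):
        os.remove(CHECKPOINT)           # finished; clear the checkpoint
    return s == 0                       # M_p is prime iff the residue is 0

A reboot or freeze just means redoing the iterations since the last dump; by itself it corrupts nothing.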

The algorithm with the fastest run time or smallest memory footprint, all else being equal, has the least likelihood of an incorrect result, since it minimises the opportunity for error. Beyond that, a failure of TF, ECM, or P-1 to find a factor when one should be found still leaves the LL and LL-D checks, and so has a negligible effect on the likelihood of failing to identify a Mersenne prime. When these processes do find factors, those are checked quite quickly by the server when reported, I believe.
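
Checking a reported factor is indeed nearly instant: f divides M_p = 2^p - 1 exactly when 2^p ≡ 1 (mod f), a single modular exponentiation. In Python:

Code:
def is_factor(p, f):
    # f divides M_p = 2**p - 1  iff  2**p == 1 (mod f)
    return pow(2, p, f) == 1

print(is_factor(67, 193_707_721))   # True: Cole's 1903 factor of M67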
