Thread: Error rate for LL tests
2003-09-15, 17:12 #1 GP2

We estimate the error rate as follows:

- Every line in BAD is a separate verified-bad result.
- Every line in LUCAS_V.TXT is a separate verified-good result.
- Lines in HRF3.TXT are handled as described below.

The file HRF3.TXT contains unverified results (exponents with only one LL test, or with more than one test but non-matching double-checks). How do we estimate the error rate for these results?

Any exponent that occurs only once must be ignored: we have no idea whether it is a good or a bad result.

However, when an exponent occurs N times (in N separate lines of HRF3.TXT), we know for sure that N distinct non-matching residues were returned. Otherwise there would have been a match, and the results would have been removed from HRF3.TXT and moved to the files BAD and LUCAS_V.TXT. Therefore at least N-1 of them must be bad, and the remaining one could be good or bad.

The odds are that the remaining result is good (we just don't yet know which of the N it is). After all, the error rate is relatively low, so the odds of N-1 bad + 1 good are much larger than the odds of all N bad. In the most common case, 2 separate lines in HRF3.TXT for the same exponent, usually one will be good and one will be bad, and a triple-check will sort out which is which.

So, to summarize:

- If an exponent occurs in only one line in HRF3.TXT, ignore it.
- If an exponent occurs in N separate lines in HRF3.TXT, assume one good result and N-1 bad results.
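The summary rule above can be sketched in a few lines of Python. This is only an illustration: the actual layout of HRF3.TXT is not described in this post, so the function takes an already-extracted list of exponents (one entry per HRF3.TXT line). The name `classify_unverified` and the sample exponents are made up for the example.

```python
from collections import Counter

def classify_unverified(exponents):
    """Apply the HRF3.TXT rule to a list of exponents, one entry per line.

    Returns (lines_counted, assumed_good, assumed_bad).
    Exponents that occur only once are ignored entirely.
    """
    counts = Counter(exponents)
    lines_counted = assumed_good = assumed_bad = 0
    for n in counts.values():
        if n < 2:
            continue  # single unverified result: could be good or bad, skip it
        lines_counted += n
        assumed_good += 1     # assume exactly one of the N residues is right
        assumed_bad += n - 1  # the other N-1 residues differ from it, so they are bad
    return lines_counted, assumed_good, assumed_bad

# Hypothetical exponents, not real data: one triple, one pair, one single.
sample = [7000003, 7000003, 7000003, 7000121, 7000121, 7000213]
print(classify_unverified(sample))  # (5, 2, 3)
```

The single occurrence contributes nothing to any of the three counts, which matches the "ignore it" rule.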
Error rates for the various exponent ranges are: [Sept 9 2003 data]

Code:
          0 -    999,999   (163+0-0)/(163+0+70581)           = .002
  1,000,000 -  1,999,999   (718+0-0)/(718+0+58971)           = .012
  2,000,000 -  2,999,999   (1203+0-0)/(1203+0+54591)         = .021
  3,000,000 -  3,999,999   (1465+0-0)/(1465+0+52939)         = .026
  4,000,000 -  4,999,999   (1837+0-0)/(1837+0+51026)         = .034
  5,000,000 -  5,999,999   (1905+0-0)/(1905+0+49346)         = .037
  6,000,000 -  6,999,999   (1804+0-0)/(1804+0+49253)         = .035
  7,000,000 -  7,999,999   (1956+27-12)/(1956+27+47579)      = .039
  8,000,000 -  8,999,999   (1612+500-235)/(1612+500+45865)   = .039
  9,000,000 -  9,999,999   (625+1312-639)/(625+1312+33724)   = .036
 10,000,000 - 10,999,999   (53+1369-672)/(53+1369+2978)      = .170
 11,000,000 - 11,999,999   (50+1384-679)/(50+1384+1993)      = .220
 12,000,000 - 12,999,999   (31+1819-895)/(31+1819+1415)      = .292
 13,000,000 - 13,999,999   (33+1611-798)/(33+1611+1392)      = .278
 14,000,000 - 14,999,999   (4+1541-764)/(4+1541+1172)        = .287
 15,000,000 - 15,999,999   (2+1091-541)/(2+1091+796)         = .292
 16,000,000 - 16,999,999   (0+757-375)/(0+757+598)           = .281
 17,000,000 - 17,999,999   (0+134-67)/(0+134+233)            = .182
 18,000,000 - 18,999,999   (0+86-43)/(0+86+174)              = .165
 19,000,000 - 19,999,999   (2+32-16)/(2+32+40)               = .243
 20,000,000 - 20,999,999   (1+2-1)/(1+2+14)                  = .117

We note:

- The results for the low exponents have very low error rates. Maybe this is because the run time is very short, or maybe for such old results the bad results were purged or not recorded.
- The error rates for the higher exponents are artificially high. This is because when the server gets a result returned with a nonzero error code, it automatically reassigns that exponent for another first-time LL test without waiting a couple of years for regular double-checking to catch up to the current first-time range. Thus, a significant fraction of bad results are caught much sooner, but good results are not verified until perhaps years later.
Note: it is possible for a nonzero error code to still yield a good result, and it is possible for a zero error code to yield a bad result. See the Most popular error codes thread.

Since the current leading edge of double-checking is around 10.1M, all error rates above this are artificially high for the time being.

We also note:

So far, there is no evidence that error rates are increasing for larger exponents. The error rate remains steady around 3.5% - 4.0% over a broad range of exponents. Larger exponents have longer run times and thus we might expect more errors, but on the other hand newer machines run Windows XP and other modern operating systems with much better memory protection. So perhaps these effects cancel each other.

Note that this error rate of 3.5% - 4.0% is an average over all users and computers. Some computers have a 0% error rate, others have a high double-digit error rate. This depends on hardware issues, memory quality, CPU temperature, etc.

Finally, we might ask: what do we get if we only consider results returned by programs Wxx (George Woltman's Prime95/mprime) and ignore results returned by other programs? The answer is: almost exactly the same.
Error rates for the various exponent ranges, taking into account only results returned by programs Wxx (George Woltman), are: [Sept 9 2003 data]

Code:
          0 -    999,999   (85+0-0)/(85+0+30303)             = .002
  1,000,000 -  1,999,999   (552+0-0)/(552+0+45733)           = .011
  2,000,000 -  2,999,999   (1171+0-0)/(1171+0+52421)         = .021
  3,000,000 -  3,999,999   (1445+0-0)/(1445+0+52267)         = .026
  4,000,000 -  4,999,999   (1779+0-0)/(1779+0+49024)         = .035
  5,000,000 -  5,999,999   (1891+0-0)/(1891+0+48521)         = .037
  6,000,000 -  6,999,999   (1793+0-0)/(1793+0+47982)         = .036
  7,000,000 -  7,999,999   (1945+27-12)/(1945+27+46770)      = .040
  8,000,000 -  8,999,999   (1602+500-235)/(1602+500+45616)   = .039
  9,000,000 -  9,999,999   (622+1312-639)/(622+1312+33397)   = .036
 10,000,000 - 10,999,999   (53+1369-672)/(53+1369+2964)      = .170
 11,000,000 - 11,999,999   (49+1384-679)/(49+1384+1984)      = .220
 12,000,000 - 12,999,999   (30+1819-895)/(30+1819+1405)      = .293
 13,000,000 - 13,999,999   (33+1611-798)/(33+1611+1330)      = .284
 14,000,000 - 14,999,999   (4+1541-764)/(4+1541+1147)        = .290
 15,000,000 - 15,999,999   (1+1091-541)/(1+1091+781)         = .294
 16,000,000 - 16,999,999   (0+757-375)/(0+757+581)           = .285
 17,000,000 - 17,999,999   (0+134-67)/(0+134+225)            = .186
 18,000,000 - 18,999,999   (0+86-43)/(0+86+168)              = .169
 19,000,000 - 19,999,999   (2+32-16)/(2+32+40)               = .243
 20,000,000 - 20,999,999   (1+2-1)/(1+2+12)                  = .133
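Each row in the tables above appears to follow a single formula, which a short Python sketch can check. The reading of the terms is inferred from the estimation rules at the top of the post, not stated next to the tables: the first term is taken as the number of lines in BAD, the second as the number of HRF3.TXT lines for exponents that occur more than once, the subtracted term as the number of such exponents (one assumed-good result each), and the last term of the denominator as the number of lines in LUCAS_V.TXT. The printed rates also seem to be truncated, not rounded, to three decimals. The function name `error_rate` is made up for the example.

```python
def error_rate(bad, hrf_lines, hrf_exponents, verified_good):
    """Estimated error rate for one exponent range, truncated to 3 decimals.

    Integer arithmetic is used so that the truncation is exact.
    """
    est_bad = bad + hrf_lines - hrf_exponents  # verified bad + assumed bad
    total = bad + hrf_lines + verified_good    # every result that is counted
    return (est_bad * 1000 // total) / 1000

# 7,000,000 - 7,999,999 row, all programs: (1956+27-12)/(1956+27+47579)
print(error_rate(1956, 27, 12, 47579))  # 0.039
# Same range, Wxx results only: (1945+27-12)/(1945+27+46770)
print(error_rate(1945, 27, 12, 46770))  # 0.04
```

Under this reading, the two 7M rows differ only in the BAD and LUCAS_V.TXT counts, which is consistent with the remark that restricting to Wxx results gives almost exactly the same rates.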