mersenneforum.org  

2003-11-08, 06:33   #1
GP2

Sep 2003

A15₁₆ Posts
Release more exponents for first-time testing?

There are several categories of exponents that could qualify for a second-wave release of first-time exponents:


1) Repeat the release of exponents according to the earlier criteria for error-prone machines: bad/(bad+good) >= 0.333 and bad >= 2. The previous release imposed a maximum of 4 exponents per machine, so some exponents still remain (some machines have a large number), and presumably there are a few others from newly detected error-prone machines.

Perhaps the limit of 4 exponents per machine should be raised, or perhaps the threshold of 0.333 could be lowered to 0.250.


2) Repeat the release of exponents according to "harmful" errorcodes (errorcode not 00000000, AB00AB00, or 00XX0000), which statistically indicates a 55-60% error rate.

Within this category, there are about 300 exponents that show up under the Double Checking Avail column of http://www.mersenne.org/primenet/ (starting at 12.5M) which have expired in the last month alone. These were already re-released as first-time checks (many in June and July), got automatically converted to double-checks in September by the server sync, and have since expired. A few more such exponents expire every day.

3) Make use of uv3 information, where uv3 represents unverified exponents that need a triple-or-higher check.

In general, a machine that has returned a lot of results needing triple checks is probably error-prone. Most machines have a 0% error rate, so if a machine has returned N results that require a triple check, it's much more likely that this one machine made N errors than that N different, independent machines each made one error.

However, a few machines such as GW/P4-1400 seem to specialize in doing double-checks of exponents for which the first result contained a "harmful" error code. Thus, they have accumulated a lot of triple-check-required exponents, but only because they have deliberately sought probably-erroneous results to double check.

We can take advantage of the fact that HRF3.TXT now contains error-code information, and look for results that require a triple check AND have a harmful error code. Call these uv3h.

So we can use an additional set of criteria to identify error-prone machines:

(bad + uv3h) / (bad + uv3 + good) >= 0.333
and
(bad + uv3h) >= 2

This prevents machines like GW/P4-1400 from being misidentified as error-prone, because although they have a lot of results that require a triple-check, few or none of them were returned with a harmful error code.
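A minimal Python sketch of the two tests, for anyone who wants to apply them to their own counts. The function and parameter names are made up for illustration; only the thresholds (0.333 and 2) come from the criteria above.

```python
def is_error_prone_case1(bad: int, good: int,
                         min_ratio: float = 0.333, min_bad: int = 2) -> bool:
    """Case 1 test: bad/(bad+good) >= 0.333 and bad >= 2."""
    verified = bad + good
    if verified == 0:
        return False  # no verified results, so the ratio is undefined
    return bad >= min_bad and bad / verified >= min_ratio


def is_error_prone_case3(bad: int, good: int, uv3: int, uv3h: int,
                         min_ratio: float = 0.333, min_count: int = 2) -> bool:
    """Case 3 test: (bad+uv3h)/(bad+uv3+good) >= 0.333 and (bad+uv3h) >= 2."""
    suspect = bad + uv3h        # verified-bad plus triple-check results with harmful codes
    total = bad + uv3 + good    # every result that enters the ratio
    if total == 0:
        return False
    return suspect >= min_count and suspect / total >= min_ratio
```

With invented counts for a machine that deliberately double-checks suspect results (many uv3, no uv3h), `is_error_prone_case3(bad=0, good=50, uv3=40, uv3h=0)` returns False, which is the behaviour wanted for machines like GW/P4-1400.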

2003-11-08, 09:18   #2
outlnder

Aug 2002

2·3·53 Posts

Is there any way to have a Double Check done on every machine that is still producing??

This would give an immediate verification of good machines versus bad.

Any machine failing this Double Check would have another done quickly to confirm it error-prone or not.

I would think that this would help find error-prone machines and confirm to crunchers that their machines are good.

2003-11-08, 19:51   #3
GP2

Sep 2003

29·89 Posts

Quote:
Originally posted by outlnder
Is there any way to have a Double Check done on every machine that is still producing??

This would give an immediate verification of good machines versus bad.

outlnder,
I started a new thread to discuss this.

2003-11-08, 23:14   #4
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

1BDD₁₆ Posts

GP2 - a.k.a. the data miner,

Send me or post the list of exponents that have the bad error codes and I'll set them to be released (your case 2).

Also, if you think you have enough data to back up your theories regarding identifying computers with at least a 33% error rate, then remove the 4 exponent restriction and send me a list of exponents to test.

As to case 3, let's release the case 1 and case 2 exponents before casting a wider net for bad results.

2003-11-09, 19:53   #5
GP2

Sep 2003

2581₁₀ Posts

Quote:
Originally posted by Prime95
Also, if you think you have enough data to back up your theories regarding identifying computers with at least a 33% error rate, then remove the 4 exponent restriction and send me a list of exponents to test.

It's a fairly solid theory, since we're simply measuring the verified-bad and verified-good results actually returned by the machine. All we're assuming is that the past error rate for any given machine is a good indication of its future error rate.

The uv3 stuff (case 3) is more speculative at this point; I don't yet have the data to back it up, although it seems very plausible.


OK, here goes.

First of all, recall the definitions:

bad: count of verified-bad results returned by the machine.
good: count of verified-good results returned by the machine.
uv2: count of unverified results returned by the machine that need a 2nd check (no double check has been done).
uv3: count of unverified results returned by the machine that need a 3rd-or-higher check (one or more unmatching double checks have been done).


Based on the just-released Nov 9 2003 data files, the attached file shows the error-prone machines that fit the criteria
bad/(bad+good) >= 0.333
bad >= 2
and the file shows only those machines with
uv2 > 0

For brevity, we show only the 230 error-prone machines that have uv2 > 0, not the full set of 1145 error-prone machines. Those remaining machines might contribute exponents only in the rarer case where triple checks are required because both original results are presumed not good.
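For anyone who wants to reproduce this kind of selection from their own per-machine counts, here is a minimal Python sketch. The record layout and the sample machines are hypothetical (they are not taken from the attached file); only the three conditions come from the criteria above.

```python
from typing import NamedTuple


class MachineStats(NamedTuple):
    machine: str   # "user,computer" label (hypothetical format)
    bad: int       # verified-bad results
    good: int      # verified-good results
    uv2: int       # unverified results still needing a 2nd check


def select_for_release(stats):
    """Keep machines with bad >= 2, bad/(bad+good) >= 0.333 and uv2 > 0."""
    selected = []
    for m in stats:
        # bad >= 2 is tested first, so bad + good is never zero in the division
        if m.bad >= 2 and m.bad / (m.bad + m.good) >= 0.333 and m.uv2 > 0:
            selected.append(m)
    return selected


# Invented sample data, for illustration only
sample = [
    MachineStats("userA,pc1", bad=3, good=2, uv2=5),   # ratio 0.6, selected
    MachineStats("userB,pc2", bad=1, good=0, uv2=7),   # only one bad result
    MachineStats("userC,pc3", bad=4, good=1, uv2=0),   # error-prone, but nothing to release
    MachineStats("userD,pc4", bad=2, good=10, uv2=3),  # ratio below threshold
]
chosen = select_for_release(sample)
exponents_contributed = sum(m.uv2 for m in chosen)
```

Summing the uv2 column of the selected machines, as in the last line, gives the number of exponents they would contribute before any range cutoff is applied.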

NOTE: the user names in this list of machines are "mapped" names. They don't necessarily correspond to the user name that was used when the result was returned to PrimeNet.


Note that uv2 is the number of exponents each machine contributes to the release (excluding the rarer case of triple checks, where both original results are presumed not good). Most machines contribute only a few exponents, but a few contribute dozens.
Attached Files
File Type: zip error_prone_machines-20031109.zip (3.5 KB, 180 views)

2003-11-09, 20:05   #6
GP2

Sep 2003

29·89 Posts

There are 1220 exponents that satisfy the criteria of case 1 (note we have lifted the restriction of only 4 exponents per machine) for exponents that need a 2nd check, limited to the range 10.5M - 19M.

Of these, 381 are already assigned (many of these in the last release) and 46 are currently cleared, leaving 793 exponents.

Note: if you sum up the uv2 column of the previous attachment, it adds up to 1436. The discrepancy is because there are 87 exponents that would have qualified but are below the 10.5M cutoff, and 129 exponents that would have qualified but are above the 19M cutoff.

For tracking purposes here is the complete list of 1220.

Note: this is not the set of exponents to release, since it includes already-assigned exponents. That will follow in the next few posts.
Attached Files
File Type: zip 1220.zip (4.6 KB, 159 views)

Last fiddled with by GP2 on 2003-11-09 at 20:49

2003-11-09, 20:20   #7
GP2

Sep 2003

29×89 Posts

There are 3240 exponents that meet the criteria of case 2 (harmful errorcodes) for exponents that need a 2nd check, limited to the range 10.5M - 19M.

Of these, 2753 are already assigned and 177 are cleared, leaving 310.

Those 310 are attached here (note there may be overlap with the previous file attachment).
Attached Files
File Type: zip 310.zip (1.3 KB, 171 views)

Last fiddled with by GP2 on 2003-11-09 at 20:38

2003-11-09, 20:28   #8
GP2

Sep 2003

29×89 Posts

There are 187 exponents that need a triple-or-higher check because all previous results are presumed not good by the criteria of case 1 and case 2.

Of these, 97 are already assigned and 41 are cleared, leaving 49.

For tracking purposes, the complete set of 187 is attached here.
Attached Files
File Type: zip 187.zip (881 Bytes, 155 views)

Last fiddled with by GP2 on 2003-11-09 at 20:39

2003-11-09, 20:30   #9
GP2

Sep 2003

101000010101₂ Posts

Finally, when all is said and done and the above are combined, and everything is filtered against status.txt and cleared.txt as of Nov 9 2003 19:00 UTC, we are left with 1144 exponents.

This consists of 793 double-check exponents from case 1 (error-prone machines), 310 double-check exponents from case 2 (harmful error codes), and 49 triple checks. There's an overlap of 8 exponents between the first and second categories.
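The combination step amounts to a set union followed by filtering. A minimal Python sketch, assuming the exponents have already been read into collections of integers (the function name is illustrative, and no particular file format for status.txt or cleared.txt is assumed):

```python
def build_release(case1, case2, triple, assigned, cleared):
    """Union the three candidate sets, then drop anything already assigned or cleared."""
    candidates = set(case1) | set(case2) | set(triple)
    return candidates - set(assigned) - set(cleared)


# Count check with the figures quoted above: the 8 exponents shared by
# case 1 and case 2 must be counted only once.
case1_left, case2_left, triple_left, overlap = 793, 310, 49, 8
total = case1_left + case2_left + triple_left - overlap
print(total)  # 1144
```

Using sets means an exponent that appears under both case 1 and case 2 is released only once, which is where the overlap of 8 comes from.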

This is the set of exponents I'm proposing for release:
Attached Files
File Type: zip 1144.zip (4.3 KB, 173 views)

Last fiddled with by GP2 on 2003-11-09 at 20:35

2004-01-02, 03:41   #10
GP2

Sep 2003

29·89 Posts

Hmm, I realized I haven't posted any thread tracking this November release of exponents, although there is a thread called "Tracking the October 2003 release of error-prone-machine exponents".

So this message will be a placeholder until I do that and link to the new thread.

2004-01-02, 04:17   #11
GP2

Sep 2003

29×89 Posts

I'd like to revisit the issue of "case 3", making use of uv3 and uv3h information to identify error-prone machines.

Starting with the October 9 2003 release of "weekly" data files, HRF3.TXT started showing error codes. Enough time has passed since then so that some of those entries in HRF3.TXT have now made it to LUCAS_V.TXT and BAD (they've been verified as either good or bad results). So we can do some analysis.

Recall that uv3 (uv stands for "unverified") counts the number of results in HRF3.TXT for which:
1) a triple-check (or higher) is required, because only non-matching results (at least two) have been returned for that exponent

and uv3h counts the number of results in HRF3.TXT which satisfy both 1) and:
2) the result carries a "harmful" error code (other than 00000000, AB00AB00, 00XX0000)

Note that in general, a result with a "harmful" error code has about a 55%-60% chance of turning out to be bad. Now we might ask, what about a uv3h result -- a result with both a "harmful" error code and a triple-check required?
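The "harmful" test itself is easy to mechanize. A minimal Python sketch, assuming error codes are 8-character hex strings and reading XX as "any two hex digits" (both are my assumptions, as is the function name):

```python
import re

# The three benign patterns named in this thread; everything else counts as harmful.
# XX is interpreted here as any two hex digits, which is an assumption.
_BENIGN = re.compile(r"^(00000000|AB00AB00|00[0-9A-F]{2}0000)$")


def is_harmful(errorcode: str) -> bool:
    """Return True if the error code is not one of the benign patterns."""
    return _BENIGN.match(errorcode.strip().upper()) is None
```

For example, `is_harmful("00120000")` is False because it fits the 00XX0000 pattern, while `is_harmful("01000000")` is True.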


Comparing October 9 2003 and December 31 2003 data files, we see that a uv3h result has a 92.5% chance of being bad.

Namely, out of 400 uv3h results in the October 9 2003 version of HRF3.TXT that have been verified by December 31 2003, 370 turned out to be bad, 29 turned out to be good, and 1 was an apparently discarded duplicate. For more details, see the attached file.

This means we can confidently use uv3h information to detect error-prone machines.

Consider the following entries in usercomp_stats.txt (this can be downloaded from http://opteron.mersenneforum.org/dat...comp_stats.zip).

Code:
uv3h    uv3     uv2     bad     good    user ID/computer ID
----    ---     ---     ---     ----    -------------------
14      14      8       1       3       CHS,c925bba2a
11      11      17      0       2       DarekM,DAREKM12
12      12      18      1       1       K_R,katie11
11      11      15      0       1       S110459,xeon1
13      13      14      0       1       S18163,Whizkey
11      11      6       0       0       S47997,Zappa
13      13      13      0       13      Straker,Moonbase
11      11      18      0       0       Woodburner,woodbury2a
11      11      17      0       1       balu,balu
12      12      5       0       0       bourit,info11
14      14      15      0       0       cowanm,babbage7b
11      11      34      0       2       ddarknell,Atlas
11      11      14      0       0       fzttk8,C70BB7B21
12      12      41      0       3       gczajkow,il09P4
11      11      86      0       1       imad,celia
15      15      10      0       1       maathome,AMD3
13      13      13      0       9       stigmv,Frodo
Consider for instance the machine imad/celia: there are 0 bad results and 1 good result, so from this alone we can't really tell whether it's an error-prone machine. But note that it has 11 uv3 results, and 100% of them are also uv3h results. We now know that each uv3h result has about a 92.5% chance of being bad. Thus we can identify imad/celia as an error-prone machine, and release its 86 uv2 exponents for early double-checking.
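As a rough check, the extended case 3 test from the first post, (bad + uv3h) / (bad + uv3 + good) >= 0.333 and (bad + uv3h) >= 2, can be applied to rows of the table. A minimal Python sketch, assuming whitespace-separated columns in the order shown; the three rows are copied from the table, the helper names are illustrative, and 0.925 is the bad rate measured above.

```python
ROWS = """\
14 14 8 1 3 CHS,c925bba2a
11 11 86 0 1 imad,celia
13 13 13 0 13 Straker,Moonbase
"""

UV3H_BAD_RATE = 0.925  # 370 bad out of 400 verified uv3h results


def parse(line):
    """Split one table row into (machine, uv3h, uv3, uv2, bad, good)."""
    uv3h, uv3, uv2, bad, good, machine = line.split()
    return machine, int(uv3h), int(uv3), int(uv2), int(bad), int(good)


def case3_ratio(uv3h, uv3, bad, good):
    """(bad + uv3h) / (bad + uv3 + good), the case 3 ratio."""
    return (bad + uv3h) / (bad + uv3 + good)


results = {}
for line in ROWS.splitlines():
    machine, uv3h, uv3, uv2, bad, good = parse(line)
    ratio = case3_ratio(uv3h, uv3, bad, good)
    results[machine] = {
        "ratio": ratio,
        "flagged": ratio >= 0.333 and (bad + uv3h) >= 2,
        "expected_bad": UV3H_BAD_RATE * uv3h,  # rough expected count of bad uv3h results
        "uv2": uv2,                            # exponents that would be released
    }

for machine, r in results.items():
    print(f"{machine}: ratio={r['ratio']:.3f} flagged={r['flagged']} uv2={r['uv2']}")
```

For imad/celia the ratio is 11/12, far above the 0.333 threshold, even though its verified record (0 bad, 1 good) says almost nothing on its own.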

So, to summarize, if we do another release of early double-checking exponents some time this month, I think we can confidently include case 3 (see the first post in this thread for explanation of cases 1 to 3).

The more thoroughly we do early double-checking of error-prone results, the more confident we will be that there is no missing Mersenne prime below the current M40.
Attached Files
File Type: zip uv3h_verified.zip (15.4 KB, 161 views)

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.