![]() |
Error risk after doublecheck
I have some questions regarding the LL-error rate and the safety feature of the res64. Just curious. I'm not good in calculating probabilities...
Here we go: - What's the chance that we have at least one undetected matching but still false doublecheck in the LL database? So both residues are x but should be y. - How many exponents would we have to test so the chance from the question above would rise above 1% ? |
My understanding (and I hope to be corrected if I am wrong) is that the res is a 16 digit Hex number. To get 2 to match accidentally would take odds of (well) over 1 x 10^40, which is the point that it is considered to be impossible. And since the projected starting error rates are ~1%, you need to square that 0.01% chance that the first two res turned in are both wrong.
Ain't going to happen. Unless both were run on thee same error producing machine. |
[QUOTE=MrHappy]How many exponents would we have to test so the chance from the question above would rise above 1% ?[/QUOTE]
I make it 9.3 x 10^18, based on Uncwilly's 1% failure figure. Dave |
Let's see where (or how) my math breaks down.
16 digits each representing 4 bits. 16 * 4 = 64 bit residue. 2^64 = 18446744073709551616 (total different potential residues) 18446744073709551616^2 = 3.4028e+38 (total different combinations of 2 residues) divide by 0.0001 chance both tests having errors and we get: 3.4028e+42 |
[QUOTE=Uncwilly]Let's see where (or how) my math breaks down.
2^64 = 18446744073709551616 (total different potential residues) 18446744073709551616^2 = 3.4028e+38[/QUOTE] It breaks down here. You have calculated the probability that two independent failures will have a residue of "192." But a matching residue of 191 or 193 or 1234567890 would also count, so you must multiply by (2^64-1) for all the possible wrong but unrecognized residues. Alternatively, you can calculate as First error * Second Error * second residue matches first residue |
First, I would like to point out that it seems Uncwilly and dave_dm have made an unspoken assumption: that an erroneous residue is equally likely to take on any hexadecimal value. This assumption is necessary to come up with figures at all (unless a more realistic model is available), but it should be kept in mind that there may be some types of CPU errors may not fit this assumption.
In any case, this is how we can come up with the figures. Suppose we actually did two incorrect LL tests of the same exponent. Let x be the residue of the first test. For the second residue, there are 16^16 possible values for the residue (because there are 16 "digits" and each "digit" has 16 possible values). Thus, the probability that both results are the same is: 1/(16^16) ~= 5.42101086242752217003726400434971 * 10^-20 Now, we can use a theorem in probability. If the probability that an outcome occurs in one trial is p, then the probability that it doesn't happen is 1 - p. If this trial occurs n times, the probability that the outcome never occurs is (1 - p)^n. Thus, the probability that it occurs at least once is 1 - (1 - p)^n. According to [url]http://www.mersenne.org/status.htm[/url], GIMPS has double-checked a total of 315,179. So for your first question, p = 1/(16^16) and n = 315,179. According to Mathematica, the probability of having at least one such error due to chance is thus roughly: 1.7085887826090294136983459361845392 * 10^-14 That's less than one in 58.5 trillion, nothing to get too worried about (if you trust the assumption I pointed out at the beginning). Now, to answer your second question, we solve for n in the equation: 1 - (1 - p)^n = 0.01 where p = 1/(16^16) (1 - p)^n = 0.99 take the log of both sides: n*log(1 - p) = log 0.99 n = (log 0.99)/(log (1 - p)) Mathematica gives: 1.853959733443683384907575121974264716245215072 * 10^17 Assuming GIMPS double-checked every possible Mersenne number with a prime exponent in order from (2^2) - 1 upwards, in order for the probability of an error to reach 1%, GIMPS would need to search every such exponent up to about: M(7.87 * 10^18) In my opinion, a far greater threat is that a user will deliberately submit incorrect residues to get GIMPS credit. To do this, he could choose any residue he wanted, submit it once manually as a first time LL test, then resubmit it manually as a double-check. This could be difficult to catch, especially if he created many user accounts, and never submitted a first time test and its double check using the same account. |
Isn't there the little "check sum" or whatever it is at the end of each result line?
I am not exactly sure what it is used for but I always assumed it was there so that people couldn't submit "made up" results. |
[QUOTE]Isn't there the little "check sum" or whatever it is at the end of each result line?[/QUOTE]
The checksum can be forged. |
[QUOTE=jinydu]First, I would like to point out that it seems Uncwilly and dave_dm have made an unspoken assumption: that an erroneous residue is equally likely to take on any hexadecimal value. This assumption is necessary to come up with figures at all (unless a more realistic model is available), but it should be kept in mind that there may be some types of CPU errors may not fit this assumption.[/QUOTE]
Yup :) The figure I worked out used the same argument as jinydu but instead I worked out the probability of an undetected hardware error, having misparsed the question. So with p = 0.01, the probability of two residues matching but some hardware error occuring is q = 2 * p * (1 - p) * 2^(-64) + p^2 * 2^(-64) i.e. {one hardware error} + {two hardware errors} Then log(0.99) / log(1 - q) gives the 9.3 x 10^18. Dave |
Ok, so 10^18 is "Ain't gonna happen", while 10^40 is mathametically impossible. :)
|
[QUOTE=jinydu]In my opinion, a far greater threat is that a user will deliberately submit incorrect residues to get GIMPS credit. To do this, he could choose any residue he wanted, submit it once manually as a first time LL test, then resubmit it manually as a double-check. This could be difficult to catch, especially if he created many user accounts, and never submitted a first time test and its double check using the same account.[/QUOTE]
Isn't someone going to come to defend the current system? Seriously, there must be some safeguard against this, right? |
[QUOTE=jinydu]Isn't someone going to come to defend the current system? Seriously, there must be some safeguard against this, right?[/QUOTE]1) The security code (not really a checksum) is the principal safeguard. But the security code can be faked.
2) Behind the scenes, there is triple-checking (of all "verified" [matching first-time and doublecheck] results) going on by one or more folks, but its current progress is way behind the trailing edge of doublechecking, and it'll be several years before a current double-fake is detected. But real mathematicians are patient. :-) |
But all verified exponents are ultimately triple-checked by trusted people, right?
At least its only possible to fake a composite. A prime is checked several times immediately, so any attempt to fake it would be caught immediately. Still, it would be sad if someone faked an exponent as composite, but it was in fact prime, and that went undiscovered for years. |
The triple-checking effort is only done on exponents that were tested and double-checked by the same userid.
|
Then what safeguard is there against the potential problem I mentioned above?
|
There are only 2 safeguards against reporting a false composite and matching double-check.
1) The security code or checksum is hard to forge. This is the only source code that is not published. However, anyone handy with a disassembler could fake it. 2) There is no glory in pulling off the stunt. You won't get famous. You can't really climb the stats chart because you have to use different userids for the two tests -- and if tens or hundreds of tests and doublechecks came in from the same user that would be suspicious too. Any ideas for improving security are, of course, welcome |
[QUOTE=cheesehead]Behind the scenes, there is triple-checking (of all "verified" [matching first-time and doublecheck] results) going on by one or more folks,[/QUOTE][quote=Prime95]The triple-checking effort is only done on exponents that were tested and double-checked by the same userid.[/quote]Oh, rats. That's what I get for not going back to check the Mersenne Digest postings I recalled. I still can't find my copy right now, but I think what I referred to above was a triple-checking of all exponents for which GIMPS had only one 64-bit residue recorded. That is, folks [Brian Beesley and others?] were running another GIMPSian LL test on all the low exponents that had been tested before GIMPS then doublechecked by GIMPS, but for which all pre-GIMPS tests had recorded residues of fewer than 64 bits (and the GIMPS doublechecks had matched as many residue bits as were recorded for the earlier tests).
|
Unfortunately, I can't think of a surefire way of catching someone who tried such a strategy. The best method I can think of is for trusted people to do regular, random triple-checks of double-checked exponents. Then, if any discrepancies are confirmed, put extra scrutiny on the two offending user accounts. If you find consistently incorrect residues in a small group of accounts that are always checking each other, that could be cause for suspicion.
|
[QUOTE=cheesehead]Oh, rats. That's what I get for not going back to check the Mersenne Digest postings I recalled. I still can't find my copy right now, but I think what I referred to above was a triple-checking of all exponents for which GIMPS had only one 64-bit residue recorded.[/QUOTE]
You're right. Brian makes sure all exponents have two 64-bit residues returned. He may have completed that project by now. He also does the triple-checking I mentioned. |
Yes, the project for making sure every exponent had matching 64-bit residues was completed almost two years ago. The project for triple-checking exponents with results returned by the same userID is ongoing but is keeping pace with the doublechecking effort.
|
| All times are UTC. The time now is 22:32. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.