#133 | |
|
Serpentine Vermin Jar
Jul 2014
6361₈ Posts |
Quote:
There are false positives, for sure, where a test has returned "is prime" even though it isn't. And we catch those. Although we haven't seen any, there can be false negatives as well. And that's why we do double-checks and compare the 64-bit residue of that final iteration.
Barring something going terribly awry with the LLR itself, it's highly improbable for two false negatives to arrive at the same 64-bit residue. Maybe different ones, but surely not the same one. This happens often with a regular "is composite" where a triple or even more tests are needed because the first couple didn't match.
The tricky bit, which the programmers would weigh in on, is whether it's possible for some flaw in the LLR itself to "miss" a prime result, and do so in a repeatable way. Some odd quirk in either the implementation, the hardware, the chipset, etc. I think issues like that are tested for and can be caught (roundoff errors, etc). The different code bases go through a lot of testing.
And there was that one time way back when that a bug in Prime95 was discovered and the tests from that version were re-done. Don't remember what version...long time ago. EDIT: found it... it was version 17.
Last fiddled with by Madpoo on 2018-12-19 at 07:12 |
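As a rough illustration of the double-check described above, here is a minimal sketch of a Lucas-Lehmer test that reports the low 64 bits of the final residue as a hex string, so two independent runs can be compared. The function names are hypothetical and the plain big-integer arithmetic is nothing like the FFT-based code in Prime95 or the GPU programs; it only shows what is being compared.

```python
def lucas_lehmer_residue(p: int) -> int:
    """Final Lucas-Lehmer term s_(p-2) mod 2**p - 1, for an odd prime p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s


def res64(residue: int) -> str:
    """Low 64 bits of a residue as 16 upper-case hex digits."""
    return f"{residue & 0xFFFFFFFFFFFFFFFF:016X}"


def double_check_matches(first_res64: str, second_res64: str) -> bool:
    """A result is only accepted once two runs report the same res64."""
    return first_res64 == second_res64


if __name__ == "__main__":
    for p in (67, 89):
        r = res64(lucas_lehmer_residue(p))
        verdict = "prime" if r == "0" * 16 and lucas_lehmer_residue(p) == 0 else "composite"
        print(f"M{p}: res64={r} ({verdict})")
```

A zero residue means the Mersenne number is prime; any other value means composite, and the res64 is the fingerprint the second run has to reproduce.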
|
|
|
|
|
|
#134 | |
|
∂2ω=0
Sep 2002
República de California
19·613 Posts |
Quote:
Lastly, we've found far more M-primes than statistically expected - were we missing a bunch (or even a handful) of primes with p == 3 (mod 4), the actual prime count would be *really* out of whack vs the theory. But I can see where you are coming from - it's alas an aspect of proof-by-computer that only a handful of people (via their written code) are in a position to check a given result. It's a far cry from the days when any kid could cut some squares and rectangles of proper size & shape and reassemble them so as to replicate Pythagoras' proof of the theorem which now bears his name. Last fiddled with by ewmayer on 2018-12-19 at 07:47 |
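To put a number on "statistically expected", one common yardstick is the Lenstra-Pomerance-Wagstaff heuristic, which predicts about e^γ · log2(x) Mersenne primes with exponent up to x. The post does not say which estimate it has in mind, so treating it as this heuristic is an assumption, and the function name below is hypothetical.

```python
import math

# Euler-Mascheroni constant
EULER_GAMMA = 0.5772156649015329


def expected_mersenne_primes(max_exponent: int) -> float:
    """Heuristic count of Mersenne primes 2**p - 1 with p <= max_exponent
    (Lenstra-Pomerance-Wagstaff conjecture; an assumption, not from the post)."""
    return math.exp(EULER_GAMMA) * math.log2(max_exponent)


if __name__ == "__main__":
    # 82589933 is the exponent of the Mersenne prime announced in December 2018;
    # 51 Mersenne primes were known at that point.
    print(f"{expected_mersenne_primes(82_589_933):.1f}")
```

Under this heuristic each doubling of the exponent range adds about e^γ ≈ 1.78 expected primes, so a handful of missed primes would show up as a visible shortfall.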
|
|
|
|
|
|
#135 | |||||
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
152B₁₆ Posts |
Quote:
We thought we were doing all that could be done. Errors creep through. New approaches get identified, implemented, tested, and rolled out, such as the Jacobi test for LL, and adoption of PRP3 with Gerbicz check. Bugs occasionally get found after release, identified, disclosed, fixed, etc. We'll keep iterating, leaving no stone unturned.
My crude diagram for the ancestry of the various codes is attached at https://www.mersenneforum.org/showpo...04&postcount=5 (Corrections or additions are invited. My own independently developed code begun in the 1980s is not included, because its run time scaling is terrible. Trial factoring inefficiently, and LL testing via schoolboy multiplication with integer arrays in C, was as far as I got before I learned of the far more advanced, much faster prime95 and the collaborative systematic search operating through a mailing list and range assignments, and quit a solo effort on my own slow code and a handful of systems.)
It's my understanding, from reviewing and summarizing the various authors' descriptions to produce that parentage diagram, that gpuowl, gpulucas, and Mlucas are unrelated to the rest and to each other, and prime95 is "loosely" derived from lucdwt. Lucdwt and MacLucasUNIX are identified as the common ancestors of cllucas, glucas, and CUDALucas. So confirming new finds by prime95, CUDALucas, Mlucas, and gpuowl is close to as unrelated as possible. Substituting gpulucas for CUDALucas, or adding gpulucas, might help slightly. (I haven't attempted thoroughly reviewing and comparing the source codes for degree of commonality. Haven't run gpulucas.) Quote:
https://www.mail-archive.com/mersenn.../msg07476.html shows LL test error rates used to be higher than they are now averaging. Quote:
Quote:
Quote:
We know from some gpu runs that some bugs/misconfigurations will preferentially stabilize on a specific wrong res64 result, not a random wrong one. One such value is a false positive, as Madpoo has long known and dealt with. So that's an existence proof of nonrandom result from error, that occurs despite nonzero offset. A patch to detect and halt such runs was added. (See item 4 in the CUDALucas bug and wish list attached at https://www.mersenneforum.org/showpo...24&postcount=3) One promising take on the recent history of discoveries is we've been the beneficiaries of catching up to the expected number. See George's post earlier. https://www.mersenneforum.org/showpo...&postcount=204 Last fiddled with by kriesel on 2018-12-19 at 17:34 |
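For readers unfamiliar with the "offset", here is a minimal sketch of a Lucas-Lehmer test run with a bit shift, assuming the offset in the post refers to the shift count that LL programs apply to the starting value so that two runs exercise different bit patterns. The function name is hypothetical. The point is that an honest run gives the same unshifted residue for every shift, so an error that lands on the same res64 regardless of shift is not behaving like random corruption.

```python
def ll_residue_with_shift(p: int, shift: int) -> int:
    """Lucas-Lehmer residue for 2**p - 1, computed on a value rotated left
    by a shift count, then unshifted at the end."""
    m = (1 << p) - 1
    k = shift % p
    x = (4 << k) % m                 # x = s * 2**k with s = 4
    for _ in range(p - 2):
        k = (2 * k) % p              # squaring doubles the shift (2**p == 1 mod m)
        x = (x * x - (2 << k)) % m   # the constant 2 must carry the same shift
    # Undo the shift: multiplying by 2**(p - k) gives s * 2**p == s (mod m)
    return (x << ((p - k) % p)) % m


if __name__ == "__main__":
    p = 67
    residues = {ll_residue_with_shift(p, shift) for shift in range(p)}
    print(len(residues))  # every shift yields the same final residue
```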
|||||
|
|
|
|
|
#136 |
|
Mar 2016
347₁₀ Posts |
A peaceful and pleasant evening,
I am happy about the new Mersenne prime, just in time for Christmas, and congratulations to all who have made this success possible. Greetings, Bernhard |
|
|
|
|
|
#137 | ||
|
Serpentine Vermin Jar
Jul 2014
CF1₁₆ Posts |
Quote:
Quote:
We had another false positive come in from a cudalucas 2.05.1 version just a couple days ago (in a DC range). When we see those, once we see the version is the "bad" one we just ignore it. Far too common, sadly, to spark our interest. That version is the version that cried wolf and now I simply refuse to believe it anymore. Truly, anyone still using that version of cudalucas had better update because if you ever did find a prime, nobody would believe you until it was double-checked on a non-buggy version. LOL
As far as getting some repeatable, but false, final residue that looked more believable, not "2" or "0", I'm drawing a blank on how that could happen. The only thing that comes to mind is a repeatable flaw in the add or mul that consistently gives the wrong result... something like the old Pentium fdiv bug, on the hardware side, or somewhere in software. The software side with its selection of the best FFT size is always doing checks to make sure there aren't any roundoffs, and other issues that are actually hardware related (overclocking that leads to memory issues, as an example) are random in nature.
I don't think I'm saying anything too shocking when I say that *all* incorrect residues that show up are the result of hardware... memory in particular, although knowing if it's CPU or memory is a little bit of a guess. Even if it's CPU related, it's probably because of an overheated/overclocked CPU giving bad L2/L3 cached data.
In my experience, systems with ECC memory have NEVER had issues. This goes for the systems I used to check with as well as the accounts that use AWS instances, or other people that I happen to know use servers with ECC. We're talking zero errors, and that's pretty impressive. In fact I think it'd be pretty cool if George added something to the code so that when the system stats are reported (things like CPU type/speed) it also includes whether ECC memory is installed.
I'm not sure if that's something that's generally available via a standard WMI query or not, but maybe... so it would be model agnostic to get such a datapoint. It's also interesting to see how a particular machine may "degrade" over time. They start out with great results, no problems, but as that same machine keeps working over the years, it gets a little flaky and starts spitting out bad residues. I imagine the owner of that system notices other problems because in general those systems don't stick around too long. They probably start blue-screening here and there, and I'm guessing it's when memory goes bad. Anyway, point being, bad residues occur, and as far as I can remember from looking at error rates from particular code or versions of software, nothing really stood out. Getting a repeatable error is possible, I suppose, although my gut tells me it's highly unlikely aside from a flaw in the CPU itself like that fdiv one I mentioned. |
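On the WMI question: Windows documents a `MemoryErrorCorrection` property on the `Win32_PhysicalMemoryArray` class, so a query along the lines of `SELECT MemoryErrorCorrection FROM Win32_PhysicalMemoryArray` looks like a candidate. Whether every board reports it accurately is not something this sketch can promise. The snippet below only decodes the documented numeric codes; the function names are hypothetical and nothing here is taken from Prime95.

```python
# Codes documented for Win32_PhysicalMemoryArray.MemoryErrorCorrection
# (they mirror the SMBIOS "Physical Memory Array" error-correction field).
MEMORY_ERROR_CORRECTION = {
    0: "Reserved",
    1: "Other",
    2: "Unknown",
    3: "None",
    4: "Parity",
    5: "Single-bit ECC",
    6: "Multi-bit ECC",
    7: "CRC",
}


def describe_error_correction(code: int) -> str:
    """Human-readable label for a MemoryErrorCorrection code."""
    return MEMORY_ERROR_CORRECTION.get(code, f"Unrecognized ({code})")


def reports_ecc(code: int) -> bool:
    """True when the firmware reports single-bit or multi-bit ECC."""
    return code in (5, 6)


if __name__ == "__main__":
    for code in (3, 6):
        print(code, describe_error_correction(code), reports_ecc(code))
```

A client could send the decoded label along with the CPU type and speed it already reports, treating "Unknown" and "Other" as no information rather than as "no ECC".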
||
|
|
|
|
|
#138 | |
|
If I May
"Chris Halsall"
Sep 2002
Barbados
9,767 Posts |
Quote:
When the kit can't do a DC reliably, it's time to put the "evil eye" on it. And then move mission critical services onto another server... I can't even begin to guess how much time and money running GIMPS has saved me over the last ~20 years.... Last fiddled with by chalsall on 2018-12-19 at 21:39 Reason: s/much much time/how much time/; # Words have meaning, humor is subjective |
|
|
|
|
|
|
#139 | ||
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts |
Quote:
As far as all erroneous residues being due to hardware, that seems from my testing clearly not the whole picture. There are ways to generate bad residues at will on hardware that is very solid otherwise, by using a particular CUDA level or threads setting in CUDALucas, exceeding the capabilities of a gpu in CUDAPm1 so stage 2 residues repeat every output, etc. Quote:
|
||
|
|
|
|
|
#140 | |
|
"Forget I exist"
Jul 2009
Dumbassville
26·131 Posts |
Quote:
|
|
|
|
|
|
|
#141 |
|
If I May
"Chris Halsall"
Sep 2002
Barbados
9,767 Posts |
|
|
|
|
|
|
#142 |
|
Serpentine Vermin Jar
Jul 2014
3,313 Posts |
Here's a quick count of "funny" residues that have a lot of zeroes. Note there are WAY more with a residue of zero than are genuinely prime. There are a few others that have a somewhat suspicious amount of leading zeroes, but that may actually be the case. I mean, technically, since these are only partial residues, we could find one where the last 64 bits really are zero, by chance, but the rest of the residue is non-zero. That could actually be the case here and there because I have seen some funny results (same user who submitted a lot of false positives) where it said "xxx is composite" but the partial residue was all zero... that machine of his definitely had an issue though.
Which, by the way, is why double-checks should *always and only* be done by a different user. If they have a flaky system that can generate repeatable, but wrong, results, then we need *independent* double-checks to catch those things. And which is why, even though I have limited capacity now, I still hunt those down and do an independent check of my own. Anyway, here's that list: Code:
Residue           Count
0000000000000000  311
0000000000000002  14
0000000000000003  1
0000000000000004  1
000000000000006C  1
0000000000000269  1
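To see why residues like these look suspicious, here is a small sketch that counts leading zero hex digits and estimates how many such values pure chance would produce, assuming a wrong or composite res64 behaves like a uniformly random 64-bit value. The function names and the ten-million result count in the example are hypothetical, not figures from the database.

```python
def leading_zero_hex_digits(res64: str) -> int:
    """Number of leading '0' characters in a 16-digit hex residue."""
    return len(res64) - len(res64.lstrip("0"))


def expected_by_chance(num_results: int, zero_digits: int) -> float:
    """Expected number of uniformly random res64 values, out of num_results,
    with at least zero_digits leading zero hex digits."""
    return num_results * 16.0 ** (-zero_digits)


if __name__ == "__main__":
    counts = {
        "0000000000000000": 311,
        "0000000000000002": 14,
        "0000000000000003": 1,
        "0000000000000004": 1,
        "000000000000006C": 1,
        "0000000000000269": 1,
    }
    for residue, count in counts.items():
        zeros = leading_zero_hex_digits(residue)
        # 10_000_000 is a made-up total, used only to show the scale
        chance = expected_by_chance(10_000_000, zeros)
        print(f"{residue}  count={count}  zeros={zeros}  by chance ~{chance:.1e}")
```

Even with a generous total, chance alone predicts far less than one residue with 13 or more leading zero digits, which is why each entry in the list deserves an explanation.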
|
|
|
|
|
#143 | |
|
Sep 2003
5×11×47 Posts |
Quote:
You fell victim to it yourself and reported it and George replied later in the same thread and fixed it. These results (and others with fewer leading zeros) were retroactively accorded "good" status. The 0000000000000003 result should also be changed from bad to good, since it's clearly a victim of the same thing (shift count was 162887 for exponent 162889). The 0000000000000004 result, on the other hand, had a shift count of 0, so it was a genuine bad result of some kind. PS, I don't know if it's possible to attach some kind of comment field to a database record, to explain for posterity why the status is set to good despite having a non-matching result. Below is the list of 64-bit residues marked good despite not being matching values. As I mentioned, Eivind Triel's result for 162889 should be added to it. The Myrmans I think had a bug in their residue-printing code, not sure about the others. This doesn't include all the residues of other-than-64 bits from Slowinski and others from the pre-GIMPS era, which are also marked good. Code:
8291,MadPoo,Manual testing,0000000013773232,,2015-04-11 05:58
12281,MadPoo,Manual testing,00001808E5C18969,,2015-04-11 05:57
731767,MadPoo,Manual testing,00000001F7189369,,2016-06-01 02:40
801883,MadPoo,Manual testing,00000001C201CCBD,,2015-04-11 15:52
1048507,Ernst W. Mayer,,B746F5680BFB3B27,,
1373483,Alex and Nick Myrman,,64F5D369DBA07793,,
1373501,Alex and Nick Myrman,,ABF49AB2AF916CDF,,
1373531,Alex and Nick Myrman,,AF8EF844F9474831,,
1373683,Alex and Nick Myrman,,30BFB9C3AEFC441F,,
1397983,Dwayne Towell,,F427ECDEE6EC189D,,
1397989,Dwayne Towell,,5A05B3487791000E,,
1398017,Dwayne Towell,,49A8806CB134DB28,,
1398049,Dwayne Towell,,3F8A270790967635,,
1398079,Dwayne Towell,,0BA8DA1E02FB903A,,
1398083,Dwayne Towell,,91A8568BCC0F0D00,,
1398427,Dwayne Towell,,F565D2176BFC2F4C,,
1398451,Dwayne Towell,,4803B9D97D901714,,
1398473,Dwayne Towell,,D2E7AD3ABBAA2148,,
1560953,Alex and Nick Myrman,,C3540B0D6FBB4034,,
1563277,Alex and Nick Myrman,,2C2F1357AA6986E7,,
1586707,Alex and Nick Myrman,,63ADFEF11B443BB2,,
1611131,Alex and Nick Myrman,,3C781C2CCDD1EF08,,
1653739,Alex and Nick Myrman,,D58C0B0E00C9B49E,,
1666387,Alex and Nick Myrman,,D8777EE4DA90A7F8,,
1666397,Alex and Nick Myrman,,091E42E39EFA24A9,,
1666409,Alex and Nick Myrman,,3AB476FA9FB0362A,,
1709377,whanlon,,353F2707EB55CB86,,
1798127,Alex and Nick Myrman,,71DAB0C3B7949AA0,,
1798409,Alex and Nick Myrman,,37E913D0172C6EF9,,
1874527,Alex and Nick Myrman,,94D1F78019D07BB8,,
1874797,Alex and Nick Myrman,,47DF9C1D4DAAF8B4,,
1874953,Alex and Nick Myrman,,CD1BFC19E87C597D,,
1907593,Alex and Nick Myrman,,1C2D318920D7D67D,,
2070071,Rob Hooft,,F8E2468605C42010,,
2070083,Rob Hooft,,8B2E7F39A8A4A724,,
2401579,Rob Hooft,,6D9ACDBF5B2E816D,,
2403971,Alex and Nick Myrman,,A53C50800A4494AE,,
2414081,Rob Hooft,,4D51566403FBF53E,,
2414099,Rob Hooft,,8370E0C747FC7D96,,
2414129,Rob Hooft,,B96148394C745EF6,,
2528969,Alex and Nick Myrman,,1C6CB11AB95BDF35,,
2816873,Alessandro Castagnini,,71E1061E98A2F864,,
2816927,Alessandro Castagnini,,6F69C92D75F22C25,,
2816941,Alessandro Castagnini,,3F6F56D3C63134DB,,
2816963,Alessandro Castagnini,,507DA633D4529624,,
3739199,Brian J. Beesley,alfalfa,12EF2F0EC9B15ADE,,
4059563,Phil Forrest,,754EA2C4ECF29DA6,,
4163539,whanlon,,DFE9112E5A56E23C,,
5643571,David Willmore,wopr,EFE193C46F4AC00D,,
5896207,David Willmore,wopr,2358E6E47DABA27F,,
37830997,Andrew Daniels,AHDHome,0000000000000269,,2008-02-03 00:00
Last fiddled with by GP2 on 2018-12-20 at 06:44 |
|
|
|
|