mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Data (https://www.mersenneforum.org/forumdisplay.php?f=21)
-   -   Newer milestone thread (https://www.mersenneforum.org/showthread.php?t=13871)

Madpoo 2015-02-20 18:32

[QUOTE=tha;395825]Hmmm? As far as I know this number should never be able to go up, unless it is a correction of an error. Certainly new assignments or expirations should not be able to let this number go up.[/QUOTE]

Oh, you're right...at the time I was updating some info there and thinking about unassigned exponents so I got that stuck in my brain.

But yeah, it could be that some double-checks weren't matching resulting in the need for triple-checks. But that'd be a heckuva lot.

I just checked, and there are 1,733 exponents up to 37156667 that need triple-checking.

Speaking of, I kind of thought it would be cool to have another assignment type available - "triple checking". Admittedly the available exponents for that type would be limited, but for exponents where a first and second time check didn't match, I think it'd be nice to know sooner rather than later which one was right. Otherwise I think the exponent goes back into the general pool for double-checks and it may still be a while before the double-check wavefront catches up.

In my head I had an idea to see if there's a correlation between certain users/computers and bad results. The more triple-checks get done, the more data to work with... it came up because I was doing some specific triple-checking and noticed that the "losers" in several of them were all from the same user...enough so that I thought it might even make sense to look closer at other results from that person and re-test them earlier even if they weren't already flagged as "suspect". But then my sample set was only about 3-4 triple-checks so I may be reading too much into it. :smile:

EDIT: On reflection, some of those 1,733 may have reported an error during the first-time check but no second check has actually been done yet. When a double-check comes in, it may actually match. I could change my query to account for that but I'm too lazy right now. :)

EDIT #2: I guess I'm not that lazy after all... it really is 1,733 needing triple-checks. That of course could change if some that haven't been double-checked yet result in the need for a triple-check. I even found a handful that need quadruple (or more) checks. Most or all of those seem to result from duplicates from the v4 migration, where the same suspect result is recorded more than once, so what looks like several matching results is really one suspect result repeated. I'm testing one or two to find out for sure.

tha 2015-02-20 18:48

As far as I understand it the number of 'Countdown to proving .... is the ... Mersenne prime' numbers should go down only when a matching double check is in. So the need for a triple check or anything else should not relate to these numbers. I'd like to hear if the implementation is anything else.

TheMawn 2015-02-20 19:09

[QUOTE=Madpoo;395929]
Speaking of, I kind of thought it would be cool to have another assignment type available - "triple checking". [/QUOTE]

Or perhaps a more general case of "Suspicious Results?" I actually really like that idea.

If there ever was any value in saying that "Every Exponent up to XX,XXX,XXX has been tested at least once," then whatever value there was completely evaporates when we realize that however many hundred results have error codes.

Frankly, getting the suspicious results cleaned up asap is of great value to anyone wondering about the integrity of their hardware.


My recommendation would be that the reliability and confidence of a CPU should be high for it to be assigned suspicious results. Further, the reliability of the CPU should not be affected as harshly when it returns a mis-matched residue for "Suspicious Results" work.

Madpoo 2015-02-20 19:25

[QUOTE=tha;395934]As far as I understand it the number of 'Countdown to proving .... is the ... Mersenne prime' numbers should go down only when a matching double check is in. So the need for a triple check or anything else should not relate to these numbers. I'd like to hear if the implementation is anything else.[/QUOTE]

Well crumb, you're right. I had to look at where it's getting that count from (I didn't write it, so I wasn't terribly familiar except in passing).

Basically, it's a simple count of how many exponents below that one are currently marked as "unverified" in the database.

Things that can make a result go from "unverified" to something else are:
- A factor is found for it, in which case all previous LL results for that exponent are changed to a "factored" state
- A successful double-check, and both the 1st/2nd checks are marked as "verified"
- A double-check comes in that doesn't match -- I'd have to check the code to see what happens there, but one or the other could be changed to "unverified / suspect" (of which there are those 1,733 currently)
- A matching triple-check comes in... the odd result(s) out gets marked as "bad" and the matching pair get marked as "verified"
- A mismatching triple-check comes in which is basically treated the same as a mismatched double-check...i.e. some may get marked as "suspect".
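Those transitions can be sketched roughly as below. This is only an illustration under assumptions, not the actual server code: the function name `classify` and the status strings are made up here, and the mismatch branch (marking every result "suspect") is a guess, since I haven't checked what the real code does in that case.

```python
from collections import Counter

def classify(residues, factored=False):
    """Return one status per reported LL residue for a single exponent.

    Hypothetical sketch: names and the mismatch rule are assumptions.
    """
    if factored:
        # A factor makes every earlier LL result for the exponent moot.
        return ["factored"] * len(residues)
    if len(residues) == 1:
        # Only a first-time check exists so far.
        return ["unverified"]
    counts = Counter(residues)
    winner, hits = counts.most_common(1)[0] if counts else (None, 0)
    if hits >= 2:
        # A matching pair exists: it is verified, the odd results are bad.
        return ["verified" if r == winner else "bad" for r in residues]
    # No two residues agree (assumption: all of them are flagged suspect).
    return ["suspect"] * len(residues)

print(classify(["A", "B", "A"]))
```

Under these assumptions a mismatched double-check leaves the exponent waiting for a triple-check, and a matching triple-check settles it.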

So... yeah, the only reason for that number to go up is if the number of "unverified" results went up, which it shouldn't really. The only way the # of unverified results would increase is if a double-check came in and somehow that new one *and* the old one both got marked suspect for some reason. I don't know how the code handles those situations though... like, does only a result with certain types of error codes get marked suspect? Or what if two seemingly error-free runs have a mismatch, how would it know which one it thinks is suspect, or does it just consider them both as "unverified"?

And yes, the SQL query does a "distinct" clause so it's not counting the same exponents multiple times in case of that. :)
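For anyone curious what a count like that looks like, here is a minimal sketch using Python's built-in `sqlite3`. The table name `ll_results`, its columns, and the exponents are all made up for illustration; the real database schema is not shown in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per reported LL result.
conn.execute("CREATE TABLE ll_results (exponent INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO ll_results VALUES (?, ?)",
    [
        (1000003, "unverified"),
        (1000003, "unverified"),  # duplicate row for the same exponent
        (1000033, "verified"),
        (1000037, "unverified"),
        (2000003, "unverified"),  # above the limit used below
    ],
)

def countdown(conn, limit):
    """Count distinct exponents below `limit` that are still unverified."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT exponent) FROM ll_results "
        "WHERE status = 'unverified' AND exponent < ?",
        (limit,),
    ).fetchone()
    return row[0]

print(countdown(conn, 1500000))
```

The `DISTINCT` is what keeps the duplicated row from being counted twice.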

Anyway, this does deserve a little digging I suppose, although I wouldn't deem it critical...but it is curious.

Madpoo 2015-02-20 20:03

[QUOTE=Madpoo;395939]Well crumb, you're right. I had to look at where it's getting that count from (I didn't write it, so I wasn't terribly familiar except in passing)....
Anyway, this does deserve a little digging I suppose, although I wouldn't deem it critical...but it is curious.[/QUOTE]

Turns out this has a relatively mundane explanation. The numbers I was looking at came from my mocked up milestone page with some additional info.

And, not only that, but the count of checks to finish up double-checking the known Mersenne numbers is done in an unusual (to me) way, where it's taking the result of the previous count and adding it to the next one.

Why that matters is that when we finished double-checking M44 recently, the code that tabulates M45's countdown was still adding in the result of the previous query on the page, which happened to be using the same variable name.

Oops.

Once I fix that you should expect to see the countdown for M45-M48 drop by whatever the previous countdown number was on the page. For the normal milestone page, that means it'll drop by 386 since that's the 34M double-check count right now. For the mocked up page that had a countdown to first-time checking to 58M, the difference was even higher.

That also cleared up a mystery for me in the Mxx countdowns since the individual queries were only counting exponents between it and the previous prime, not the total from the "double-check up to" number. That may make the SQL query marginally faster for each one but when I was looking at those things just now it did have me wondering, thus I noticed the incrementing counter.
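A stripped-down illustration of that kind of bug, in Python rather than the site's actual code: the function names and the per-range counts are invented, and only the 386 comes from the numbers above.

```python
def countdown_buggy(counts_between_primes, earlier_count):
    """Running totals that start from a leftover value instead of zero."""
    count = earlier_count  # stale result of an earlier query, same variable name
    totals = []
    for n in counts_between_primes:
        count += n  # each Mxx countdown adds its own range to the running total
        totals.append(count)
    return totals

def countdown_fixed(counts_between_primes):
    """Same running totals, but the counter is reset before use."""
    count = 0
    totals = []
    for n in counts_between_primes:
        count += n
        totals.append(count)
    return totals

ranges = [120, 45, 300, 10]  # made-up counts for M45..M48
buggy = countdown_buggy(ranges, 386)
fixed = countdown_fixed(ranges)
print([b - f for b, f in zip(buggy, fixed)])
```

Every countdown is inflated by the same leftover amount, which matches the expected drop once the counter is reset.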

Enjoy the fixed countdown for M45 and beyond...sorry for the muck up. :smile:

Madpoo 2015-02-20 21:42

[QUOTE=TheMawn;395937]Or perhaps a more general case of "Suspicious Results?" I actually really like that idea.

If there ever was any value in saying that "Every Exponent up to XX,XXX,XXX has been tested at least once," then whatever value there was completely evaporates when we realize that however many hundred results have error codes.

Frankly, getting the suspicious results cleaned up asap is of great value to anyone wondering about the integrity of their hardware.


My recommendation would be that the reliability and confidence of a CPU should be high for it to be assigned suspicious results. Further, the reliability of the CPU should not be affected as harshly when it returns a mis-matched residue for "Suspicious Results" work.[/QUOTE]

There are some interesting things I'm seeing, regarding some users and a very high propensity for bad results.

I wrote a query and limited it to accounts that have returned at least 100 LL results (to avoid some that have only returned a handful that were all bad, thus a very high percentage of bad/good results).

There are 1027 accounts that have returned at least 100 LL checks and at least one of them wound up being bad. That's not terribly surprising for some of the prolific accounts like CurtisC which has 109,771 total and 63 bad ones. That's an awesome measure of quality of their work, just 0.06% error rate.

On the other end of the scale we have a user who has checked in 121 results of which 80 were bad (66.12% failure rate).

The most prolific bad results came from a user where 194 of their 1010 total were bad. Lower percentage-wise at 19.21%, but that's still 194 exponents that needed a triple-check somewhere along the way.

I didn't look at v4 results or accounts since that's a little different to group together, but this was interesting enough.
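The percentages above are easy to reproduce. Here is a small Python sketch of the same filter and ranking, not the actual query I ran: the function name and the placeholder account names `user_a`, `user_b` and `user_c` are made up, while the totals for the first three rows are the ones quoted above.

```python
def error_rates(accounts, min_results=100):
    """Rank accounts by percentage of bad LL results.

    Keeps only accounts with at least `min_results` results and at least
    one bad result, mirroring the filter described above.
    """
    rows = []
    for name, total, bad in accounts:
        if total >= min_results and bad >= 1:
            rows.append((name, total, bad, round(100.0 * bad / total, 2)))
    return sorted(rows, key=lambda r: r[3], reverse=True)

accounts = [
    ("CurtisC", 109771, 63),
    ("user_a", 121, 80),    # placeholder name
    ("user_b", 1010, 194),  # placeholder name
    ("user_c", 5, 5),       # hypothetical account, dropped by the 100-result floor
]

for name, total, bad, pct in error_rates(accounts):
    print(f"{name}: {bad}/{total} bad ({pct}%)")
```

The 100-result floor is what keeps tiny accounts with a handful of bad results from dominating the top of the list.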

TObject 2015-02-20 21:58

It may actually be better to look one level below, at user computers. For example, I know I have some top-notch enterprise level machines with ECC memory, and at the same time I have some crappy desktops, that reboot all the time (I actually took the latter category off-GIMPS, but I still have some better desktop grade machines crunching).

Furthermore, one of my computers developed a memory problem; and the fact that I started to receive Prime95 error codes, is how I became alerted to it. Replaced the memory, and the computer has been crunching error-free since then.

Anyway, my point is, this sort of statistic may be more useful when tied to a computer rather than a user.

Madpoo 2015-02-20 22:45

[QUOTE=TObject;395949]It may actually be better to look one level below, at user computers. For example, I know I have some top-notch enterprise level machines with ECC memory, and at the same time I have some crappy desktops, that reboot all the time (I actually took the latter category off-GIMPS, but I still have some better desktop grade machines crunching).

Furthermore, one of my computers developed a memory problem; and the fact that I started to receive Prime95 error codes, is how I became alerted to it. Replaced the memory, and the computer has been crunching error-free since then.

Anyway, my point is, this sort of statistic may be more useful when tied to a computer rather than a user.[/QUOTE]

I suppose so...

If I group by computer instead of user, there are some pretty bad ones. Worst is a computer with 109 out of 152 bad results (71.71%).

I wonder if people with such a bad track record are actually aware of it? Perhaps not since they might not realize their results are bad until a triple-check much later. Unless they look at their account results periodically and see just how many are bad, they wouldn't know.

chalsall 2015-02-20 22:45

[QUOTE=TObject;395949]Anyway, my point is, this sort of statistic may be more useful when tied to a computer rather than a user.[/QUOTE]

+1! (One factorial... :wink:)

Madpoo 2015-02-20 23:17

[QUOTE=Madpoo;395951]I suppose so...

If I group by computer instead of user, there are some pretty bad ones. Worst is a computer with 109 out of 152 bad results (71.71%).

I wonder if people with such a bad track record are actually aware of it? Perhaps not since they might not realize their results are bad until a triple-check much later. Unless they look at their account results periodically and see just how many are bad, they wouldn't know.[/QUOTE]

A little more detail on that particular bad computer...

- Of the 109 bad results, 56 of those had a zero for the error code.
- Of the 43 non-bad results:
  - 4 are still unverified, awaiting a double-check
  - 17 are verified okay (double-check matched)
  - 21 are suspect - some error code, but they haven't been double-checked yet
  - 1 had a factor found later, so who the heck knows, but there were 2 mismatched LL tests done...I have my guess which one was bad :smile:

This particular computer checked in its last result in July 2012, although the user was active with other computers up to May 2014 with a much better track record.

EDIT: As you might guess, the 25 exponents this computer checked in that haven't been double-checked yet should be considered highly suspect... if it were me I'd figure out a way to bump these up the priority list so they're double-checked earlier than usual. Doing that on some grand scale where it takes a computer's failure rate into account would be pretty cool, but making that happen could be...interesting...from a coding point of view.
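One very rough way to think about that prioritization, sketched in Python. Everything here is hypothetical: the function, the computer labels, and the exponents are invented, and only the 109-out-of-152 figure comes from the computer described above.

```python
def prioritize(pending, computer_stats, min_results=100):
    """Order pending double-checks so results from unreliable computers come first.

    pending: list of (exponent, computer) pairs awaiting a double-check.
    computer_stats: dict mapping computer -> (total_results, bad_results).
    Computers with fewer than `min_results` results get a rate of 0.0
    (assumption: too little history to judge).
    """
    def failure_rate(computer):
        total, bad = computer_stats.get(computer, (0, 0))
        if total < min_results:
            return 0.0
        return bad / total

    return sorted(pending, key=lambda p: failure_rate(p[1]), reverse=True)

computer_stats = {
    "pc_bad": (152, 109),  # the 71.71% computer
    "pc_good": (500, 1),   # made-up reliable machine
    "pc_new": (3, 3),      # made-up machine with too little history
}
pending = [
    (50000017, "pc_good"),
    (50000021, "pc_bad"),
    (50000047, "pc_new"),
]

for exponent, computer in prioritize(pending, computer_stats):
    print(exponent, computer)
```

The hard part in practice would be wiring something like this into the assignment rules, which this sketch does not attempt.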

Madpoo 2015-02-20 23:32

[QUOTE=Madpoo;395954]

EDIT: As you might guess, the 25 exponents this computer checked in that haven't been double-checked yet should be considered highly suspect... if it were me I'd figure out a way to bump these up the priority list so they're double-checked earlier than usual. Doing that on some grand scale where it takes a computer's failure rate into account would be pretty cool, but making that happen could be...interesting...from a coding point of view.[/QUOTE]

FYI, it seems like they get marked suspect when a double-check runs and it didn't match... the one with an error code seems like it's marked suspect? I don't know...just guessing.

So really of those 25 unverified or suspect results, 21 of those have been double-checked and now need a triple-check. Only 4 have never been double-checked. I could share what those 4 are but then you'd see whose account it is I'm talking about. :smile:


All times are UTC.