mersenneforum.org > Great Internet Mersenne Prime Search > Data
Old 2016-02-25, 19:04   #2311
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

CF1₁₆ Posts

Quote:
Originally Posted by ATH
The reason it is not exact is that it is only calculated once per day
Derp, that'd be right. I'm sure if I included the "cat 1" results that came in since midnight it would roughly fill the gap.
Old 2016-02-25, 19:08   #2312
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

2930₁₀ Posts

Quote:
Originally Posted by Madpoo
It wouldn't be a terrible idea to just automatically assign these cat 1 & 2 things to systems who meet certain criteria even if they haven't explicitly checked the box. If that happened I'd definitely exclude any systems that have had assignments expire in the past however many months.

I suppose if something like that worked well, we could just get rid of that "opt in" altogether.
I think that's a good idea.
Old 2016-02-25, 19:53   #2313
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

2·67·73 Posts

Quote:
Originally Posted by Madpoo
Now... why is this? Is it a function of not enough people signed up to get the smallest exponent (and/or not meeting the other requirements like reliability and days-of-work in their settings)?
I suspect this is it -- there are many capable machines owned by people who didn't click the "promise" button, and thus are being given cat 3.

Unfortunately, some who _did_ click the button perhaps shouldn't have...

Quote:
Originally Posted by Madpoo
It wouldn't be a terrible idea to just automatically assign these cat 1 & 2 things to systems who meet certain criteria even if they haven't explicitly checked the box. If that happened I'd definitely exclude any systems that have had assignments expire in the past however many months.

I suppose if something like that worked well, we could just get rid of that "opt in" altogether.
I think this would be a really good idea, realizing there would be some coding work and testing involved...

It could be argued the whole "you have to promise" was experimental in the beginning, to avoid people being caught off guard and having their assignments be reassigned sooner than they were used to (at the time, effectively assignments /never/ expired!).

Perhaps now that things have settled to a relatively stable state, we can look at the ~30 Cat 1s being completed a day compared to ~170 Cat 3s and conclude that more Cat 1s should be assigned to those who haven't "promised".

Machine performance heuristics would be key. And, at the end of the day, if a few LLs get reassigned, the worst that happens is a few early DCs are done. The project doesn't lose any throughput.

Last fiddled with by chalsall on 2016-02-25 at 19:57 Reason: s/by someone/by people/
Old 2016-02-25, 23:39   #2314
owftheevil
 
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²·5·7 Posts

How many cat 3 assignments are manual assignments?
Old 2016-02-26, 16:42   #2315
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

2·67·73 Posts

Quote:
Originally Posted by owftheevil
How many cat 3 assignments are manual assignments?
That would be a /very/ interesting statistic. I would also be interested in knowing how many manual assignments are in the cat 4 range, particularly true "anonymous" assignments.

To put on the record (again), I still think it is silly that some random person can surf by the Primenet website and reserve many (sometimes MANY!) candidates that almost certainly will never even be started, let alone completed, but are then held up for 180 days.
Old 2016-02-26, 20:23   #2316
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

6361₈ Posts

Quote:
Originally Posted by owftheevil
How many cat 3 assignments are manual assignments?
Hmm... that would have been interesting; unfortunately, I don't see a way to know whether assignments were done through the manual assignments page or by a client automatically.

On results there's a way to tell, but not assignments.

EDIT: Oh wait, there is a way to tell, kind of... the computer that got the assignment will always have a name of "Manual Assignments"... So, okay, there is a way, just an extra join on my lookup. Stay tuned, I'll see what I can find.

Okay... (for first time LL work):
Originally cat 3 assignments = 7,893 of which 214 were manually assigned (none from truly anonymous users)
Originally cat 4 assignments = 24,178 of which 6244 were manually assigned (2399 of those from truly anonymous users)

Last fiddled with by Madpoo on 2016-02-26 at 20:38
Old 2016-02-26, 20:32   #2317
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

2×67×73 Posts

Quote:
Originally Posted by Madpoo
Hmm... that would have been interesting, unfortunately I don't see a way to know whether assignments were done through the manual assignments page or by a client automatically.
How do the public reports know to show "Computer: Manual testing" under the Userid column popup for some (but not all) Anonymous (and other) users?

Wouldn't the same conditional apply? And I think owftheevil and I are talking about currently assigned, not completed.

Edit: Whoops. Cross-post.

Last fiddled with by chalsall on 2016-02-26 at 20:33
Old 2016-02-27, 03:24   #2318
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

3313₁₀ Posts

By the way, this is some inside-baseball stuff (sorry, stupid American term), but I'm just shaking off the dust from my brain after looking into the whole "confidence level" and "reliability index" that Primenet tracks for each CPU.

Before I lose that train of thought (and since I'll likely forget it), I'll "archive" the knowledge here...

Basically, Primenet tracks two things for each CPU to help determine how well it's doing: a reliability index between 0 and 1, and a confidence level between 0 and 10.

A new computer starts out with a confidence level of zero (because it hasn't done anything), but a reliability index of 1 (because we start out assuming it will only ever return good results).

When a result is returned, a temporary reliability value is set:
  • Matches a previously verified result = 1.0 (new result matches the already confirmed residue)
  • Does not match a previously verified result = 0.0 (mismatches the confirmed residue, so we know it's bad)
  • Unverified and "clean" = 0.98 (no critical errors reported during the run)
  • Unverified and "suspect" = 0.5 (had some error that makes it suspect)

This temporary reliability value is for just this one result and it's used to help adjust the overall reliability and confidence stats.

The new reliability is a function of:
(old value * confidence + temp reliability) / (confidence + 1)

If we take the example of a new system where the reliability is assumed to be 1, confidence is zero, and they just returned a clean unverified result, we plug those in as:
(1 * 0 + 0.98) / (0 + 1) = new overall reliability of 0.98

The confidence value gets adjusted as:
if confidence < 10 then confidence = confidence + 1

So... yeah, as near as I can tell, the confidence level goes up whenever you turn in a result, up to a max of 10, no matter what. Suspect, clean, verified, unverified, bad...
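The bookkeeping above can be sketched in a few lines of Python. This is a hypothetical reconstruction from the description in this post, not Primenet's actual server code; the function and dictionary names are my own.

```python
# Temporary per-result reliability values, as described above.
TEMP_RELIABILITY = {
    "verified_match": 1.0,      # matches an already confirmed residue
    "verified_mismatch": 0.0,   # contradicts a confirmed residue: known bad
    "unverified_clean": 0.98,   # no critical errors reported during the run
    "unverified_suspect": 0.5,  # ran with errors, so the result is suspect
}

def update_cpu_stats(reliability, confidence, outcome):
    """Blend one result's temporary reliability into the running value,
    weighted by the current confidence, then bump confidence (capped at 10)."""
    temp = TEMP_RELIABILITY[outcome]
    new_reliability = (reliability * confidence + temp) / (confidence + 1)
    new_confidence = min(confidence + 1, 10)
    return new_reliability, new_confidence

# New machine (reliability 1.0, confidence 0) returns a clean unverified
# result: update_cpu_stats(1.0, 0, "unverified_clean") -> (0.98, 1)
```

Note that the reliability update uses the confidence value from *before* the increment, which is what makes the very first clean result drop reliability straight to 0.98.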

Going back to our example, this user returns a second clean result, so we get:
(0.98 * 1 + 0.98) / (1 + 1) = new overall reliability of 0.98

So as long as they keep returning clean unverified stuff, they'll stay at 0.98 for their reliability... all good. And weird because the only way it can go higher is if you're doing triple-checks of already-verified results (hey, been there, done that). And yes, there are systems with a perfect 1.0 reliability which could only come from *only* doing triple-checks and always matching.

It's only if you start turning in suspect results or if you (why?) turn in a mismatched residue for an already verified result that you'll dip below 0.98. And yes, it happens.
e.g. M35498293

It's important to note that the reliability is apparently not affected when someone later turns in a result that confirms your old result as either good or bad (which would use a value of 1.0 or 0.0 in that formula if it did).

To get a preferred first time LL test, you need a confidence level of at least 2 (at least 2 results returned) and a reliability of at least 0.9.

In reality that's only going to weed out systems that haven't returned at least 2 results since the CPU entry was added; as long as they haven't had any suspect results (or have turned in enough clean results to offset any suspect ones), they'll qualify.

There is also a CPU speed requirement of 2 GHz or better, plus the already-known limit on how many days of work can be in your queue (<= 10 days), and the 2 results have to be recent.
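Putting the current rules together, the gate might look something like the following sketch. The thresholds come from the description above; the function and parameter names are invented for illustration, and the real check runs server-side.

```python
def preferred_first_time_ll(confidence, reliability, cpu_ghz, queue_days,
                            recent_results):
    """Current 'preferred first-time LL' criteria as described above:
    confidence >= 2, reliability >= 0.9, CPU speed >= 2 GHz, at most
    10 days of queued work, and at least 2 recent results."""
    return (confidence >= 2
            and reliability >= 0.9
            and cpu_ghz >= 2.0
            and queue_days <= 10
            and recent_results >= 2)
```

A machine with confidence 10, reliability 0.94 and a 3 GHz CPU passes, which matches the second example below: the current rules don't look at expired assignments or recent bad results at all.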

When I look at computers that meet all the normal criteria for getting first time checks, but also include how many expired assignments or how many bad results they've had in the past 120 days, it paints an interesting picture.

One system, for example, is still active (last seen 1 day ago), has 11 recent results, confidence level of 10, reliability of ~0.98, good CPU speed.

But by my additional metrics... they also have 3 expired assignments in that timeframe, and their CPU rolling average is a pretty low 667 (probably from running 4 workers on 4 cores, which it is safe to say is inefficient at a 4M FFT).

Another example... the system meets all the requirements. Reliability is a bit lower at 0.94, and that makes sense because in the past 120 days they've had 14 bad results, so I'd assume they turned in some suspect results in that time as well (that user has been doing double-checks, so no real risk right now to the LL leading edge).

Basically, of the 728 CPUs (not users... CPUs) that could get cat 1 LL right now, I'd exclude 48 of them based on having recent expirations or recent bad results.

If I were to automatically opt in systems, there would be 5179 eligible under the existing rules, but weeding out the ones with recent expirations or bad results removes 294 from that.

But forget the current rules... I looked for machines that are even more awesome:
  • 3 results in the past 90 days (one a month)
  • 10 days in the work queue (same as now)
  • Rolling average of 1500 or higher (no strangely slow systems)
  • No expired work in the past 90 days
  • No (known) bad results in the past 90 days (doesn't help if they're fast but buggy)
  • Seen in the past 30 days (still active... this is just for estimating the pool of eligible systems)
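As a sketch, that stricter filter could be written as a predicate over per-CPU stats. The dictionary field names here are invented for illustration; the actual query would run in SQL against Primenet's tables.

```python
def meets_stricter_criteria(cpu):
    """Proposed tighter eligibility rules from the bullet list above."""
    return (cpu["results_last_90d"] >= 3        # roughly one result a month
            and cpu["queue_days"] <= 10         # same work-queue limit as now
            and cpu["rolling_avg"] >= 1500      # no strangely slow systems
            and cpu["expired_last_90d"] == 0    # no recently abandoned work
            and cpu["bad_results_last_90d"] == 0
            and cpu["days_since_seen"] <= 30)   # still active

# Example: a healthy machine passes; the same machine with the
# rolling average of 667 from the earlier example does not.
good_cpu = {"results_last_90d": 4, "queue_days": 8, "rolling_avg": 1600,
            "expired_last_90d": 0, "bad_results_last_90d": 0,
            "days_since_seen": 1}
```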

So I didn't even look at the confidence or reliability values for this... I found it better to estimate based on the past 90 days worth since the reliability doesn't take into account any results that were later proven one way or the other.

With those rules I've whittled it down to 595 CPUs, but these are systems that I could be fairly sure will crank away at the work and not let it hang.

If I bump it to 120 days of history and only 2 results in that time, it's 1069 cpus. Or if I then also set the min rolling average to 1000, the pool grows to 2033 systems.

In my opinion, rolling averages below 1000 indicate a system that really isn't running optimally. Just my 2 cents.

So... that's my investigation in a very large nutshell and I wanted to share the results.
Old 2016-02-27, 15:31   #2319
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

2×67×73 Posts

Quote:
Originally Posted by Madpoo
So... that's my investigation in a very large nutshell and I wanted to share the results.
_Very_ interesting! Thanks for the look under the hood.

Your various heuristic metrics, as applied to the current empirical dataset, are also informative. Clearly Cat 1 could easily be more appropriately serviced if one of your metrics were applied to the assignment code.

Edit: Just thought of something... Perhaps implement something like the "3 results in 90, rolling average of 1500" metric for Cat 1 and the "2 results in 120, average of 1500" metric for Cat 2? Handy because the rolling average variable is easy to tweak over time.

Any internal discussion as to whether this will be implemented?

Last fiddled with by chalsall on 2016-02-27 at 15:46
Old 2016-02-27, 16:57   #2320
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

6361₈ Posts

Quote:
Originally Posted by chalsall
Any internal discussion as to whether this will be implemented?
Right now this is just me taking a look to see what the options are. If it seems like these new criteria would be useful, my next step would be to just create a SQL function to spit out a result on whether the CPU is good for cat 1, cat 2, etc. Then that could be tied back into the assignment stuff as either supplemental or replacing the part that lets it know if it's eligible for preferred assignments.
Old 2016-02-27, 17:04   #2321
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts

Quote:
Originally Posted by chalsall
_Very_ interesting! Thanks for the look under the hood.

.....

Any internal discussion as to if this will be implemented?
No real discussion here. I just want to applaud the most thorough explanation of certain metrics in P95 that I have ever seen.

Thanks, Aaron!