2009-01-15, 04:10  #12
May 2007
Kansas; USA
5·2,017 Posts 
Max,
David and I are having an email discussion about a formula for properly crediting results and primes. The formula that Karsten used is a decent one, but I want one that is more "generalized" and isn't specific to an n=333333 prime. David, I'll get back with you later tonight or on Thurs. I'll come up with something that everyone can live with and that will properly score CPU time taken. I've been offline most of the day replacing my mobo, and Wednesdays are my night with my kids during the week.
This is not to say that we can't keep the # of total results and primes AND have a total score too. You can have separate categories/listings. The top5000 site does that. Even if you're in the top 5 on primes, frequently you're not in the top 20-30 in score because your primes are a lot smaller than those of the people who have found gargantuan primes. There's nothing wrong with that. It gives the folks with smaller resources a shot at moving high on the list for # of primes.
Gary
Last fiddled with by gd_barnes on 2009-01-15 at 04:14
2009-01-15, 04:15  #13
May 2007
Kansas; USA
5×2,017 Posts 

2009-01-15, 04:40  #14
I ♥ BOINC!
Oct 2002
Glendale, AZ. (USA)
3×7×53 Posts 
My goal is to add another column of stats that contains the score, in addition to everything else you already see when you visit your nplb database.
... And ... only if everyone wants it. I have no vested interest in how you guys want your database to work. My initial goal was to get the database up and running, and I have achieved that. With AMDave's help with everything, you have what you see, and what you see is accurate to the best of our knowledge.
We are trying to add some nice stats features, as we feel any database should let you view as much or as little of the data as you want. It is entirely up to all of you to decide what you would like to see and how the data can be presented to help you with what you do behind the scenes, like Karsten for example. Is there a certain display that would make your job easier?
Anyway, it's your stuff, you all decide... I'm just passing through and making good on my promise to take over the original AES database and make it better and more accurate. I've reached that goal completely. I've also been able to set up a super reliable llrnet server for you guys to pound on.
/me toots own horn and walks away
Thanks to AMDave for all his hard work that made it all come together for you fine folks too.
/tips hat
Last fiddled with by IronBits on 2009-01-15 at 04:42
2009-01-15, 05:34  #15
May 2007
Kansas; USA
2765_{16} Posts 
You have done an unbelievable job for us David and I think I can speak for all of us in saying: THANK YOU!!
I think everyone will agree that a "score" column is needed. If anyone thinks that it isn't, please speak now.
I need to go finish working on getting my final machine running now...may need to reload the O.S. due to a slightly different mobo. After that (likely a couple of hours), I'll post an official formula for you to use for the scoring. As indicated in the email to you, it will be a little different for results than it is for primes.
After that, I'll get the whole k=341 situation sorted out, move a quad over to sieve Nugget's range in the sieving drive that he never sent me factors for (ARGH!!), and then, I'll be able to rest easier. lol
Gary
Last fiddled with by gd_barnes on 2009-01-15 at 05:35
2009-01-15, 10:50  #16
May 2007
Kansas; USA
10011101100101_{2} Posts 
I have thought all along that the top5000 site's method, which Karsten used as a basis for scoring our earlier drives, was WAY WAY too complicated! Scoring can be accomplished in a far easier way without logarithms AND still be completely fair.
Many of you may know the following:
The CPU time to process a k/n pair varies with the SQUARE of the n-value; that is, if n=100K takes 15 secs. to process, n=200K will take ~60 secs. and n=400K will take ~240 secs. Put more simply, a k/n pair at n=400K will take 240 / 15 = 16 times as long to process as a pair at n=100K because the n-value is 4 times as large and 4^2=16.
The CPU time to find a prime varies with the CUBE of the n-value; that is, if a prime at n=100K takes 20 CPU hours to find, a prime at n=200K will take 160 CPU hours to find, and a prime at n=400K will take 1280 CPU hours to find. Put more simply, a prime at n=400K will take 1280 / 20 = 64 times as long to find as a prime at n=100K because the n-value is 4 times as large and...4^3=64.
For those of you who didn't realize it before, now you know why you were able to find primes so quickly and in such bunches on this drive.
Now to the much improved easy formulas:
Score results in the following manner: n-value^2 / 1e10
Score primes in the following manner: n-value^3 / 1e15
If the notation confuses some people, 1e10 = 10^10 and 1e15 = 10^15.
It's as simple as that. David, when implementing this, please make sure there are plenty of digits of internal accuracy. For instance, for a prime at n=800000, the calculation will internally calculate as 800000^3/1e15=5.12e17/1e15=512. Also, please make sure that the scoring field itself is quite large on the results side. If someone has 1 million results and their average test is at n=500K, they will have 25 million "results" points. I would suggest at least 12 full digits for the field (goes to 999 billion) and possibly 15 (goes to 999 trillion) for results. For primes, I think it could be 3 digits less.
This is so simple, it's child's play. A prime at n=100K scores 1 point, at n=200K scores 8 points, n=400K 64 points, and n=800K 512 points. A result at n=100K scores 1 point, at n=200K scores 4 points, n=400K 16 points, and n=800K 64 points.
If people object to such high total scores, we can always increase the divisor on the above, but we certainly don't need to complicate things with logarithms.
Karsten, Max, Mini, Ian, and any other folks who like messing with math, please state here if you disagree with this in any manner.
David, let's give it a day for any objections. If there are none, please proceed with adding a column to both your results and primes stats for this scoring method. I cannot think of a simpler way to fairly score things.
Thanks,
Gary
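For illustration only, the two formulas in this post can be sketched in Python as below. The function names `result_score` and `prime_score` are made up for this sketch and are not part of the actual database code; exact fractions are used so that no precision is lost in the division.

```python
from fractions import Fraction

RESULT_DIVISOR = 10**10  # 1e10, the divisor proposed for results
PRIME_DIVISOR = 10**15   # 1e15, the divisor proposed for primes


def result_score(n: int) -> Fraction:
    """Score for one returned result at exponent n: n^2 / 1e10."""
    return Fraction(n**2, RESULT_DIVISOR)


def prime_score(n: int) -> Fraction:
    """Score for one prime found at exponent n: n^3 / 1e15."""
    return Fraction(n**3, PRIME_DIVISOR)


# Print the result score and prime score at a few sample n-values.
for n in (100_000, 200_000, 400_000, 800_000):
    print(n, float(result_score(n)), float(prime_score(n)))
```

Doubling n multiplies the result score by 4 and the prime score by 8, which mirrors the square and cube relationships described in the post.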
2009-01-15, 16:50  #17
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
1000010101011_{2} Posts 
Sounds fine to me, but (and perhaps this is intended) a side effect of scoring this way is that in high n-ranges, luck will have a very large effect on your overall score. Let's consider for a moment that if a very lucky new person has his first result return prime at n=1.6M, they will receive 4096 points, while a normal result at that size would be 256, giving them the equivalent credit for 4096/256=16 results. Do we want to give such a preference in scores to blind chance? (On thinking about this a bit more, I don't think it's too big of a deal, but I still want to bring the topic up.)
The only thing I can think of to fix this would be to score primes no differently from other results. (This would work just fine, but wouldn't give a nice bonus for actually finding what we're searching for.)
Maybe in the future, scores for manual results and sieving could be counted as well. Maybe count sieving by the size of the number sieved out and/or the size of the factor. Manual results/primes would, of course, be counted the same way as LLRnet results/primes.
When somebody finds a prime at e.g. 1.6M (to reuse numbers from before), do they get 4096 points only or 4096+256=4352 for returning a result that happened to be prime? (i.e. is the normal result score given in addition to the bonus for finding a prime?)
Also keep in mind that with just squarings/cubings to figure the score, the units will, in the future, inflate to very large amounts, and the work of today's computers will be a tiny, tiny drop in the bucket (as, indeed, it is)...do we want to try to do something to give higher credit the earlier the work happened, or simply let old work's credit shrink to a negligible amount?
Edit: In regard to henry's post below mine, I think n=100K would be fine because it's a nice round number, and n=400K will seem as out-of-our-range/tiny as n=100K does now within a few years, and changing the value would make our algorithm more like top5000's where instead of absolutes we have constantly (ok, monthly) changing scores.
This reminds me of something else: when a result is returned, does the credit given round to an integer and then get stored as that, or does it continue out as a float, get added to the rest of the float, and then be rounded to the nearest integer only for display? It all being float in the backend would be much better for rounding accuracy.
Last fiddled with by MiniGeek on 2009-01-15 at 17:03
2009-01-15, 16:51  #18
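The rounding question at the end of this post can be made concrete with a small Python sketch. It is only an illustration under assumed inputs: the workload of 1,000 results at n=350K is hypothetical, and `result_score` is a made-up helper that applies the proposed n^2 / 1e10 formula.

```python
def result_score(n: int) -> float:
    """Result score from the proposed formula n^2 / 1e10 (hypothetical helper)."""
    return n**2 / 1e10


# Hypothetical workload: 1,000 results, all at n=350K (12.25 points each).
n_values = [350_000] * 1_000

# Option A: round each result's credit to an integer before storing it.
rounded_each = sum(round(result_score(n)) for n in n_values)

# Option B: keep full precision in the backend and round only for display.
rounded_at_end = round(sum(result_score(n) for n in n_values))

print(rounded_each, rounded_at_end)
```

With these inputs, option A loses the 0.25 fraction on every result, so the two totals drift apart as more results come in; keeping the fractional part until display avoids that.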
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)
2·3·941 Posts 
very good
one thing though: what do we want 1 to be? what you said would put it at n=100k, but we don't do many tests anywhere near n=100k, so it is hard to pin a meaning to the value. i would say we should possibly have it as something recognizable like n=400k, or maybe the number of digits of the lowest top 5000 prime, updated monthly
2009-01-15, 17:09  #19
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts 
Quote:


2009-01-15, 17:13  #20
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts 
Quote:
I'm thinking that, rather than having an n=100K k/n pair be worth 1 point, we should have our 1-point base value be at 400K. That's a little closer to the general range in which we do most of our work, so the scores should be a bit more manageable.
Max

2009-01-15, 17:27  #21
May 2008
Wilmington, DE
2×13×109 Posts 
Quote:


2009-01-16, 01:12  #22
May 2007
Kansas; USA
5×2,017 Posts 
Ref. Mini:
As Max said, the results and primes will be scored separately. The results scoring will technically be a more accurate reflection of total CPU effort expended because there is no luck involved. For the element of luck on huge primes, see the top5000 site. It's a virtual impossibility to move into the top 15 on score unless you get lucky, even with 100-200 cores. There is an exception or two for institutions running 200-500 cores at all times, but for the most part, I seriously doubt that most of the people in the top 15 are running that many because most only have 1 monster prime, hence got lucky for the # of cores that they are running. That's the nature of primes and is why we will have separate scores for results and primes.
Ref. all about making the score 1 for an n=400K prime:
That sounds good to me. The only reason I used 100K initially is because we'll get into teeny fractions of a point for the lower n-ranges. But...as you guys said, those efforts here are not the norm.
Revised formulas:
Results: n^2 / 160e9
Primes: n^3 / 64e15
Therefore a prime at n=100K will score 1/64th (.015625) of a point and a prime at n=50K will score 1/512th (.001953) of a point. [lol] A result at n=100K will score 1/16th (.0625) of a point and a result at n=50K will score 1/64th (.015625) of a point.
...and of course a result or a prime at n=400K will score 1 point.
David, the column will need to DISPLAY 12 digits to the left of the decimal point and 3 digits to the right, i.e. 999,999,999,999.999. As for internally storing (not displaying) the score of each result/prime, we probably need it out to 6 decimal places. This should not be a problem because each result/prime will almost always have a score of < 16 (an n=1M prime will score 15.625), so each could be internally stored in 999.999999 format. But when making that calculation on each result/prime, the numbers are large, i.e. an n=1M prime would be 1e6^3/64e15, or a 19-digit number divided by a 17-digit number. The division needs to be done so it is accurate to 6 decimal places. [With the scores much lower, 9 digits to the left of the decimal might be sufficient for displaying folks' total scores, but we may as well plan for the long term!]
And finally: Combining the 2 scores would not make sense. Using the above, the results scores will be so much higher that they will overwhelm the primes scores so as to make such a combined score virtually meaningless. For example: If there is a 1 in 5000 chance of an n=400K result being prime, on average, a person will score 5000 results points and 1 prime point. Hence they must be kept separate.
Gary
Last fiddled with by gd_barnes on 2009-01-16 at 01:20
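As a rough sketch of the revised formulas and the precision requirements in this post, the Python below uses the standard `decimal` module to store each score to 6 decimal places and to display a total with 3. The function names, the half-up rounding mode, and the sample total are assumptions made for illustration; the post does not specify how the database performs the division or rounding.

```python
from decimal import Decimal, ROUND_HALF_UP

RESULT_DIVISOR = 160 * 10**9  # 160e9, so an n=400K result scores 1 point
PRIME_DIVISOR = 64 * 10**15   # 64e15, so an n=400K prime scores 1 point
SIX_PLACES = Decimal("0.000001")  # internal storage precision from the post


def result_score(n: int) -> Decimal:
    """Revised result score n^2 / 160e9, kept to 6 decimal places."""
    raw = Decimal(n**2) / Decimal(RESULT_DIVISOR)
    return raw.quantize(SIX_PLACES, rounding=ROUND_HALF_UP)  # rounding mode is an assumption


def prime_score(n: int) -> Decimal:
    """Revised prime score n^3 / 64e15, kept to 6 decimal places."""
    raw = Decimal(n**3) / Decimal(PRIME_DIVISOR)
    return raw.quantize(SIX_PLACES, rounding=ROUND_HALF_UP)  # rounding mode is an assumption


# Hypothetical user: 5000 results at n=400K, one of which was prime.
results_total = sum(result_score(400_000) for _ in range(5000))
primes_total = prime_score(400_000)

# Display with thousands separators and 3 decimal places, as requested for the column.
print(format(results_total, ",.3f"), format(primes_total, ",.3f"))
```

The hypothetical user ends up with 5000 results points against 1 prime point, which is the imbalance the post gives as the reason for keeping the two scores in separate columns.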