mersenneforum.org  

Old 2010-09-29, 12:29   #23
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
Originally Posted by Rhyled View Post
Sigh - because my denominator was only 3 out of the 4 categories. Stupid mistake.
Oh, I see. Now our numbers (for GHz-Days) match, as I'd expect.
Quote:
Originally Posted by Mini-Geek View Post
The deltas aren't being parsed for me. I didn't really care to try to use them, but I wasn't expecting it to be blank.
I noticed why, and it's stupidly simple: the deltas are captured in $6 and saved to $Deltas, but the output line only goes up to $5. So they're basically being ignored on purpose.
I also noticed that a lot of the code you included, while useful if you're planning to use the data in more Perl code, was useless and unused for me. Here's an updated version of my modification:
Code:
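   use strict;
   use warnings;

   # Reads <name>.txt and writes <name>.csv, where <name> is the first argument.
   open(my $in,  '<', "$ARGV[0].txt") or die "Cannot read $ARGV[0].txt: $!";
   open(my $out, '>', "$ARGV[0].csv") or die "Cannot write $ARGV[0].csv: $!";
   print $out "Rank,Name,GHz-Days,Attempts,Successes\n";
   my $i = 0;   # number of lines matched so far
   while (<$in>) {
      # $1 rank, $2 name, $3 GHz-Days, $4 attempts, $5 successes; $6 (deltas) is not written
      if (/^\s*(\d*)\s*(.*)\s+(\d+\.\d*)\s*(\d*)\s*(\d*)\s*\|(.*)/) {
         print $out "$1,\"$2\",$3,$4,$5\n";
         $i++;
         print "on line $i\n" if $i % 500 == 0;   # progress message
      }
   }
   close($out) or die "Cannot close $ARGV[0].csv: $!";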
I decided to exclude the Deltas completely, including their column header. Same usage as before.
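To make the capture groups concrete, here is a minimal sketch that runs the same regex over a single line. The sample line and the `parse_line` helper are made up for illustration; real report lines may be laid out differently.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the regex from the script above to one line and
# return the six captures as a list, or an empty list if the line does not match.
sub parse_line {
    my ($line) = @_;
    if ($line =~ /^\s*(\d*)\s*(.*)\s+(\d+\.\d*)\s*(\d*)\s*(\d*)\s*\|(.*)/) {
        return ($1, $2, $3, $4, $5, $6);
    }
    return;
}

# Made-up sample line, not copied from a real report.
my $sample = "  12 Example User 345.678 90 4 | 1 -2";
my ($rank, $name, $ghz, $attempts, $successes, $deltas) = parse_line($sample);

# $deltas holds the sixth capture but is deliberately left out of the CSV row.
print "$rank,\"$name\",$ghz,$attempts,$successes\n";
```

Because the name group is greedy, it runs up to the last whitespace that is followed by a number containing a decimal point, which is how names with spaces stay in one field.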
Old 2010-09-29, 13:56   #24
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2×67×73 Posts

Quote:
Originally Posted by Mini-Geek View Post
I also noticed that a lot of the code you included, while useful if you're planning to use the data in more Perl code, was useless and unused for me.
Yes, as I mentioned in my post. I left them in as they're useful if you're going to process the data further in the script, and I thought it was also a good way of documenting what the regex extracted into which temporary variables.

Also, you'd correctly commented that this doesn't work on the "Totals Overall" report. For anyone who's interested, here's code for that report:

Code:
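   # Matches one line of the "Totals Overall" report.
   if (/^\s*(\d+)\s*(.*)\s+(\d+\.\d*)\s*\|(.*)\|(.*)$/) {
      $Rank = $1;          # position in the ranking
      $Name = $2;          # user name
      $GHzDays = $3;       # total GHz-Days
      $Deltas = $4;        # text between the first and last '|'
      $Percentages = $5;   # text after the last '|'
   }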
Note that the $Percentages variable still needs to be broken down into the six possible values.
Old 2010-09-30, 17:32   #25
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2×67×73 Posts

Quote:
Originally Posted by Mini-Geek View Post
Oddly, I found a relatively small, but still significant, discrepancy between the summed GHz-Days for the Overall report vs the sums of the individual reports. The individual reports sum to 7812823.76 but the overall report sums to 7868772.93, a difference of 55949.17 GHz-Days. That's about 0.7%. Anyone have a guess as to the reason?
A thought just came to me, which might explain this...

Did you run your analysis on the full dataset for each work type (and overall), or only on the reports' default top 1000? For example, setting "End Rank" = 10000 under "Customize" returns 6705 records with GHzDays > 0.000 for the overall report as of right now.

If the latter, this might explain what you observed. If the former, I have no idea....
Old 2010-09-30, 19:41   #26
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

1000010101011₂ Posts

Quote:
Originally Posted by chalsall View Post
A thought just came to me, which might explain this...

Did you run your analysis on the full dataset for each work type (and overall), or only on the reports' default top 1000? For example, setting "End Rank" = 10000 under "Customize" returns 6705 records with GHzDays > 0.000 for the overall report as of right now.

If the latter, this might explain what you observed. If the former, I have no idea....
It was with all results, which is the default before you click Customize. When you click Customize, it changes to 1000.

In checking that out, I just noticed the reason for the difference: I only took from the given links under Top Producers, but there's another category, visible under Customize: ECM on Fermat numbers! I guess I figured the ECM link included both (or just forgot about ECM on Fermat), but it specifically says "ECM on small Mersenne numbers". When you click Customize, you get the option to see ECM on Fermat numbers.

I'd have to rerun all the numbers to get a perfect record, but the current GHz-Days for the last year of ECM on Fermat numbers is 56187.37. That's just 238.2 more than the 55949.17 discrepancy from the last time I ran the report, which can probably be attributed to the recent work done. So I'd say it's almost certainly the only significant cause of the difference I observed.
So ECM on Fermat is about 0.71% of the total GHz-Days, which is just a little less than ECM on Mersenne.
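The arithmetic behind that conclusion can be checked in a few lines. This is only a sketch using the figures quoted in this thread; the variable names and the `pct_share` helper are mine.

```perl
use strict;
use warnings;

# Figures quoted in this thread (GHz-Days).
my $sum_of_individual = 7812823.76;   # individual reports summed
my $overall           = 7868772.93;   # overall report
my $ecm_fermat_now    = 56187.37;     # ECM on Fermat numbers, current value

# Hypothetical helper: $part as a percentage of $total.
sub pct_share {
    my ($part, $total) = @_;
    return 100 * $part / $total;
}

my $gap         = $overall - $sum_of_individual;   # discrepancy between the sums
my $unaccounted = $ecm_fermat_now - $gap;          # part ECM-F does not explain
my $share       = pct_share($ecm_fermat_now, $overall);

printf "gap: %.2f\n", $gap;
printf "ECM-F minus gap: %.2f\n", $unaccounted;
printf "ECM-F share of total: %.2f%%\n", $share;
```

This reproduces the 55949.17 gap, the 238.2 left over, and the roughly 0.71% share mentioned above.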

Last fiddled with by Mini-Geek on 2010-09-30 at 20:03
Old 2010-09-30, 20:04   #27
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2·67·73 Posts

Mini-Geek: "It was with all results, which is the default before you click Customize."

I'm not entirely sure you are correct here.

For empirical evidence, do all of the default queries provide more than 1000 records (other than, perhaps, ECM-F, which provides the full donation in less than 1000 records)?

If they don't provide more than 1000 records, then your claim you're working from the full data sets is clearly false.

Last fiddled with by chalsall on 2010-09-30 at 20:15
Old 2010-09-30, 20:08   #28
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

4267₁₀ Posts

Quote:
Originally Posted by chalsall View Post
Mini-Geek: "It was with all results, which is the default before you click Customize."

I'm not entirely sure you are correct here.

For empirical evidence, do all of the data sets provide more than 1000 records (other than, perhaps, ECM-F)?

If they don't provide more than 1000 records, then your claim you're working from the full data sets is clearly false.
None contain exactly 1000; most contain more, some fewer. I'm quite sure it's not limited to 1000, or any other obvious number. Here are the counts, just to clarify/verify. They come from line counts of the text files, so they equal the number of users, not the rank at which all the users with 0 credit tie:
All: 7629
P-1: 4430
TF: 4048
LL: 3066
DC: 2394
ECM: 458
ECM-F: 139
As you can see, only the two ECMs have under 1001 people. With the now-marginal difference, I'm pretty darn sure there's nothing else being missed.
Now that all the reports are showing as the 7:00 PM report, I can recalculate. I'll do that now and either edit or post; hopefully I'll now see exactly 0 unaccounted for.

Last fiddled with by Mini-Geek on 2010-09-30 at 20:12
Old 2010-09-30, 20:29   #29
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
Originally Posted by Mini-Geek View Post
Now that all the reports are showing as the 7:00 PM report, I can recalculate. I'll do that now and either edit or post; hopefully I'll now see exactly 0 unaccounted for.
Well, not exactly 0, but plenty close enough for my purposes: 0.03 GHz-Days apart this time! (7870898.66 in my sum vs 7870898.63 in the total report)

Here are the new GHz-Days percentages (out of all GIMPS work):
P-1: 3.68%
LL: 74.04%
DC: 11.08%
TF: 9.74%
ECM: 0.76%
ECM-F: 0.71%

And Attempts percentages (out of all GIMPS work):
P-1: 0.25%
LL: 0.14%
DC: 0.09%
TF: 98.29%
ECM: 1.19%
ECM-F: 0.04%

And something new: the ratio of Successes to Attempts within each category. This has a different meaning for each category, but it's still fun to compare.
P-1: 4.70%
LL: 0.00%
DC: 93.39%
TF: 2.32%
ECM: 0.59%
ECM-F: 0.01%

And just for the record: none of these categories 'just happened' to have 1000 results. They're all the full rankings. Also, this was all based off of the Sep 30, 7:00 PM hourly report.
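For anyone wanting to redo these sums, the totals and ratios could be computed from the CSV files along these lines. This is a rough sketch, not the exact method used for the figures above: the `sum_report` and `percent` helpers are hypothetical, and the rows are made-up stand-ins for one category's file.

```perl
use strict;
use warnings;

# Hypothetical helper: total the GHz-Days, Attempts and Successes columns of a
# CSV written as Rank,"Name",GHz-Days,Attempts,Successes. The three numeric
# fields are matched from the end of the line, so commas inside a quoted
# name do not shift the columns. The header line does not match and is skipped.
sub sum_report {
    my ($fh) = @_;
    my %total = (ghz => 0, attempts => 0, successes => 0);
    while (my $line = <$fh>) {
        next unless $line =~ /,(\d+\.\d*),(\d*),(\d*)\s*$/;
        $total{ghz}       += $1;
        $total{attempts}  += $2 || 0;   # empty field counts as 0
        $total{successes} += $3 || 0;
    }
    return \%total;
}

# Percentage of $part in $whole, returning 0 for an empty report.
sub percent {
    my ($part, $whole) = @_;
    return $whole ? 100 * $part / $whole : 0;
}

# Made-up rows standing in for one category's CSV file.
my $csv = <<'END';
Rank,Name,GHz-Days,Attempts,Successes
1,"Example, User",300.500,40,2
2,"Another User",99.500,10,0
END
open(my $fh, '<', \$csv) or die "Cannot open in-memory CSV: $!";
my $t = sum_report($fh);
close($fh);

printf "GHz-Days %.3f, success ratio %.2f%%\n",
    $t->{ghz}, percent($t->{successes}, $t->{attempts});
```

Running `sum_report` once per category file and once on the overall file gives the totals to compare, and `percent` of each category total against the overall total gives the share of all GIMPS work.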

Last fiddled with by Mini-Geek on 2010-09-30 at 20:30
Old 2010-09-30, 22:57   #30
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

10011000110110₂ Posts

Quote:
Originally Posted by Mini-Geek View Post
Well, not exactly 0, but plenty close enough for my purposes: 0.03 GHz-Days apart this time! (7870898.66 in my sum vs 7870898.63 in the total report)
Thanks for your work here, Mini-Geek. It fully answers a question many have had.

The minor difference you've found working on the full publicly available dataset is probably explained by the fact that PrimeNet rounds all individual records to 0.001 GHzDays.
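A rough bound suggests that rounding alone is more than enough to account for 0.03 GHz-Days. The sketch below assumes each record is rounded to the nearest 0.001, so it is off by at most 0.0005. That rounding mode is an assumption, not something confirmed in this thread. It reuses the user counts posted earlier, which were taken shortly before the 7:00 PM report.

```perl
use strict;
use warnings;

# User counts per report, as posted earlier in this thread.
my %records = (
    'P-1' => 4430, 'TF'  => 4048, 'LL'    => 3066,
    'DC'  => 2394, 'ECM' => 458,  'ECM-F' => 139,
);
my $overall_records = 7629;

# Assumption: round-to-nearest at 0.001, i.e. at most 0.0005 error per record.
my $max_error_per_record = 0.0005;

# Worst-case accumulated rounding error for a report with $n records.
sub worst_case {
    my ($n) = @_;
    return $n * $max_error_per_record;
}

# Total number of records across the six category reports.
my $category_records = 0;
$category_records += $_ for values %records;

# Both sides of the comparison carry rounding error, so add the two bounds.
my $bound = worst_case($category_records) + worst_case($overall_records);
printf "records summed: %d, worst-case drift: %.3f GHz-Days\n",
    $category_records, $bound;
```

Under these assumptions the worst case is about 11 GHz-Days, so an observed 0.03 is comfortably within what rounding could produce.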

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.