mersenneforum.org > New To GIMPS? Start Here! > Information & Answers
Old 2010-09-27, 04:33   #12
Rodrigo
 
 
Jun 2010
Pennsylvania

1110100110₂ Posts
Default My own avatar!

Hey, I posted my reply to Mini-Geek and saw that I've received my very own avatar!

My sincere Thank You to the forum gods.

Rodrigo
Old 2010-09-27, 04:45   #13
Rodrigo
 
 
Jun 2010
Pennsylvania

2·467 Posts
Default

Quote:
Originally Posted by Rhyled View Post
When I dug into it, I noticed that first-time primality tests get the most GHz-days, while trial factoring gets the most results. No great surprises there. Here's the current distribution of assignments, and how many GHz-days were spent over the last year per category:
Code:
             TF    P-1   LL    LL-D  (Data retrieved 9-26-10)
GHz-Days     11%   4%    83%   12%
Assignments  57%   1%    31%   12%
Rhyled,

Very cool -- where did you find this information, or how did you derive it?

Quote:
Originally Posted by Rhyled View Post
I like P-1 work because it puts my i7 processor to serious use, and still gives me results every day and a half or so. I got so tired of waiting 3 weeks just to see "LL done - not a prime". At least this way I get some timely feedback and the occasional semi-success (a factor found).
I know what you're saying. If you're doing mostly LLs like I am, you're slowly drifting down in the rankings for several weeks, and then you suddenly leap up by hundreds of places.

Not that I planned it this way when I set them up, but my PCs have their assignments distributed such that over time the results will come in at a more steady rate. My little P233 notebook, doing TF work, gets me a little morsel of GHz credit on a daily basis, which helps to keep me going in-between the big feasts when an LL test is done.

Rodrigo
Old 2010-09-28, 00:49   #14
Rhyled
 
 
May 2010

3²·7 Posts
Default Citing Sources

Quote:
Originally Posted by Rhyled View Post
Code:
             TF    P-1   LL    LL-D  (Data retrieved 9-26-10)
GHz-Days     11%   4%    83%   12%
Assignments  57%   1%    31%   12%
I knew I should have cited my sources. The GHz-Days statistics come from the Top Producers - Totals Overall report. I backed out the GHz-Days for each type of work for each user, summed them up and normalized them. Spreadsheets are marvelous tools.

For the Assignment breakdown, I pulled the report from PrimeNet Summary - Work Distribution Map, added up the active assignments and renormalized those. The time frames of the two rows of statistics don't match up (one covers a year, the other is a snapshot), but I feel that the average work distribution won't vary significantly over the year. Besides, it's the only data I can get at easily without running a report for each type of data.
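The sum-and-normalize step can be sketched in a few lines (Python rather than a spreadsheet; the user rows and numbers below are invented, not the real report data):

```python
# Sum GHz-days per work type across users, then normalize to percentages.
# Each row is (user, category, ghz_days) as backed out of the report.
rows = [
    ("alice", "LL", 120.0),
    ("alice", "TF", 15.0),
    ("bob", "P-1", 6.0),
    ("bob", "LL-D", 20.0),
]

totals = {}
for _user, category, ghz_days in rows:
    totals[category] = totals.get(category, 0.0) + ghz_days

grand_total = sum(totals.values())  # denominator must cover ALL categories
percentages = {c: 100.0 * v / grand_total for c, v in totals.items()}
print(percentages)
```

By construction the percentages sum to exactly 100%, which is a handy sanity check on the spreadsheet version.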

The spreadsheet is too large (1.5 MB) for the forum's 244 KB upload limit but I can email it to you if you wish. Just send me a private message with your email address if you've got a really bad case of insomnia you wish to cure.
Old 2010-09-28, 03:38   #15
Rodrigo
 
 
Jun 2010
Pennsylvania

2×467 Posts
Default

Quote:
Originally Posted by Rhyled View Post
The GHz-Days statistics come from the Top Producers - Totals Overall report. I backed out the GHz-Days for each type of work for each user, summed them up and normalized them. Spreadsheets are marvelous tools.
Rhyled,

Thanks for offering to send the spreadsheet, but it won't be necessary. I was mainly curious to learn how you came up with the figures, and you explained that well.

Chance to learn here -- Did you have to input the numbers yourself from those long columns of data, or is there an automated way to convert them into something that a spreadsheet program will read?

Rodrigo

Last fiddled with by Rodrigo on 2010-09-28 at 03:39
Old 2010-09-28, 03:47   #16
cheesehead
 
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

2²×3×641 Posts
Default

Quote:
Originally Posted by Rhyled View Post
Code:
             TF    P-1   LL    LL-D  (Data retrieved 9-26-10)
GHz-Days     11%   4%    83%   12%
Assignments  57%   1%    31%   12%
Not to my surprise, LL work gets the most CPU time, but trial factoring has the bulk of the assignments.
P-1 should actually account for more than 1% of the assignments because it's often done as the first part of an LL assignment that hadn't yet had the default P-1 performed. The 1% for P-1-only assignments doesn't include those, but the GHz-Days figure is based on P-1 result reports which include those generated from LL assignments. (That also explains P-1's misleadingly high 4-to-1 ratio of credit percentage to assignment percentage on that report.)
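A toy illustration of this effect (all counts invented, not PrimeNet data): credit accrues from every P-1 run, standalone or as the first step of an LL test, while the assignment column counts only the standalone runs.

```python
# Hypothetical counts, for illustration only.
p1_only = 100        # standalone P-1 assignments
ll_with_p1 = 900     # LL assignments whose first step ran the default P-1
total_assignments = 10_000

# The report's "Assignments" column sees only the standalone P-1 work...
p1_assignment_pct = 100 * p1_only / total_assignments

# ...but GHz-days credit is earned by every P-1 result, either kind.
p1_runs_earning_credit = p1_only + ll_with_p1

print(p1_assignment_pct, p1_runs_earning_credit)
```

With these made-up numbers, ten times as many P-1 runs earn credit as appear in the assignment column, which is how a small assignment share can carry an outsized credit share.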

Last fiddled with by cheesehead on 2010-09-28 at 03:53
Old 2010-09-28, 05:11   #17
S485122
 
 
"Jacob"
Sep 2006
Brussels, Belgium

3256₈ Posts
Default

Quote:
Originally Posted by Rodrigo View Post
Chance to learn here -- Did you have to input the numbers yourself from those long columns of data, or is there an automated way to convert them into something that a spreadsheet program will read? Rodrigo
Most spreadsheet programs have a feature to convert text to columns. In Excel 2003, for instance, you'll find it in the Data menu. There are also text-handling programs on *nix OSes that would do the trick. Another possibility is writing a program and using text-handling functions.
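The programmatic route can be sketched like this (Python rather than a *nix text tool; the report lines are invented): split each whitespace-separated line and emit CSV.

```python
import csv, io

# Illustrative whitespace-separated report text, as pasted from a page.
report = """\
1  SomeUser  1234.5  10  2
2  Another   987.6   8   1
"""

out = io.StringIO()
writer = csv.writer(out)
for line in report.splitlines():
    writer.writerow(line.split())  # naive split: breaks on names with spaces

print(out.getvalue())
```

The naive split() is only safe when no field itself contains whitespace; anything fancier needs a real pattern match.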

Jacob
Old 2010-09-28, 12:00   #18
Mini-Geek
Account Deleted
 
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts
Default

Quote:
Originally Posted by Rhyled View Post
The GHz-Days statistics come from the Top Producers - Totals Overall report. I backed out the GHz-Days for each type of work for each user, summed them up and normalized them.
Did you account for the fact that sometimes a user's long name pushes its data out of alignment with the rest? If so, how? If not, it probably didn't have a significant effect on the data anyway. I ask because I can't figure out a good way to do it. I'm sure some regular expression could do it quite nicely, but I'm not too skilled with those, and don't know exactly how to go from a regex to columns in a spreadsheet.
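One way to cope with names that contain spaces (a sketch; the line format and sample names here are assumed, not taken from the real report) is to let a greedy group absorb the name and anchor the numeric fields after it:

```python
import re

# Assumed line format: rank, name (may contain spaces), GHz-days,
# attempts, successes. The greedy (.*\S) soaks up the whole name because
# the regex engine backtracks until the trailing numeric fields match.
pattern = re.compile(r"^\s*(\d+)\s+(.*\S)\s+(\d+\.\d+)\s+(\d+)\s+(\d+)\s*$")

lines = [
    "  1 Short Name                            123.4  10  2",
    "  2 A Very Long User Name That Overflows 99.9 8 1",
]
for line in lines:
    m = pattern.match(line)
    if m:
        rank, name, ghz, attempts, successes = m.groups()
        print(",".join((rank, name, ghz, attempts, successes)))
```

Because the numbers are matched from the right, a long name that shoves the columns out of alignment still parses correctly.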
Old 2010-09-28, 14:53   #19
Rodrigo
 
 
Jun 2010
Pennsylvania

2×467 Posts
Default

Quote:
Originally Posted by S485122 View Post
Most spreadsheet programs have a feature to convert text to columns. In Excel 2003, for instance, you'll find it in the Data menu. There are also text-handling programs on *nix OSes that would do the trick. Another possibility is writing a program and using text-handling functions.

Jacob
Jacob,

Very neat. And I did learn something, thanks!

Rodrigo
Old 2010-09-28, 15:14   #20
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

2×67×73 Posts
Default

Quote:
Originally Posted by Mini-Geek View Post
I ask because I can't figure a good way to do it. I'm sure some regular expression could do it quite nicely, but I'm not too skilled at those, and don't know exactly how to go from a regex to columns in a spreadsheet.
Here's a code snippet in Perl to transform the data into CSV format, which can be imported into any spreadsheet. "$File" is the name of the status data in plain-text (not HTML) format. (I *love* regular expressions!!!)

Code:
   # Open the plain-text status report.
   open(IN, $File) or die "Cannot open $File: $!";

   while (<IN>) {
      # rank, name (may contain spaces), GHz-days, attempts, successes, | deltas
      if (/^\s*(\d*)\s*(.*)\s+(\d+\.\d*)\s*(\d*)\s*(\d*)\s*\|(.*)/) {
#         print STDERR "Matched on $_";
         $Rank = $1;
         $Name = $2;
         $GHzDays = $3;
         $Attempts = $4;
         $Successes = $5;
         $Deltas = $6;
         print "$1,\"$2\",$3,$4,$5\n";
      }
   }
   close(IN);
Obviously you don't need all the variable assignments (e.g. "$Rank = $1;") if you're just going to use this to convert to CSV (via the "print" to STDOUT). This is pulled from one of my scripts that processes the data internally rather than simply exporting it for processing by another tool.

Also, please note that extracting the 90, 30, 7 and 1 day range changes from $Deltas is a little tricky, and is left as an exercise to those interested....

Last fiddled with by chalsall on 2010-09-28 at 15:24 Reason: I'd quoted $1 rather than $2 (AKA $Name)...
Old 2010-09-29, 00:48   #21
Mini-Geek
Account Deleted
 
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts
Default

Quote:
Originally Posted by chalsall View Post
Here's a code snippit in Perl to transform the data into CSV format which can be imported into any spreadsheet.
Cool snippet -- you're obviously more skilled in the way of regex than I am.
The deltas aren't being parsed for me. I didn't really care to use them, but I wasn't expecting the field to be blank.
Also, note that on the Top Producers page this only gets the rank, name, and GHz-Days, not any of the other info. Not a big deal to me, just noting it. It's such a different format that it might be pretty hard to write for.

I made some minor modifications to it, and thought I'd share the different version.
Code:
   # Derive input/output file names from the first command-line argument.
   open(IN, $ARGV[0].'.txt') or die "Cannot open $ARGV[0].txt: $!";
   open(OUT, '>'.$ARGV[0].'.csv') or die "Cannot create $ARGV[0].csv: $!";

   print OUT "Rank,Name,\"GHz Days\",Attempts,Successes,Deltas\n";
   while (<IN>) {
      if (/^\s*(\d*)\s*(.*)\s+(\d+\.\d*)\s*(\d*)\s*(\d*)\s*\|(.*)/) {
         $Rank = $1;
         $Name = $2;
         $GHzDays = $3;
         $Attempts = $4;
         $Successes = $5;
         $Deltas = $6;
         print OUT "$1,\"$2\",$3,$4,$5\n";
         # Progress indicator: report every 500 matched lines.
         $i++;
         if ($i % 500 == 0) {
           print "on line $i\n";
         }
      }
   }
   close(IN);
   close(OUT);
This is usable from the command line. Assuming you call the file rankings.pl: copy/paste the rankings into a file such as ll.txt, then run "perl rankings.pl ll" (or just "rankings ll" if your system runs .pl files directly). It will print a status update every 500 lines (so you know it's working and roughly where it is) and save the CSV version to ll.csv, which you can then open with any spreadsheet program.
Quote:
Originally Posted by Rhyled View Post
Code:
             TF    P-1   LL    LL-D  (Data retrieved 9-26-10)
GHz-Days     11%   4%    83%   12%
Assignments  57%   1%    31%   12%
11+4+83+12=110%
Any idea why your data shows that? I recrunched the data (from the "hourly report generated Sep 28 2010 11:00PM UTC") using chalsall's parsing script, and I found some ratios for GHz-days... that add to 100%:
P-1: 3.71%
LL: 74.61%
DC: 11.12%
TF: 9.80%
ECM: 0.77%

I also compared the number of tests, as given by the "Attempts" line, and got very different results from you. I think this may be because what I got the numbers from records every step of the way, whereas yours looked only at current assignments (perhaps).
P-1: 0.25%
LL: 0.14%
DC: 0.09%
TF: 98.33%
ECM: 1.19%

Oddly, I found a relatively small, but still significant, discrepancy between the summed GHz-Days for the Overall report vs the sums of the individual reports. The individual reports sum to 7812823.76 but the overall report sums to 7868772.93, a difference of 55949.17 GHz-Days. That's about 0.7%. Anyone have a guess as to the reason?
Old 2010-09-29, 01:47   #22
Rhyled
 
 
May 2010

3²×7 Posts
Default Banging head on wall

Quote:
Originally Posted by Mini-Geek View Post
11+4+83+12=110%
Any idea why your data shows that?
Sigh -- because my denominator included only 3 of the 4 categories. Stupid mistake. The corrected version, based on my dataset, is:
Code:
             TF    P-1   LL    LL-D
GHz-Days     10%   4%    75%   11%
Assignments  57%   1%    31%   12%  (Current distribution)
As for text handling, I use some shortcuts, especially for one-shot efforts like this. Simply choosing "Paste Special - Text" in Excel got rid of all the annoying arrows and kept it to one line per member. I noticed the right-hand side of the report was then fixed-format, so all I had to do was find the second "|" in each string and extract substrings based on that position.
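The second-pipe trick can be sketched like this (the sample line and its layout are invented; the real report's offsets will differ):

```python
# Find the pipes and slice the fixed-format right-hand side from them.
line = "   5  A Long User Name        1234.5 | 10 5 | +1 +0 +2 +3"

first = line.index("|")
second = line.index("|", first + 1)

attempts_successes = line[first + 1:second].split()   # between the pipes
deltas = line[second + 1:].split()                    # after the 2nd pipe
print(attempts_successes, deltas)
```

Because the fields to the right of the pipes sit at fixed offsets, no regex is needed once the pipe positions are known.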

I decided not to include ECM figures because I don't associate those with Mersenne primes and it was somewhat difficult to back them out of the Work Distribution List report.

I'm glad you ran the individual reports - I wasn't so motivated. I knew the total attempts on trial factoring would be high, but I didn't expect quite 98%. At this rate, we should have all the trial factoring done for the 100M prime search a decade before we get serious numbers of LL candidates run.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.