#144
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts
Quote:
#145
Jan 2006
deep in a while-loop
1222₈ Posts
Sent email to Max and Gary.

Due to the large number of records (4 million) in the staging table, which is un-indexed to allow for duplicates, the de-duplication procedure paged to file and bogged down. The design of the de-duplication process is not at fault. Because of the large volume of data, the load process for the manual data will have to be done off-line. I have resumed the normal scheduled processes for now and will begin the manual load process in an off-line database in 2 days' time.

AMDave
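A minimal sketch of the staging/de-duplication idea described above, using an in-memory SQLite database (the table and column names are hypothetical, not the actual NPLB schema): raw results land in an un-indexed staging table where duplicates are allowed, and exactly one copy of each (k, n) pair is then moved into the indexed results table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- staging: no index, duplicates allowed, so loads are fast
    CREATE TABLE staging (k INTEGER, n INTEGER, residue TEXT);
    -- results: indexed, one row per (k, n) pair
    CREATE TABLE results (k INTEGER, n INTEGER, residue TEXT,
                          PRIMARY KEY (k, n));
""")

# load raw results, including a duplicate
con.executemany("INSERT INTO staging VALUES (?, ?, ?)",
                [(5, 100, "ab"), (5, 100, "ab"), (7, 200, "cd")])

# de-duplicate: collapse staging to one row per pair, skip pairs already present
con.execute("""
    INSERT OR IGNORE INTO results
    SELECT k, n, MIN(residue) FROM staging GROUP BY k, n
""")

print(con.execute("SELECT COUNT(*) FROM results").fetchone()[0])  # -> 2
```

The GROUP BY is what pays the cost AMDave describes: with millions of un-indexed rows, that grouping step is exactly where a database starts paging to disk.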
#146
Mar 2006
Germany
2908₁₀ Posts
Quote:
As far as I can tell, there are only about 900,000 pairs done manually for Drive #1 (about 270,000 pairs for Drive #2), so not many duplicates to process. OTOH, perhaps not all automated pairs are in the database yet!? It would be nice to get the ranges from the database to compare with my Summary page!
#147
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts
Quote:
My idea is that sometime down the road we can set it up so that the DB does its own comparison of the results it has against the original sieve file. It could tell us exactly which pairs are done and which aren't, produce formatted and sorted results files automatically (thereby automating the processing of results for Gary), and maybe even make some nice graphs showing just how far we are on each drive relative to the sieve file.

Right now, what I'm doing is putting the entire LLRnet ranges in the manual dumps under the username "Unknown". Any results that were properly imported from the server the first time around (almost everything) will be rejected as duplicates, while any that were missed will be imported and credited to "Unknown". The disadvantage is that this creates loads of duplicates, which seem to be presenting a problem for the DB.

I'm currently awaiting a response from Dave as to whether the problem is due to the number of raw results being imported or just the number of duplicates. If the latter, we can solve the problem by not including LLRnet ranges under "Unknown" in the manual dumps, but instead checking the DB later on and manually filling in any missing pairs from the master results files. Since there aren't many of them, that shouldn't be hard.
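The comparison described above boils down to a set difference between the (k, n) pairs in the sieve file and the pairs recorded in the database. A toy sketch, with made-up pair values purely for illustration:

```python
# pairs parsed from the original sieve file (hypothetical values)
sieve_pairs = {(5, 100), (5, 102), (7, 200)}

# pairs the database already has results for (hypothetical values)
db_pairs = {(5, 100), (7, 200)}

# candidates still untested: in the sieve file but not in the DB
missing = sorted(sieve_pairs - db_pairs)
print(missing)  # -> [(5, 102)]
```

Because the sieve file is the authoritative list of candidates, this one-way difference is all that's needed to report exactly which pairs remain, and the same sets could feed per-drive progress graphs.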
#148
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
7×29² Posts
Quote:
#149
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001101001₂ Posts
I suppose it wouldn't be too hard to duplicate NPLB's database for CRUS. The only potentially tricky part would be the scoring formula: does anyone know what we're currently using for that at NPLB? More specifically, is it limited to just base 2, or does it factor in the effects of the base when determining score?
#150
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
10AB₁₆ Posts
Quote:
To see how PRPnet calculates the decimal length, look in LengthCalculator.cpp. But I'm sure you already know that the length of k*b^n+c is (except in a case where the +c changes the number of digits, I suppose) floor(log(k)+log(b)*n)+1, where log() is in the base you're converting to (in this case, probably 2 or 10). And log_x(y)=log(y)/log(x) (where log_x is the base-x logarithm, and log() is a logarithm in any base).

Last fiddled with by Mini-Geek on 2010-04-02 at 16:28
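The formula above can be checked in a few lines. This is an illustrative sketch, not PRPnet's actual LengthCalculator.cpp code:

```python
import math

def decimal_length(k, b, n):
    # decimal digits of k*b^n+c: floor(log10(k) + n*log10(b)) + 1,
    # ignoring the rare case where the small +c term crosses a power of ten
    return math.floor(math.log10(k) + n * math.log10(b)) + 1

# change-of-base identity quoted above: log_x(y) = log(y) / log(x)
assert math.isclose(math.log10(2), math.log(2) / math.log(10))

print(decimal_length(5, 2, 100))  # digits of 5*2^100+1  -> 31
```

Replacing log10 with log2 in the same formula gives the bit length instead, which is the other length measure discussed in this thread.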
#151
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts
Quote:
Therefore, if we used a similar score for a CRUS DB, I'd suggest that we scale it somewhat to produce values more on the order of what NPLB's formula produces for candidates of similar size.
#152
May 2007
Kansas; USA
2·41·127 Posts
Quote:
Yes, we definitely need to go to either bit length or decimal length of the test if we set up something similar to our DB here at CRUS.

Gary
#153
Jan 2006
deep in a while-loop
2×7×47 Posts
@henryzz

The base number is already included in the load and stored in the tables; however, it is not yet part of the primary key. Adding the base number to the key should achieve the compatibility you are seeking. Indeed, the design of the NPLB tables is easily extensible to either a separate or a combined database for additional data sets such as CRUS. However, the CRUS data pages include a lot more manually added meta-data which is not yet catered for, and that requires some consideration.

Last fiddled with by AMDave on 2010-04-06 at 13:19
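The key change AMDave describes can be illustrated with a composite primary key. The table and column names below are hypothetical, not the real NPLB schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# with base in the primary key, (base, k, n) is the unique identity,
# so the same k/n pair can coexist for different bases (e.g. CRUS data)
con.execute("""CREATE TABLE results (
                   base INTEGER, k INTEGER, n INTEGER, residue TEXT,
                   PRIMARY KEY (base, k, n))""")
con.execute("INSERT INTO results VALUES (2, 5, 100, 'ab')")
con.execute("INSERT INTO results VALUES (3, 5, 100, 'cd')")  # same k, n; different base: OK

print(con.execute("SELECT COUNT(*) FROM results").fetchone()[0])  # -> 2
```

Without base in the key, the second insert would collide with the first; with it, one database (or one schema shared by separate databases) can hold NPLB's base-2 data alongside CRUS's many bases.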
#154
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
13377₈ Posts
Quote: