mersenneforum.org  

Old 2015-04-17, 19:35   #45
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL
7632₁₀ Posts

Quote:
Originally Posted by Madpoo
And then, last question, what data is missing when a factor is found that you would like to see retained? There's a good chance maybe it's still there, but the website just isn't showing it any longer.

I have a feeling that with the server constraints in the past, there was an actual need to keep the DB size from growing too much.

The lost data is in the factoring effort table that stores the TF bit depth and P-1 bounds. The rows in this table are small, so DB size was not an issue.

The problem is that the way the database was originally designed (way back in 1996), an exponent is expected to be in one and only one of the following tables: known-factors, factoring-effort, known-Mersenne-primes. There is a fair amount of PHP and stored procedures that count on this.

If the factoring-effort row is not deleted when a factor is found, then the make-exponent-available-for-assignment stored procedure will break. If an exponent is in the factoring-effort table, then the exponent is made available for TF, P-1, LL, or DC. If the exponent is in the factors table, then the exponent may be made available for ECM assignments.


Can this be "fixed"? Yes, but it is not a riskless proposition.
Old 2015-04-17, 20:59   #46
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario
2×17×103 Posts

Quote:
Originally Posted by Madpoo
I keep meaning to set something up to get XMLs put together of all old info. I thought about doing it on a per-day basis... would that be a hassle to parse, having to go through 365 of them for a 1-year period?
One XML file per day would be just perfect. No single file would be excessively large, and data can be accessed nearly right away (e.g. yesterday's data, rather than waiting until May to get data for April, or rewriting the April file daily). I think it would actually be less hassle for me to parse daily files than monthly files.

Quote:
Originally Posted by Madpoo
Apart from that, is there another data dump I can send your way to help you backfill anything so you can avoid having to crawl it from the website?
All P-1 runs ever done (bounds used, factor found if any, date, user). All TF runs would be great too.

Quote:
Originally Posted by Prime95
The lost data is in the factoring effort table that stores the TF bit depth and P-1 bounds. The rows in this table are small, so DB size was not an issue.
The problem is that the way the database was originally designed (way back in 1996), an exponent is expected to be in one and only one of the following tables: known-factors, factoring-effort, known-Mersenne-primes.
That's precisely the missing data that's causing me grief.
As suggested in some thread several months ago, just moving the data into a separate archive table before deleting the row should be a relatively consequence-free modification.
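To make the idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the PrimeNet schema: the table names, column names and the `record_factor` helper are all assumptions for illustration, and the TF depth stored for the example exponent is an illustrative value. The point is only that the archive copy, the delete and the factor insert happen in one transaction, so the "one and only one table" rule still holds for the live tables.

```python
import sqlite3

def record_factor(conn, exponent, factor):
    """Archive the factoring-effort row, delete it, then store the factor.

    Hypothetical helper: table and column names are assumptions,
    not the real PrimeNet schema.
    """
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute(
            "INSERT INTO factoring_effort_archive (exponent, tf_bits, pm1_b1, pm1_b2) "
            "SELECT exponent, tf_bits, pm1_b1, pm1_b2 "
            "FROM factoring_effort WHERE exponent = ?",
            (exponent,),
        )
        conn.execute("DELETE FROM factoring_effort WHERE exponent = ?", (exponent,))
        # Factors can exceed 64 bits, so store them as text.
        conn.execute(
            "INSERT INTO known_factors (exponent, factor) VALUES (?, ?)",
            (exponent, str(factor)),
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE factoring_effort (
        exponent INTEGER PRIMARY KEY, tf_bits INTEGER, pm1_b1 INTEGER, pm1_b2 INTEGER);
    CREATE TABLE factoring_effort_archive (
        exponent INTEGER PRIMARY KEY, tf_bits INTEGER, pm1_b1 INTEGER, pm1_b2 INTEGER);
    CREATE TABLE known_factors (exponent INTEGER, factor TEXT);
""")
# Illustrative effort row: TF done to 73 bits, no P-1 yet.
conn.execute("INSERT INTO factoring_effort VALUES (?, ?, ?, ?)", (69599389, 73, None, None))
conn.commit()

record_factor(conn, 69599389, 12698076768763159146287)
```

The assignment logic would keep reading only the live tables, while reports that need the historical TF depth and P-1 bounds would read the archive table.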

How far back does log data go, and how much could be re-parsed to reconstruct all the data that's been tossed over the years, for TF levels and P-1 runs?
Old 2015-04-17, 21:46   #47
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL
2⁴×3²×53 Posts

Quote:
Originally Posted by James Heinrich
How far back does log data go, and how much could be re-parsed to reconstruct all the data that's been tossed over the years, for TF levels and P-1 runs?

I have log files going back to at least 2000. If you combined these logs with the results log table on the server you should be able to reconstruct much of the data. If you are interested, I can zip these old logfiles up and email them to you. Alternatively, there may be a way to upload them to the server and add them to the results log table -- probably the best option.
Old 2015-04-17, 22:15   #48
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario
DAE₁₆ Posts

Quote:
Originally Posted by Prime95
If you are interested, I can zip these old logfiles up and email them to you.
I would expect the log files to be too big to email, but I'm sure you can get them to me somehow. Merging them into the existing server database would be nice, but I'd still like an offline copy to fiddle around with. Please send them to me.
Old 2015-04-26, 08:18   #49
tha
 
Dec 2002
827 Posts

There are 21,000+ exponents listed now. Can you say anything about how this list was compiled and how new finds are added to it?

(and it would be handy if we could have a 'from ... to ...' selection mechanism.)
Old 2015-04-26, 13:30   #50
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario
2×17×103 Posts

Quote:
Originally Posted by tha
There are 21,000+ exponents listed now. Can you say anything about how this list was compiled and how new finds are added to it?
It looks at known factors and known P-1 runs, and selects factors where the B1/B2 bounds required to find that factor via P-1 are smaller than the bounds of a P-1 run that didn't find any factor.
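For anyone wanting to reproduce that selection, here is a minimal Python sketch of the check. It is not the actual mersenne.ca code. It assumes plain P-1 with no Brent-Suyama extension, that 2p is always included in the stage 1 exponent, that stage 1 covers every prime power up to B1, and that stage 2 covers one extra prime up to B2. The function names are hypothetical, and the trial division is only practical for small k.

```python
def prime_power_factors(n):
    """Return {prime: exponent} for n, by simple trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def min_pm1_bounds(p, q):
    """Smallest (B1, B2) with which plain P-1 finds factor q of M_p (see assumptions above)."""
    if (q - 1) % (2 * p) != 0:
        raise ValueError("a factor of M_p must have the form 2*k*p + 1")
    k = (q - 1) // (2 * p)
    # Prime powers dividing k, smallest first, each paired with its exponent.
    powers = sorted((prime ** e, e) for prime, e in prime_power_factors(k).items())
    if not powers:
        return 1, 1  # k == 1: found with any bounds
    largest, exponent = powers[-1]
    if exponent == 1:
        # The largest prime can be left to stage 2; stage 1 must cover the rest.
        b1 = powers[-2][0] if len(powers) > 1 else 1
        return b1, largest
    # The largest prime power is a true power, so stage 1 must reach it.
    return largest, largest

def should_have_found(p, q, run_b1, run_b2):
    """True if a P-1 run with these bounds should have found q."""
    b1, b2 = min_pm1_bounds(p, q)
    return b1 <= run_b1 and b2 <= run_b2

print(min_pm1_bounds(29, 1103))   # k = 19      -> (1, 19)
print(min_pm1_bounds(29, 2089))   # k = 36      -> (9, 9)
```

A factor lands on the list when `should_have_found` is true for a run that was reported as "no factor".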

The variability in the number of exponents on this list is due to the poor quality of historical data, as mentioned several posts back.
Quote:
Originally Posted by Prime95
I have log files going back to at least 2000.
I have parsed George's logs and sent them back; hopefully they'll be integrated into PrimeNet data over the next week or few so that real historical data will be available for factored exponents.

Quote:
Originally Posted by tha
(and it would be handy if we could have a 'from ... to ...' selection mechanism.)
I agree, and had thought about it; I'll see if I can wrangle that into existence sooner rather than later.


I happened across a "missing factor" exponent that I had P-1 tested myself. I looked up the result file from 4 years ago and indeed it had failed to find the factor for M58,020,869:
Code:
[Sat Feb 05 23:11:33 2011]
UID: JamesHeinrich/Q6600, M58020869 completed P-1, B1=685000, B2=18152500, We4: DC08ABF9, AID: 00E18F6BDCDCCAD486C7A875A153FB3B
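For reference, a line like the one above can be picked apart with a small regular expression. This is only a sketch in Python that handles this one line shape (a completed two-stage P-1 with no factor); other variants of the results line are not covered, and the function name is hypothetical.

```python
import re

# Matches only the "completed P-1, B1=..., B2=..." shape shown above.
RESULT_RE = re.compile(
    r"UID: (?P<uid>[^,]+), M(?P<exponent>\d+) completed P-1, "
    r"B1=(?P<b1>\d+), B2=(?P<b2>\d+)"
)

def parse_pm1_no_factor(line):
    """Return user, exponent and bounds from a no-factor P-1 line, or None."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    return {
        "uid": m["uid"],
        "exponent": int(m["exponent"]),
        "b1": int(m["b1"]),
        "b2": int(m["b2"]),
    }

line = ("UID: JamesHeinrich/Q6600, M58020869 completed P-1, B1=685000, "
        "B2=18152500, We4: DC08ABF9, AID: 00E18F6BDCDCCAD486C7A875A153FB3B")
run = parse_pm1_no_factor(line)
print(run)
```

The extracted B1 and B2 are what would be compared against the bounds a known factor needs.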
Old 2015-04-26, 20:10   #51
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario
2×17×103 Posts

Quote:
Originally Posted by tha
it would be handy if we could have a 'from ... to ...' selection mechanism.
Now implemented. The presented data is now also grouped by exponent so that it's clear when multiple factors belong to a single exponent. The count of exponents and factors in the "missed" data is now also shown separately.
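Roughly, the range filter and grouping amount to something like the following Python sketch. It is not the site's code: the function name is hypothetical and the rows are toy data (the known factors of M11 and M29) standing in for the "missed" list.

```python
from collections import defaultdict

def group_by_exponent(rows, lo=None, hi=None):
    """Group (exponent, factor) rows by exponent, keeping lo <= exponent <= hi."""
    grouped = defaultdict(list)
    for exponent, factor in rows:
        if (lo is None or exponent >= lo) and (hi is None or exponent <= hi):
            grouped[exponent].append(factor)
    return dict(grouped)

# Toy rows: factors of M29 and M11.
rows = [(29, 233), (29, 1103), (29, 2089), (11, 23), (11, 89)]

grouped = group_by_exponent(rows, lo=10, hi=30)
exponent_count = len(grouped)                                  # distinct exponents
factor_count = sum(len(found) for found in grouped.values())   # all factors
print(exponent_count, factor_count)
```

Counting exponents and factors separately matters because one exponent can carry several factors.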

Quote:
Originally Posted by tha
There are 21,000+ exponents listed now.
Part of the increase can also be blamed on me finding new data on a number of P-1 runs that have been done on small exponents, and I suspect in many of the cases the tiny factors were already known but P-1 was run anyways to look for other factors. I think this would be one random example of that. Unfortunately I don't have any way of differentiating a P-1 run that explicitly ignored known small factors vs one that simply failed to find a factor (like mine above).
Old 2015-04-27, 11:00   #52
VictordeHolland
 
"Victor de Hollander"
Aug 2011
the Netherlands
2³·3·7² Posts

Looking at the Brent-Suyama list:
http://www.mersenne.ca/brent-suyama.php
it has a lot of exponents listed with factors found by normal P-1 or ECM, and even a few found by TF. I don't know if this has always been the case or if it is a side effect of importing the missing P-1 results.

For instance found by TF:
M69,599,389
M69,277,711
M69,255,149
M69,243,721
M69,160,681

Found by ECM/normal P-1:
M1,595,057
M1,595,149
M1,595,983
M1,596,667
M1,597,763
M1,400,261
M1,152,517
M1,150,927
M870,047
M1,597

I only really trust the ~250 of the 1000 that have the Brent-Suyama exponent listed; the rest is probably just noise.
Old 2015-04-27, 12:05   #53
LaurV
Romulan Interpreter
 
Jun 2011
Thailand
9776₁₀ Posts

Indeed, some of those are TF factors. In the old days, reporting a "factor" line without a "no factor" line in front of it caused GPU-TF factors to be recorded as P-1 factors (PrimeNet didn't know about TF going "so high" as 74 bits). That issue is fixed now, but the 74-bit TF factors remain stored as P-1 factors.

Last fiddled with by LaurV on 2015-04-27 at 12:06 Reason: s/64/74/
Old 2015-04-27, 14:55   #54
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario
2·17·103 Posts

As LaurV said, the old manual results form assumed any large factor was found by P-1 (and any really large factor on small exponents was found by ECM) while ignoring any clues as to how it was actually found. Using your first example of M69599389, the actual submitted result line was
Code:
M69599389 has a factor: 12698076768763159146287 [TF:73:74:mfaktc 0.20 barrett76_mul32_gs]
but it's stored as F-PM1. One day hopefully the PrimeNet database can get cleaned up with all those results being reattributed to the correct factor method.
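As a rough illustration of how the method clue could be recovered from the raw line, here is a Python sketch. It only assumes the bracketed layout visible in the example above (method, then colon-separated details); the function name is hypothetical and other client formats are not handled.

```python
import re

# "M<exponent> has a factor: <factor>" with an optional "[METHOD:details]" tag.
FACTOR_RE = re.compile(
    r"M(?P<exponent>\d+) has a factor: (?P<factor>\d+)"
    r"(?: \[(?P<method>[A-Za-z0-9-]+):(?P<detail>[^\]]*)\])?"
)

def classify_factor_line(line):
    """Return exponent, factor and the method named in the tag (None if untagged)."""
    m = FACTOR_RE.match(line)
    if m is None:
        return None
    info = {
        "exponent": int(m["exponent"]),
        "factor": int(m["factor"]),
        "method": m["method"],
    }
    if m["method"] == "TF":
        parts = m["detail"].split(":")
        if len(parts) >= 2 and parts[0].isdigit() and parts[1].isdigit():
            info["bit_range"] = (int(parts[0]), int(parts[1]))
    return info

line = ("M69599389 has a factor: 12698076768763159146287 "
        "[TF:73:74:mfaktc 0.20 barrett76_mul32_gs]")
info = classify_factor_line(line)
lo, hi = info["bit_range"]
# Sanity check: the factor should lie inside the bit range the tag claims.
print(info["method"], 2 ** lo <= info["factor"] < 2 ** hi)
```

A result stored as F-PM1 whose raw line classifies as TF would be a candidate for reattribution.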

Last fiddled with by James Heinrich on 2015-04-27 at 14:55
Old 2015-04-27, 16:11   #55
Madpoo
Serpentine Vermin Jar
 
Jul 2014
3,313 Posts

Quote:
Originally Posted by James Heinrich
...One day hopefully the PrimeNet database can get cleaned up with all those results being reattributed to the correct factor method.
That sounds like a challenge.

As long as the raw message contains something clearly indicating the method (like the TF in your example) that should be possible.

It may be slow to query since doing partial matching in SQL is generally slow, but if I can whittle down the list of results to check it may not be too bad.

Something along the lines of:
where result_type=pm1 factor and message like '%TF%'

If these results only came from manual clients and they all start with "Mxxx has a factor" then the LIKE clause could be LIKE 'M%TF%' which is actually a lot faster.
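Here is a small runnable sketch of that query, using Python with an in-memory SQLite table. The table and column names are assumptions, not the real PrimeNet schema, and the rows are toy data: one is the mfaktc line quoted earlier in the thread, the other is an untagged line for M11. The result type value F-PM1 is the one mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per submitted result.
conn.execute("CREATE TABLE results (result_type TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [
        ("F-PM1", "M69599389 has a factor: 12698076768763159146287 "
                  "[TF:73:74:mfaktc 0.20 barrett76_mul32_gs]"),
        ("F-PM1", "M11 has a factor: 23"),  # untagged toy row
    ],
)

def find_suspect_pm1(conn, pattern):
    """Messages stored as P-1 factors whose raw text matches the LIKE pattern."""
    cursor = conn.execute(
        "SELECT message FROM results "
        "WHERE result_type = ? AND message LIKE ? ORDER BY message",
        ("F-PM1", pattern),
    )
    return [row[0] for row in cursor]

broad = find_suspect_pm1(conn, "%TF%")      # wildcard at both ends
anchored = find_suspect_pm1(conn, "M%TF%")  # fixed first character
print(broad == anchored, len(anchored))
```

Whether the anchored pattern is really faster depends on the database engine and on an index covering the message column. Note also that SQLite's LIKE is case-insensitive for ASCII by default, which may not match the real server's behaviour.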

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.