mersenneforum.org > Data Factor found that should have been found by P-1

2015-04-17, 19:35   #45
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2·7·541 Posts

Quote:
 Originally Posted by Madpoo And then, last question, what data is missing when a factor is found that you would like to see retained? There's a good chance maybe it's still there, but the website just isn't showing it any longer. I have a feeling that with the server constraints in the past, there was an actual need to keep the DB size from growing too much.

The lost data is in the factoring effort table that stores the TF bit depth and P-1 bounds. The rows in this table are small, so DB size was not an issue.

The problem is that, the way the database was originally designed (way back in 1996), an exponent is expected to be in one and only one of the following tables: known-factors, factoring-effort, known-Mersenne-primes. There is a fair amount of PHP and stored procedures that count on this.

If the factoring-effort row is not deleted when a factor is found, then the make-exponent-available-for-assignment stored procedure will break. If an exponent is in the factoring-effort table then the exponent is made available for TF, P-1, LL, or DC. If the exponent is in the factors table then the exponent may be made available for ECM assignments.

Can this be "fixed"? Yes, but it is not a risk-free proposition.

2015-04-17, 20:59   #46
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

59² Posts

Quote:
 Originally Posted by Madpoo I keep meaning to set something up to get XMLs put together of all old info. I thought about doing it on a per-day basis... would that be a hassle to parse, having to go through 365 of them for a 1-year period?
One XML file per day would be just perfect. No one file should be excessively large, data can be accessed nearly right away (e.g. yesterday's data, rather than waiting until May to get data for April, or rewriting the April file daily). It would actually be less hassle for me to parse daily files than monthly files I think.

Quote:
 Originally Posted by Madpoo Apart from that, is there another data dump I can send your way to help you backfill anything so you can avoid having to crawl it from the website?
All P-1 runs ever done (bounds used, factor found if any, date, user). All TF runs would be great too.

Quote:
 Originally Posted by Prime95 The lost data is in the factoring effort table that stores the TF bit depth and P-1 bounds. The rows in this table are small, so DB size was not an issue. The problem is that, the way the database was originally designed (way back in 1996), an exponent is expected to be in one and only one of the following tables: known-factors, factoring-effort, known-Mersenne-primes.
That's precisely the missing data that's causing me grief.
As suggested in some thread several months ago, just moving the data into another archive table before deleting the row should be a relatively consequence-free modification.
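
A minimal sketch of that archive-then-delete idea, using Python's built-in sqlite3. The table and column names (factoring_effort, factoring_effort_archive, known_factors, tf_bits, b1, b2) are placeholders invented for illustration; the real PrimeNet schema and stored procedures are not shown in this thread and will certainly differ.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with hypothetical, simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE factoring_effort (
            exponent INTEGER PRIMARY KEY, tf_bits INTEGER, b1 INTEGER, b2 INTEGER);
        CREATE TABLE factoring_effort_archive (
            exponent INTEGER, tf_bits INTEGER, b1 INTEGER, b2 INTEGER);
        CREATE TABLE known_factors (exponent INTEGER, factor TEXT);
    """)
    return conn


def record_factor(conn, exponent, factor):
    """Archive the factoring-effort row, then move the exponent to known_factors."""
    with conn:  # one transaction: commit on success, roll back on any error
        # Copy the effort data somewhere safe before it is deleted.
        conn.execute(
            "INSERT INTO factoring_effort_archive (exponent, tf_bits, b1, b2) "
            "SELECT exponent, tf_bits, b1, b2 FROM factoring_effort "
            "WHERE exponent = ?",
            (exponent,),
        )
        # Still delete the live row, so the exponent stays in exactly one
        # of the "live" tables, as the existing code expects.
        conn.execute("DELETE FROM factoring_effort WHERE exponent = ?", (exponent,))
        # Factors can exceed 64 bits, so this sketch stores them as text.
        conn.execute(
            "INSERT INTO known_factors (exponent, factor) VALUES (?, ?)",
            (exponent, str(factor)),
        )
```

Because the live factoring-effort row is still deleted, anything that relies on the one-table-per-exponent rule should keep working in this sketch; only code that wants the historical TF depth or P-1 bounds would need to look at the archive table.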

How far back does log data go, and how much could be re-parsed to reconstruct all the data that's been tossed over the years, for TF levels and P-1 runs?

2015-04-17, 21:46   #47
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2·7·541 Posts

Quote:
 Originally Posted by James Heinrich How far back does log data go, and how much could be re-parsed to reconstruct all the data that's been tossed over the years, for TF levels and P-1 runs?

I have log files going back to at least 2000. If you combined these logs with the results log table on the server you should be able to reconstruct much of the data. If you are interested, I can zip these old logfiles up and email them to you. Alternatively, there may be a way to upload them to the server and add them to the results log table -- probably the best option.

2015-04-17, 22:15   #48
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

59² Posts

Quote:
 Originally Posted by Prime95 If you are interested, I can zip these old logfiles up and email them to you.
I would expect the log files would be bigger than feasible to email, but I'm sure you can get them to me somehow. Merging them into the existing server database would be nice, but I'd still like an offline copy to fiddle around with. Please send them to me.

2015-04-26, 08:18   #49
tha

Dec 2002

2×409 Posts

There are 21,000+ exponents listed now. Can you say anything about how this list was compiled and how new finds are added to it? (and it would be handy if we could have a 'from ... to ...' selection mechanism.)

2015-04-26, 13:30   #50
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

110110011001₂ Posts

Quote:
 Originally Posted by tha There are 21,000+ exponents listed now. Can you say anything about how this list was compiled and how new finds are added to it?
It looks at known factors and known P-1 runs, and selects factors where the B1/B2 required to find that factor via P-1 is smaller than the bounds of a P-1 run on that exponent that didn't find any factor.
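
A rough sketch of that comparison, not the actual mersenne.ca code. It assumes the usual P-1 condition for a factor q = 2kp + 1 of M_p: every prime power dividing k must be at most B1, except that one single largest prime may lie between B1 and B2. It uses plain trial division and ignores the Brent-Suyama extension; all function names here are made up for illustration.

```python
def factorize(n):
    """Trial-division factorization of n; returns {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1 if d == 2 else 2
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def min_pm1_bounds(p, q):
    """Smallest (B1, B2) for which P-1 on M_p would find the factor q."""
    if (q - 1) % (2 * p) != 0:
        raise ValueError("q is not of the form 2*k*p + 1")
    k = (q - 1) // (2 * p)
    # Prime powers dividing k, largest first, each paired with its exponent.
    powers = sorted(((r ** e, e) for r, e in factorize(k).items()), reverse=True)
    if not powers:
        return (1, 1)  # k == 1: any bounds find the factor
    top, top_exp = powers[0]
    if top_exp == 1:
        # A single largest prime can be left to stage 2.
        b1 = powers[1][0] if len(powers) > 1 else 1
        return (b1, top)
    return (top, top)  # a repeated prime has to be covered by stage 1


def should_have_been_found(p, q, run_b1, run_b2):
    """True if a no-factor P-1 run with these bounds ought to have found q."""
    need_b1, need_b2 = min_pm1_bounds(p, q)
    return need_b1 <= run_b1 and need_b2 <= run_b2
```

As a small check, M29 = 233 × 1103 × 2089: for q = 1103, k = 19, so min_pm1_bounds(29, 1103) gives (1, 19); for q = 2089, k = 36 = 2² × 3², so it gives (9, 9).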

The variability in the number of exponents on this list is due to the poor quality of historical data, as mentioned several posts back.
Quote:
 Originally Posted by Prime95 I have log files going back to at least 2000.
I have parsed George's logs and sent them back; hopefully they'll be integrated into PrimeNet data over the next week or few so that real historical data will be available for factored exponents.

Quote:
 Originally Posted by tha (and it would be handy if we could have a 'from ... to ...' selection mechanism.)
I agree, and had thought about it. I'll see if I can wrangle that into existence sooner rather than later.

I happened across a "missing factor" exponent that I had P-1 tested myself. I looked up the result file from 4 years ago and indeed it had failed to find the factor for M58,020,869:
Code:
[Sat Feb 05 23:11:33 2011]
UID: JamesHeinrich/Q6600, M58020869 completed P-1, B1=685000, B2=18152500, We4: DC08ABF9, AID: 00E18F6BDCDCCAD486C7A875A153FB3B
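
For anyone re-parsing old result files like the one above, here is a rough sketch of pulling the exponent and bounds out of a "completed P-1" line. The regex is modelled only on the single line shown here; other client versions may format the line differently, and treating a missing B2 as B2 = B1 is an assumption. The function name is made up.

```python
import re

# Matches e.g. "M58020869 completed P-1, B1=685000, B2=18152500"
PM1_RESULT = re.compile(
    r"M(?P<exponent>\d+) completed P-1, B1=(?P<b1>\d+)(?:, B2=(?P<b2>\d+))?"
)


def parse_pm1_result(line):
    """Return (exponent, B1, B2) from a 'completed P-1' line, or None."""
    m = PM1_RESULT.search(line)
    if m is None:
        return None
    b1 = int(m.group("b1"))
    # Assumption: a line without B2 means stage 1 only, so B2 = B1.
    b2 = int(m.group("b2")) if m.group("b2") else b1
    return int(m.group("exponent")), b1, b2
```

Fed the result line above, this returns (58020869, 685000, 18152500).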

2015-04-26, 20:10   #51
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

D99₁₆ Posts

Quote:
 Originally Posted by tha it would be handy if we could have a 'from ... to ...' selection mechanism.
Now implemented. The presented data is now also grouped by exponent so that it's clear when multiple factors belong to a single exponent. The count of exponents and factors in the "missed" data is now also shown separately.

Quote:
 Originally Posted by tha There are 21,000+ exponents listed now.
Part of the increase can also be blamed on me finding new data on a number of P-1 runs that have been done on small exponents, and I suspect in many of the cases the tiny factors were already known but P-1 was run anyways to look for other factors. I think this would be one random example of that. Unfortunately I don't have any way of differentiating a P-1 run that explicitly ignored known small factors vs one that simply failed to find a factor (like mine above).

2015-04-27, 11:00   #52
VictordeHolland

"Victor de Hollander"
Aug 2011
the Netherlands

2³·3·7² Posts

Looking at the Brent-Suyama list:
http://www.mersenne.ca/brent-suyama.php
it has a lot of exponents listed with factors found with normal P-1, ECM and even a few found by TF. I don't know if this has always been the case, or if it was a side-effect of importing the missing P-1 results?

For instance found by TF:
M69,599,389
M69,277,711
M69,255,149
M69,243,721
M69,160,681

Found by ECM/normal P-1:
M1,595,057
M1,595,149
M1,595,983
M1,596,667
M1,597,763
M1,400,261
M1,152,517
M1,150,927
M870,047
M1,597

I only really trust the ~250 of the 1000 that have the Brent-Suyama exponent listed; the rest is probably just noise.

2015-04-27, 12:05   #53
LaurV
Romulan Interpreter

Jun 2011
Thailand

3×3,251 Posts

Indeed, some of those are TF factors. In the old days, reporting a "factor" line without a "no factor" line in front of it caused GPU-TF factors to be recorded as P-1 factors (PrimeNet didn't know about TF-ing "so high" as 74 bits). That issue is fixed now, but the 74-bit TF factors remain stored as P-1 factors.

Last fiddled with by LaurV on 2015-04-27 at 12:06 Reason: s/64/74/

2015-04-27, 14:55   #54
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

59² Posts

As LaurV said, the old manual results form assumed any large factor was found by P-1 (and any really large factor on small exponents was found by ECM) while ignoring any clues as to how it was actually found. Using your first example of M69599389, the actual submitted result line was
Code:
M69599389 has a factor: 12698076768763159146287 [TF:73:74:mfaktc 0.20 barrett76_mul32_gs]
but it's stored as F-PM1. One day hopefully the PrimeNet database can get cleaned up with all those results being reattributed to the correct factor method.

Last fiddled with by James Heinrich on 2015-04-27 at 14:55

2015-04-27, 16:11   #55
Serpentine Vermin Jar

Jul 2014

3,313 Posts

Quote:
 Originally Posted by James Heinrich ...One day hopefully the PrimeNet database can get cleaned up with all those results being reattributed to the correct factor method.
That sounds like a challenge.

As long as the raw message contains something clearly indicating the method (like the TF in your example) that should be possible.

It may be slow to query since doing partial matching in SQL is generally slow, but if I can whittle down the list of results to check it may not be too bad.

Something along the lines of:
where result_type=pm1 factor and message like '%TF%'

If these results only came from manual clients and they all start with "Mxxx has a factor" then the LIKE clause could be LIKE 'M%TF%' which is actually a lot faster.
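
A small, self-contained sketch of that kind of query, run against an in-memory SQLite database rather than the real PrimeNet server. The table name, column names and the 'F-PM1' value are assumptions for illustration (only the first message is the real result line quoted earlier; the second row is made up). Note that SQLite's LIKE is case-insensitive for ASCII by default, so the pattern here looks for "TF:" after the "has a factor:" prefix to reduce false matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the results log table.
conn.execute(
    "CREATE TABLE results (exponent INTEGER, result_type TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        # Real submitted line from this thread, stored as a P-1 factor.
        (69599389, "F-PM1",
         "M69599389 has a factor: 12698076768763159146287 "
         "[TF:73:74:mfaktc 0.20 barrett76_mul32_gs]"),
        # Made-up row with no method tag, to show it is not selected.
        (29, "F-PM1", "M29 has a factor: 1103"),
    ],
)


def misattributed_tf(conn):
    """Exponents stored as P-1 factors whose raw message carries a TF tag."""
    rows = conn.execute(
        "SELECT exponent FROM results "
        "WHERE result_type = 'F-PM1' "
        "AND message LIKE 'M% has a factor:%TF:%' "
        "ORDER BY exponent"
    )
    return [row[0] for row in rows]


print(misattributed_tf(conn))
```

With the two rows above, only 69599389 is returned. On a real server the same idea would first narrow the rows by result type (and perhaps date or client) so the pattern match only touches a small candidate list.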
