mersenneforum.org  

Old 2018-04-02, 02:11   #1
shortcipher
 
Mar 2018

17 Posts
Accessing FactorDB from Python

I am accessing FactorDB from Python through a library module (https://pypi.python.org/pypi/factordb-pycli). For some numbers (about 120 digits), the Python call just gets a 'C' response, even after many attempts. If I enter the same number in my browser at http://factordb.com/ and click Factorize!, it factors the number immediately. The Python call then gets an 'FF' response and obtains the factorization.

Is there some secret sauce in the browser interface which is denied to the Python interface?
Old 2018-04-02, 02:19   #2
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·2,399 Posts

FactorDB doesn't autofactor anything above 70 digits. Which means if FF is appearing in your browser, then that factorization was already in the DB.

I'm not sure a priori what could be causing this behavior. The aliquot sequence status page does a ton of FDB queries (obviously), and although we get errors from time to time, retrying a handful of times eventually rectifies the error.

Is this problem reproducible? If so, what numbers specifically?
Old 2018-04-02, 02:49   #3
shortcipher
 
Mar 2018

17 Posts

Right now, 36868743542001151893415408511062812705584118566362468784537941808688929129381217798445710341479658115867192826982725123 is getting a 'C' via Python.

Is it possible to check whether this number is in the database, without any possibility of autofactoring?
Old 2018-04-02, 03:29   #4
shortcipher
 
Mar 2018

17 Posts

Now that number is 'FF' from Python.

This one is still 'C':-

52366186180679306335144096639289985739377662519577838330283010732839432648885536870843823075056707630585377852887738877
Old 2018-04-02, 04:05   #5
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·2,399 Posts

Can you paste output logs of what you see in your Python calls? I don't understand what's happening. Both of those numbers are FF, and when I query the FDBID in Python I get the correct status FF.
Old 2018-04-02, 04:28   #6
shortcipher
 
Mar 2018

17 Posts

Here is what I am getting from a new number (the previous one is now 'FF').

The printout is from
Code:
f = FactorDB(n)
print(f.connect().json())
in my Python code.

Code:
{u'status': u'C', u'id': u'1100000001115460681', u'factors': [[u'72532046707394527184361122364055313434377630022988043070962959961740902025113107530697959132646327596976526693065378307', 1]]}
How are you querying the FDBID in Python?
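
For readers following along, here is a minimal sketch of how a reply shaped like the one above could be unpacked. It assumes only the three keys visible in the printout (`status`, `id`, `factors`); the function name `summarize_response` is made up for illustration and is not part of factordb-pycli.

```python
def summarize_response(data):
    """Turn a decoded FactorDB API reply into plain Python values.

    Assumes the layout seen in this thread: "factors" is a list of
    [base, exponent] pairs, with the base given as a decimal string.
    """
    factors = [(int(base), int(exp)) for base, exp in data["factors"]]
    return {
        "id": data["id"],
        "status": data["status"],
        "is_ff": data["status"] == "FF",  # 'FF' is the status reported once factors are known
        "factors": factors,
    }


# The reply quoted above, written as an ordinary dict.
sample = {
    "status": "C",
    "id": "1100000001115460681",
    "factors": [[
        "72532046707394527184361122364055313434377630022988043070962959961740902025113107530697959132646327596976526693065378307",
        1,
    ]],
}

summary = summarize_response(sample)
```

With a 'C' reply like this one, the only "factor" listed is the number itself with exponent 1, which is a convenient way to spot that nothing has been factored yet.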
Old 2018-04-02, 05:41   #7
shortcipher
 
Mar 2018

17 Posts

More explicitly, with just this Python code:-

Code:
>>> import requests
>>> requests.get("http://factordb.com/api", params={"query": str(72532046707394527184361122364055313434377630022988043070962959961740902025113107530697959132646327596976526693065378307)}).json()
{u'status': u'C', u'id': u'1100000001115460681', u'factors': [[u'72532046707394527184361122364055313434377630022988043070962959961740902025113107530697959132646327596976526693065378307', 1]]}
Old 2018-04-02, 06:26   #8
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·2,399 Posts

Those both show FF for me in Python.

I'm not sure what precisely is going on. However, I can make an educated guess.

Going back to the number you gave in post 4: http://factordb.com/index.php?id=1100000001115431883

Clicking "More information" shows that this number was added approximately 15 hours ago, perhaps 12 hours before you posted that it still appeared as "C". However, the factors are listed as found in early February. So what sort of numbers are these? Are they small multiples of other numbers that are already fully factored in the DB?

As for your Python, you said "even after many attempts". How long do you sleep between requests? The FDB is notoriously slow about such things: when a number is queried for the first time, it often takes several seconds for everything to be fully processed. So if your retries have no sleep between them, and your "many attempts" all fall within 1-2 seconds of each other, that is not enough time for the DB to finish processing the new number, and you get C. By the time you check in your browser, several (if not dozens or hundreds of) seconds later, the processing has completed and everything shows normally.

The page I mentioned in passing before, the aliquot sequence status page, queries the status of 100 sequences per half hour. It first queries the known ID of the current line; if that has changed to fully factored, it uses the aliquot sequence page on the FDB to get the latest line, which is then itself processed and stored.

Occasionally, when a sequence has had new lines added but no one has queried them, the processing of those lines is deferred until someone actually triggers the sequence page on the FDB. The blue page's automated query is often that first trigger, so the query about the last unfactored line turns up garbage for several seconds while the newly added lines are processed. This is analogous to you querying these small-multiples-of-fully-factored numbers for the first time, which turns up a C.

In the rare cases where these aliquot sequence queries get garbage, the solution is simple: sleep 5 seconds before trying again. The 5 second sleep is critical to give FDB time to finish processing the stuff that we triggered. After 5 retries, it quits with an error for that sequence and moves on. On average, perhaps one sequence out of those hundred per half hour runs into this garbage scenario. Of those, 99 in 100 are fine after up to 25 seconds of querying and sleeping, while roughly 1 in 10,000 sequences still gets garbage even after 5 slow retries. Of that 1 in 10,000, literally none ever cause a problem when queried again in the next half-hourly batch.
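
The sleep-and-retry scheme described above can be sketched roughly as follows. This is not the status page's actual code; the function name `query_with_retries` and its parameters are illustrative assumptions, and `fetch` stands for whatever call you use to get a decoded reply with a `status` key.

```python
import time


def query_with_retries(fetch, retries=5, delay=5.0, sleep=time.sleep):
    """Call fetch() until the reply has status 'FF', pausing between attempts.

    fetch:   zero-argument callable returning a dict with a "status" key
             (for example a wrapper around a request to the FactorDB API).
    retries: maximum number of attempts (5, as in the post above).
    delay:   seconds to wait between attempts (5, as in the post above).
    sleep:   injectable so the loop can be exercised without really waiting.
    Returns the last reply seen, whether or not it reached 'FF'.
    """
    last = None
    for attempt in range(retries):
        last = fetch()
        if last.get("status") == "FF":
            return last
        if attempt < retries - 1:
            sleep(delay)  # give the database time to finish processing
    return last
```

The caller decides what to do when the last reply is still not 'FF', for instance log an error and try again in the next batch, as the status page is described as doing.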

So that is what appears to be happening. If you can confirm the nature of the numbers you're querying, and whether your "many attempts" are made without any sleep between them (and, if so, add a sleep), that should clear this up.

(As for how I'm querying the FDB: per the above link, it's just some crappy hand-rolled parsing of the literal HTML webpage. I didn't even know there was an API. I first started this page many years ago, and at the time it occurred to me that there was probably some value in a dedicated Python-FDB interface package, but I never got around to actually doing anything about it. In the months-ago rewrite I just slightly cleaned up and reorganized the simple HTTP/HTML handling stuff that had already been in use for years.)
Old 2018-04-02, 07:00   #9
shortcipher
 
Mar 2018

17 Posts

The numbers are taken from an aliquot sequence project I am running and don't necessarily relate to any existing factors in the database.

I was doing 6 attempts with 10 seconds of sleep between them.

I suspect that the numbers which later became FF (including the latest one 72532046707394527184361122364055313434377630022988043070962959961740902025113107530697959132646327596976526693065378307) did so after you accessed them with your HTML-based request. So I'll keep the latest one secret and check if it ever comes good.

I'm pretty sure that if I accessed the database with
Code:
requests.get('http://www.factordb.com/index.php?query=%s' % str(n))
it would return FF immediately.
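
A rough way to test that suspicion is to request the HTML page first and only then ask the API. The sketch below uses the two URLs from this thread but swaps in the Python 3 standard library instead of requests; the function names are made up, and whether the page visit really triggers any processing is a guess, not a documented behaviour.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PAGE_URL = "http://www.factordb.com/index.php"  # HTML page, as in the post above
API_URL = "http://factordb.com/api"             # JSON endpoint used earlier in the thread


def _read(url):
    """Fetch a URL and return its body as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def status_after_page_visit(n, read=_read):
    """Visit the HTML page for n, then return the status the API reports.

    read is injectable so the function can be tried out without network access.
    """
    query = urlencode({"query": str(n)})
    read(PAGE_URL + "?" + query)  # HTML is discarded; the visit itself is the point
    reply = json.loads(read(API_URL + "?" + query))
    return reply["status"]
```

If the API still says 'C' immediately after the page visit, it would be worth waiting a few seconds and asking again before drawing any conclusion.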

The question is, who is responsible for the API at factordb.com/api? I think this is where the problem lies.
Old 2018-04-02, 08:29   #10
DukeBG
 
Mar 2018

3·43 Posts

I've seen small factors (below 8-9 digits or so) get found and added when the page is served in a browser. I imagine there are some calls in /index.php that are simply missing from /api.

Last fiddled with by DukeBG on 2018-04-02 at 08:31
Old 2018-04-02, 13:23   #11
10metreh
 
 
Nov 2008

2·3^3·43 Posts

Quote:
Originally Posted by shortcipher
The numbers are taken from an aliquot sequence project I am running and don't necessarily relate to any existing factors in the database.
The thing is, they do.

It appears that you are computing the aliquot sequence of 2^(p-1)*(2^p-1)*3 for some Mersenne prime 2^p-1. If a has no common factors with 2^(p-1)*(2^p-1), then σ(2^(p-1)*(2^p-1)*a) = σ(2^(p-1)*(2^p-1))*σ(a) = 2^p*(2^p-1)*σ(a). Hence if 2^(p-1)*(2^p-1)*a is a term in your aliquot sequence, then the next term is 2^(p-1)*(2^p-1)*(2σ(a)-a).
But the value 2σ(a)-a does not depend on p, so for each Mersenne prime, 2^(p-1)*(2^p-1)*3 has essentially the same aliquot sequence, with just the power of two and the Mersenne prime differing. The pattern only breaks when a term happens to have a second factor of 2^p-1, i.e. the value a above has a common factor with 2^(p-1)*(2^p-1). This is very unlikely for large p.

For 2^13-1 and all greater Mersenne primes, we have not yet found such a term, so all of these sequences are still on the same trajectory. Thus when we compute a new term of the aliquot sequence of 2^12*(2^13-1)*3, we get a new term of the sequence of 2^(p-1)*(2^p-1)*3 for all larger Mersenne primes.

The aliquot sequence of 2^12*(2^13-1)*3 (http://factordb.com/sequences.php?se...t20&fr=1&to=20) is known up to index 863; the factors you posted earlier in the thread come from terms 817 and 818, so whichever Mersenne prime you were actually using, you were in fact redoing work that has already been done.
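
The identity above is easy to check numerically for small cases. The following is only an illustrative sketch: it uses naive trial division for σ, so it is suitable for small p, and the helper names are made up.

```python
from math import gcd


def sigma(n):
    """Sum of all positive divisors of n, by trial division (small n only)."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def aliquot_next(n):
    """Next term of the aliquot sequence: sum of proper divisors of n."""
    return sigma(n) - n


def check_identity(p, a):
    """Check s(P*a) == P*(2*sigma(a) - a) for P = 2^(p-1)*(2^p-1), gcd(a, P) == 1."""
    perfect = 2 ** (p - 1) * (2 ** p - 1)
    assert gcd(a, perfect) == 1, "the identity needs a coprime to P"
    return aliquot_next(perfect * a) == perfect * (2 * sigma(a) - a)


# Follow a -> 2*sigma(a) - a starting from a = 3 for a few Mersenne prime exponents.
results = []
for p in (5, 7, 13):
    a = 3
    for _ in range(6):
        results.append(check_identity(p, a))
        a = 2 * sigma(a) - a
```

The cofactors visited here are 3, 5, 7, 9, 17, 19 regardless of p, which is the point of the argument: only the perfect-number part of each term changes from one Mersenne prime to the next.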

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.