mersenneforum.org  

2015-05-17, 19:44   #518
Madpoo
Serpentine Vermin Jar
Jul 2014
3,313 Posts
Announcement: Daily XML files of results

I recently set up a daily XML file of all of the results that came in. Same info you'd get from the "Recent Results" page, but in a nice XML format, concise, and comprehensive with all of the stuff for that 24-hour period.

If you're interested (and I'll have to announce this separately and maybe put some links somewhere on the site), they're available via:
http://www.mersenne.org/result_archive/<YYYY>/<YYYY-MM-DD>.xml.bz2

There are also annual aggregated files:
http://www.mersenne.org/result_archive/<YYYY>.7z

Of course you can figure out that <YYYY> in the path is the 4-digit year. I've broken down the daily XML stuff into subdirectories by year, and then each filename is like "2015-05-16.xml.bz2"

Ergo, as an example:
http://www.mersenne.org/result_archi...-05-16.xml.bz2
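
For anyone scripting against the archive, the path layout described above could be sketched in Python roughly like this. It only builds the URL strings and fetches nothing; the function names `daily_url` and `annual_url` are made up for this illustration.

```python
from datetime import date

BASE = "http://www.mersenne.org/result_archive"  # base path as given in the post


def daily_url(day: date) -> str:
    """Build the URL of the daily results file for one calendar day."""
    # Daily files sit in a per-year subdirectory and are named YYYY-MM-DD.xml.bz2.
    return f"{BASE}/{day.year:04d}/{day.isoformat()}.xml.bz2"


def annual_url(year: int) -> str:
    """Build the URL of the aggregated archive for one year."""
    return f"{BASE}/{year:04d}.7z"


print(daily_url(date(2015, 5, 16)))
print(annual_url(2014))
```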

The data starts from /1997/1997-11-11.xml.bz2 and goes up from there. Just know that until August 2007 the log data is spotty: there will be days with no activity, and the results were all rolled up weekly (this relates to the task we just did to integrate those old logs).

So if you crawl a bunch of old stuff, there will be XML files from 1997-2007 where the file is there but there's no data.

If someone were to grab the 1997-2014 aggregated 7z files, it's about 700 MB. There is a 2015.7z file but I think it's only aggregated up to May 2nd when I set all this up.
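
A crawler has to cope with those empty files. Here is a minimal Python sketch of unpacking one downloaded daily file, under some assumptions: the helper name `parse_daily_file` is hypothetical, and the element names in the demo are placeholders, not the real schema of the archive. It is also not stated whether an empty day is a zero-byte file or a root element with no children, so this handles the zero-byte case and leaves the other to the caller.

```python
import bz2
import xml.etree.ElementTree as ET
from typing import Optional


def parse_daily_file(compressed: bytes) -> Optional[ET.Element]:
    """Decompress one downloaded .xml.bz2 payload and return its XML root.

    Returns None when the file holds no data, as can happen for
    days between 1997 and 2007.
    """
    xml_bytes = bz2.decompress(compressed) if compressed else b""
    if not xml_bytes.strip():
        return None  # file exists but is empty
    return ET.fromstring(xml_bytes)


# Demo on made-up data; "results" and "result" are placeholder names only.
sample = bz2.compress(b"<results><result/></results>")
root = parse_daily_file(sample)
print(root.tag, len(root))
print(parse_daily_file(bz2.compress(b"")))
```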

2015-05-17, 20:08   #519
James Heinrich
"James Heinrich"
May 2004
ex-Northern Ontario
11·311 Posts

Quote:
Originally Posted by Madpoo
I recently set up a daily XML file of all of the results that came in.

I for one really appreciate the work you put into making this possible. When I get back from vacation I'll have to rework my spiders to chew on this rather than the smattering of other data that's available elsewhere. Maybe, finally, after many years, mersenne.ca can be (mostly) in sync with mersenne.org.

2015-05-17, 20:15   #520
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
2⁴×3×157 Posts

Quote:
Originally Posted by Madpoo
The data starts from /1997/1997-11-11.xml.bz2 and goes up from there. Just know that until August 2007, the log data is spotty in the sense that there will be days with no activity, and then it was all rolled up weekly.

Also be aware that the data is incomplete. Any results manually reported to me by email are not included. Also, not all primenet v5 TF history on exponents above 100M is available.

That said, I'd guess a good 95+% of data is available.

2015-05-17, 20:32   #521
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados
10011000100111₂ Posts

Quote:
Originally Posted by James Heinrich
I for one really appreciate the work you put into making this possible.

Seconded! Thanks very much! It's going to make fixing mersenne.info a /whole/ lot easier!!!

2015-05-17, 21:47   #522
Madpoo
Serpentine Vermin Jar
Jul 2014
3,313 Posts

Quote:
Originally Posted by Prime95
Also be aware that the data is incomplete. Any results manually reported to me by email are not included. Also, not all primenet v5 TF history on exponents above 100M is available.

That said, I'd guess a good 95+% of data is available.

Ah, yes, good caveat.

While historical data may be spotty, I *think* these XMLs pull from the same source as the hourly "recent results" anyway, so if you're looking in there to snag the latest and greatest, then at the very least you could just get the daily XML instead of checking that report every hour and parsing it.

2015-06-13, 09:17   #523
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
1100001010000₂ Posts
UTF-8/Unicode/encoding is hard

I notice some users show up as blank in the primenet report.

User: André Jordi
http://www.mersenne.org/assignments/...xp_hi=36608483

User: La Güira
http://www.mersenne.org/assignments/...xp_hi=57811861

The common factor appears to be the presence of accented characters.

2015-06-13, 11:09   #524
James Heinrich
"James Heinrich"
May 2004
ex-Northern Ontario
11×311 Posts

Quote:
Originally Posted by retina
The common factor appears to be the presence of accented characters.

I have fixed the issue in the Assignments page.

For Aaron/George: the pages are set (HTML meta header) to UTF-8. Usernames at least appear to be stored in the database as latin1 / ISO-8859-1. htmlspecialchars only fixes & " ' < > characters. I changed the htmlspecialchars call to htmlentities, but (since PHP v5.4) the default input character set is UTF-8, so you need to specify that the input is ISO-8859-1. This problem likely exists elsewhere; you can copy the working code from here as needed.
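
The mismatch can be reproduced outside PHP. The sketch below is in Python rather than the site's PHP, and the helper name `to_html` is hypothetical, so take it as an illustration of the idea and not as the code running on the page. It shows that Latin-1 bytes for an accented name are not valid UTF-8, and that decoding with the right charset before escaping gives entity output similar in spirit to calling htmlentities with ISO-8859-1 specified.

```python
import html

raw = "André Jordi".encode("latin-1")  # bytes as a Latin-1 database would hold them

# Treating Latin-1 bytes as UTF-8 fails: 0xE9 is not a valid UTF-8 sequence here.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False


def to_html(name_bytes: bytes, source_charset: str = "latin-1") -> str:
    """Decode with the declared source charset, then escape for HTML output."""
    text = name_bytes.decode(source_charset)
    escaped = html.escape(text)  # handles & < > " '
    # Turn remaining non-ASCII characters into numeric character references.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(valid_utf8)
print(to_html(raw))
```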

edit: I don't think I broke it, but mersenne.org is returning 404 for everything for me right now...
edit²: it's back
edit³: for Aaron/George, I also cleaned up the PHP code on this page a bit while I was in there.

Last fiddled with by James Heinrich on 2015-06-13 at 12:04

2015-06-13, 18:32   #525
Madpoo
Serpentine Vermin Jar
Jul 2014
6361₈ Posts

Quote:
Originally Posted by James Heinrich
I have fixed the issue in the Assignments page.

For Aaron/George: the pages are set (HTML meta header) to UTF-8. Usernames at least appear to be stored in the database as latin1 / ISO-8859-1. htmlspecialchars only fixes & " ' < > characters. I changed the htmlspecialchars call to htmlentities, but (since PHP v5.4) the default input character set is UTF-8, so you need to specify that the input is ISO-8859-1. This problem likely exists elsewhere; you can copy the working code from here as needed.

edit: I don't think I broke it, but mersenne.org is returning 404 for everything for me right now...
edit²: it's back
edit³: for Aaron/George, I also cleaned up the PHP code on this page a bit while I was in there.

I can't remember the details, but I made some changes on other pages to make sure those accented characters show up properly in reports. The assignments page might be one I didn't get around to updating.

2015-06-13, 18:50   #526
Madpoo
Serpentine Vermin Jar
Jul 2014
CF1₁₆ Posts

Quote:
Originally Posted by Madpoo
I can't remember the details, but I made some changes on other pages to make sure those accented characters show up properly in reports. The assignments page might be one I didn't get around to updating.

I just looked up how I did it on other pages (like /report_exponent/) ... when grabbing the user name I do this:
'user' => utf8_encode($row['blah blah sql column']),

Then it's utf8 for whatever else PHP wants to do with "user".
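
For comparison, the same conversion can be sketched in Python (not the site's PHP; the function name `latin1_to_utf8` is made up here). It re-encodes ISO-8859-1 bytes as UTF-8, which is roughly what utf8_encode does to the column value.

```python
def latin1_to_utf8(value: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8, roughly what PHP's utf8_encode does."""
    return value.decode("latin-1").encode("utf-8")


stored = "La Güira".encode("latin-1")   # one byte for the accented letter
converted = latin1_to_utf8(stored)      # two bytes for it in UTF-8
print(len(stored), len(converted))
print(converted.decode("utf-8"))
```

A conversion like this should be applied exactly once per value: running it on text that is already UTF-8 would double-encode the accented characters.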

I think at the time I looked for other pages where it might pull the user name and somehow I must have missed the assignment page... seems like it pulls that info a little differently... I don't know PHP much so I'm not surprised.

2015-06-15, 03:09   #527
LaurV
Romulan Interpreter
Jun 2011
Thailand
7²·197 Posts

A little bit of a "side" issue: the GIMPS user description on Caldwell's utm.edu page is badly outdated. C'mon man, only 5 primes?
Additionally, it has some typos (like "ober" instead of "over"; didn't know George to be German).
Some "dates" are missing (you know, wikipedia always adds a small superscript "[when?]" after your text, when the text refers to unclear time or timing), and the number of computers is also outdated... It should say something like "18 primes discovered and 2 million computers participating, at the end of 2015"

Maybe someone can login there and update the text...

2015-06-20, 13:03   #528
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
2⁴×389 Posts
User name mismatches

In this example there has been one/two LL test(s) reported. In the two sections that show this/these LL result(s) the user names are different, namely "Carlo Monari" and "ANONYMOUS" appear to have reported the same residue on the same exponent but managed to insert their results into different sections of the DB. User "ANONYMOUS" has a date attached, but user "Carlo Monari" reported the result without any date.

Perhaps these two users live in a superposition of both entities combined. A gestalt of some unknown characteristic. Reporting a result collapses this gestalt into either one, or the other, but never both at the same time.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.