mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Software

Reply
 
Thread Tools
Old 2007-07-04, 06:29   #1
jasong
 
jasong's Avatar
 
"Jason Goatcher"
Mar 2005

3×7×167 Posts
Default Archiving the halfway point in a primality test.

Hard drive space is getting super-cheap. I went to tigerdirect.com, looked for the biggest hard drive on the main list(750GB) and did a little math. The 750GB hard drive sells for about $240.00. That about 33.5MB for a penny. A Prime95 first-pass prime file is about 6-7MB. So you could store about 4-5 checkpoint files in a penny's worth of space.

Here's my idea, take it or leave it. I haven't actually crunched the project in a while, this is just me brainstorming. First, take each exponent that's about to be done and divide it by some number, like 75,000,000 or 50,000,000. Round this number up. This is the number of checkpoints you would do. So, if you were doing 42,000,000, your answer would be .84, which rounds up to 1. Add 1 to this number, AFTER the rounding. With my example, we have 2. So you want to mentally divide the exponent by 2(in this case 21,000,000). So you'd have exactly 1 checkpoint at 21,000,000. When the checkpoint is reached, the checkpoint file, along with the checksum for that file, gets backed up to a server to be saved for when someone does a double-check on that number.

But here's where the "magic" is. What if someone runs the doublecheck, and they DON'T match up at the halfway point? Normally, they wouldn't even realize they don't match up, and they'd run it to the end, and nobody would have a clue which one was right, so it would take a minimum of three tests to straighten things out. But now, we have the halfway point of these two tests. So we store both of them in the database, and start a third test from the very beginning, while the first test has been run fully, the second test is only halfway but can be continued at any time, and the triple check. When the triple check gets the checkpoint, a comparison is made, and hopefully it matches up to one of the other two tests. Whichever one it matches up to, we know the other one is bogus and can discard it. If it matches the first one, then the third test is continued and hopefully matches the first one at the end. If it ends and doesn't match the first one, we can restart from the halfway point because we know for a fact that it was correct to the halfway point. If the third test gets to the halfway point and it matches the second, then the first test is discarded, and BOTH tests are taken up by two different people, starting from the save file.

For those reading this, you either understand what I'm saying and are able to mentally fill in the rest, or you're confused. Well, at the moment, it's very late at night and I need to get to bed. I'll be able to respond to this topic in about 14 hours, after I get home from work.
jasong is offline   Reply With Quote
Old 2007-07-04, 12:16   #2
Mini-Geek
Account Deleted
 
Mini-Geek's Avatar
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts
Default

I like the idea, but a file that big is a lot to be uploaded and downloaded, especially if the person is on dial-up. My current exponent is in the 38xxxxxx's, and its backup files are 8 MB, but can easily compress (.7z Fast compression, taking 6 seconds on an Athlon XP 1800+@1610 MHz) to ~5.15 MB. I am a gamer and am on a shared internet connection, and I sure wouldn't want it to start uploading or downloading a 5 MB file while I'm in a game or browsing the web...uploading or downloading while browsing makes it go very noticeably slower, and completely kills the game.
For a person on dial-up, it would take over 17 minutes of being connected to be able to upload or download a 5 MB file, along with having the above problems.
For someone not connected (might get their work through manual form on one connected computer), this idea becomes almost completely unfeasible. ("almost" because it would be possible to transfer the file to the connected computer, and have that computer upload it)

I like the idea in theory, but there's just too many problems with it to be able to implement it.
Also, the benefits aren't very large...in the ~1-5% (would it double by having two computers then halve because it's at the halfway point? I'm not sure, not very knowledgeable about probability) of exponents that have errors, it would only eliminate a part of the extra time of a triple-check (if I understand it correctly).
Mini-Geek is offline   Reply With Quote
Old 2007-07-04, 12:42   #3
Xyzzy
 
Xyzzy's Avatar
 
"Mike"
Aug 2002

203516 Posts
Default

Why send the save file?

Just send the interim checksums after every million iterations.
Xyzzy is offline   Reply With Quote
Old 2007-07-04, 13:12   #4
hhh
 
hhh's Avatar
 
Jun 2005

1011101012 Posts
Default

Quote:
Originally Posted by Xyzzy View Post
Why send the save file?

Just send the interim checksums after every million iterations.
Imagine the imterim checksums match, the last one doesn't. You still need a full triple test. If you have the file, you save half a test.

BUT: the file isn't needed the first time. If we start saving the checksums now, in a few years, when doublecheck is performed, bandwidth will no more be a matter, and then the files can be saved.

Right?

Yours H.
hhh is offline   Reply With Quote
Old 2007-07-04, 13:27   #5
akruppa
 
akruppa's Avatar
 
"Nancy"
Aug 2002
Alexandria

9A316 Posts
Default

The idea has been discussed on the Mersenne mailing list a long time ago. Send 64 bit residues to the server periodically, say every 1M iterations, during the first time test. During the double-check compare these residues and if a residue doesn't match, we know that at least one run must have had an error. A third test can be started to see which one (if any) of the first two tests was correct. If (1) the first and third run's residue agrees, carry on with the third run and compare residues against the first. If (2) the third and second run's residue agrees, continue both. If (3) none of the residues agree, scrap everything and start over from the start with a new first test, perhaps with a higher FFT length.

The advantage is that in case (1), the erroneous double check can be aborted early.

From what I remember, everyone agreed that it was a good idea, but afaik it never got implemented in the server.

Alex
akruppa is offline   Reply With Quote
Old 2007-07-04, 20:01   #6
jasong
 
jasong's Avatar
 
"Jason Goatcher"
Mar 2005

1101101100112 Posts
Default

Even if this idea is only partially implemented(maybe people on phone lines would only upload residues and people with faster lines would upload the whole file), it would still be very useful. Assuming 1 in 100 tests gets an error, that's about a nickel's cost in hard drive space to save a few dollars worth of electricity.

Edit: correction, about 25 cents worth of hard drive space. That's what, 10-15 Euros?

Edit2: Damn, apparently I need to practice mentally applying decimal points.

Last fiddled with by jasong on 2007-07-04 at 20:59
jasong is offline   Reply With Quote
Old 2007-07-04, 20:21   #7
tha
 
tha's Avatar
 
Dec 2002

5·163 Posts
Default

Quote:
Originally Posted by jasong View Post
Edit: correction, about 25 cents worth of hard drive space. That's what, 10-15 Euros?
25 US dollar cents is about 20 Euro cents
tha is offline   Reply With Quote
Old 2007-07-04, 20:34   #8
tha
 
tha's Avatar
 
Dec 2002

5·163 Posts
Default

I proposed this idea half a decadeago or so. I suggested it because there is a substantial amount of exponents reported to the server to have been done for 50, 80% or more and than not completed. These exponents then get reassigned. As far as I remember the reports of the v4 server may not reflect the reality on this point and uploading all intermediate results was considered too much traffic.

After the v5 server has had it's first year of operation it may be worth to reconsider, now that most pc's are connected via ADSL.
tha is offline   Reply With Quote
Old 2007-07-07, 10:38   #9
davieddy
 
davieddy's Avatar
 
"Lucan"
Dec 2006
England

2·3·13·83 Posts
Default

We are talking 40 million bits here (=5MB) aren't we? Surely this is
worth doing these days.
The only questions are:
a) How often?
b) By whom?

David

Last fiddled with by davieddy on 2007-07-07 at 10:57
davieddy is offline   Reply With Quote
Old 2007-07-07, 11:08   #10
davieddy
 
davieddy's Avatar
 
"Lucan"
Dec 2006
England

194A16 Posts
Default

Quote:
Originally Posted by akruppa View Post
...
If (3) none of the residues agree, scrap everything and start over from the start with a new first test, perhaps with a higher FFT length.
...
From what I remember, everyone agreed that it was a good idea, but afaik it never got implemented in the server.

Alex
Re FFT size, the boundaries are on the
conservative side of optimal: George et al
don't want the error rate (due to FFT size) to
approach 20%, because "I don't want to tell
someone he missed a prime because of an avoidable
error"

I sympathise with his point of view.

David

Last fiddled with by davieddy on 2007-07-07 at 11:25 Reason: I sympathise = I give up arguing this topic :)
davieddy is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
Modifying the Lucas Lehmer Primality Test into a fast test of nothing Trilo Miscellaneous Math 25 2018-03-11 23:20
there is another way to test the primality of a no shawn Miscellaneous Math 5 2007-07-17 17:55
N-1 primality test Citrix Math 3 2005-09-19 15:06
AKS Primality Test Guilherme Math 2 2004-11-26 05:29
A primality test for Fermat numbers faster than Pépin's test ? T.Rex Math 0 2004-10-26 21:37

All times are UTC. The time now is 22:51.


Fri Aug 6 22:51:43 UTC 2021 up 14 days, 17:20, 1 user, load averages: 4.08, 4.10, 3.92

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.