mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Software

Reply
 
Thread Tools
Old 2007-10-17, 02:08   #1
lavalamp
 
lavalamp's Avatar
 
Oct 2007
London, UK

22·3·109 Posts
Default Sieving Software

Hey all, hope I find you well. If you could spare some time for a newbie, I'd much appreciate it. I've only been at this a couple of days so I've got a few, what I assume to be, quick questions about sieving software. I'm afraid that while I was able to learn a lot via Google, there are some things I haven't quite grasped yet.

I've been using NewPGen to sieve a couple of ranges (b.2^n-1 fixed k), and it's really nice and easy to use, primarily due to the GUI. However, I've been reading around and it seems that apparently sr2sieve is faster.

So today I downloaded sr2sieve, and I'm pretty sure I've got it sufficiently worked out, I mean a GUI is nice, but I can handle a CLI too. There's a difference between sr2sieve and NewPGen though, NewPGen outputs a nice-and-easy-to-LLR file with remaining candidates in (and also the removed candidates and their factors in a seperate file if I tell it to), sr2sieve only outputs the factors.

So, after this long lead up, my first question is, how can I either:
* get sr2sieve to output the remaining candidates (something that I think srsieve can do, but not sr2sieve)?
or
* compare the list of factors with the input file and remove any factored numbers from the input list?

My second question is not so much to do with sieving as it is to do with checking the sieving. Suppose I have a list of factors, either from NewPGen or sr2sieve, how can I check through the list to make sure that there are no mistakes? It would be a shame if the sieve mistakenly took out a candidate that was actually prime, and I realise this is very unlikely, but better to be safe than sorry eh?

Oh, I do actually have a third question. From what I gather (at least with NewPGen (I think)), the speed at which the sieve sieves, is proportional to the square root of the number of candidates. Assuming this is true, it's therefore quicker to sieve for n = 1,000,000 to 3,000,000 rather than n = 1,000,000 to 2,000,000 and then once done n = 2,000,000 to 3,000,000.

So, it's obviously better to sieve 1 to 3 M all in one go, but suppose I wanted to sieve two ranges with different k's. It seems it would be faster to sieve them simultaneously, however I'm not sure how I would go about it.

For example, the ranges might be as follows:
3 x 2^n - 1, n = 1 to 3 M
5 x 2^n - 1, n = 1 to 3 M

Is it possible to sieve ranges for two (or more) k's and for a range of n's at the same time? And if so, how would I do it? Would it even be faster?

If you've read this far, I apologise for any eye strain I may have caused and thank-you for your time. Any light you can shine on the answers to these questions would be very well received. Have a good day and happy crunching!
lavalamp is online now   Reply With Quote
Old 2007-10-17, 16:58   #2
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

10001001010002 Posts
Default

If you are running just one k-value, sr2sieve is the wrong program. It is designed for many k's at once, for large searches like the Riesel-Sieve project. However, it *is* used for running k=3 and k=5 at the same time (more in a moment about this last question).

sr1sieve is the program you want for a single k-value. It accepts a file in NewPGen format, and outputs in the same format. It automatically removes factors found from the file, addressing your initial two questions. If you run sr1 with the -f "factorsfoundfile.txt" option (don't use quotes on command line), it will also output a factors file (useful if you split sieving across two machines- you only need to copy the factors-found file back to original machine to merge).

Back to your questions about sr2. sr2 is designed for very large projects, split over many users and many machines. Since factors-found files are the most efficient way to report work complete, that is the default method for sr2 output. To use this info to trim your candidate file, you need srfile.

srfile has quite a few options to convert from one form to another, remove factors, etc. use -k to remove factors, -a to convert a list of NewPGen files into a single ABCD file for sr2 use:
srfile -a "3.txt" "5.txt" will create an sr2.abcd file, containing BOTH sieves, ready for sr2 processing.

the -g option converts a file back to LLR/NewPGen format for testing. if you apply -g to the sr2.abcd mentioned above, multiple single-k files will be created, one for each k in the big sr2 sieve.

sr1 is your easiest solution, though. given a newpgen file (say 3.txt), usage:
sr1sieve -i 3.txt -o 3.txt -f 3factors.txt -v (verbose, to see more of what the program is doing for curiosity) -P 3e12
3e12 is the ending p-value to sieve to. This is necessary- unlike NewPGen, you cannot just turn it on and let it run. If you would prefer this, set -P to something like 1e14!

You may set a starting p-value with lowercase p as a flag. This overrides the top line in the sieve file. If you don't set a -o outputfile.txt, the factors will not be removed from the input file. Common usage is to make -o and -i refer to the same file.

Finally, you'll find that sr1 is much faster than sr2 for a single k value.
-Curtis
VBCurtis is online now   Reply With Quote
Old 2007-10-18, 15:12   #3
lavalamp
 
lavalamp's Avatar
 
Oct 2007
London, UK

22·3·109 Posts
Default

Thanks for you response. Following your suggestion I downloaded srsieve and ran a few test sieves just to get a feel for it.

I found something that I'm not quite sure what to make of though. First up I tested the range 27*2^n-1 for n=0 - 100 and p up to 10,000,000. Then I tested the outputted .npg file for p up to 20,000,000.

Then I sieved the same range again with a new command on the command line, but for p up to 1,000,000,000, and it found a factor for n=50 p=16,043,893 that it didn't find before.

I tried a couple of different ranges and found that when continuing on from a previous file, it seemed to not remove some candidates that are composite. I've uploaded a screen shot of the DOS box and added some colour coding to make the appropriate sections stand out.

I can only think that I'm making some mistake, but I can't think what. I suppose the most obvious one would be that I'm simply feeding it the wrong npg file, but to test that theory I cleared the folder of all files other than srsieve.exe and therefore the only npg file in there was the one it created on the first run.

I also tried the same tests without the --newpgen flag, the results in srsieve.out were the same.
Attached Thumbnails
Click image for larger version

Name:	mismatch.png
Views:	139
Size:	46.6 KB
ID:	1981  
lavalamp is online now   Reply With Quote
Old 2007-10-18, 15:46   #4
Patrick123
 
Patrick123's Avatar
 
Jan 2006
JHB, South Africa

157 Posts
Default

According to your screen shot, the factor for 27*2^50-1 is above 10^7 that's why it was not found in the prior sieve. Likewise for the factor of 27*2^80-1.
Patrick123 is offline   Reply With Quote
Old 2007-10-18, 17:53   #5
axn
 
axn's Avatar
 
Jun 2003

10010011010112 Posts
Default

Quote:
Originally Posted by lavalamp View Post
Thanks for you response. Following your suggestion I downloaded srsieve and ran a few test sieves just to get a feel for it.

[snip]

I also tried the same tests without the --newpgen flag, the results in srsieve.out were the same.
That is very interesting. I am sure geoff would be interested in getting to the bottom of this. Maybe you could send him a PM.
axn is offline   Reply With Quote
Old 2007-10-18, 18:18   #6
lavalamp
 
lavalamp's Avatar
 
Oct 2007
London, UK

22·3·109 Posts
Default

Patrick, the factor for 27*2^50-1 is a little over 16 M, but it was not found in the first search to 20 M (the second command given, about a quarter of the way down). The factor for 27*2^80-1 is 111.3 M, but it was not found in the first search to 1000 M.

axn, I'll send a PM to geoff then after writing this, but I still get the feeling that I'm probably just doing something wrong.
lavalamp is online now   Reply With Quote
Old 2007-10-18, 18:32   #7
axn
 
axn's Avatar
 
Jun 2003

126B16 Posts
Default

Quote:
Originally Posted by lavalamp View Post
but I still get the feeling that I'm probably just doing something wrong.
Maybe. But, looking at the screenshot, there wasn't anything obviously wrong. I mean, srsieve did resume from the correct p level and all. That's why I wanted geoff to look at this. Maybe there is a corner case that needs to be handled (maybe due to the very small size of the range) or something. Lotsa maybe's
axn is offline   Reply With Quote
Old 2007-10-18, 21:41   #8
jasong
 
jasong's Avatar
 
"Jason Goatcher"
Mar 2005

5×701 Posts
Default

For a lot of sieving programs, if both the p and n values are very low, there tend to be errors. For instance, if you're starting a base-2 sieve on a bunch of numbers, it's generally recommended that use NewPGen to sieve each k up to 1e9, then combine the ks and sieve up to 5e9 using srsieve. And ONLY THEN switch to sr2sieve to sieve anything higher.

I have a bad memory, so I forget what the problem values were, but it may be connected to what I just said above.

Hope this helps.
jasong is offline   Reply With Quote
Old 2007-10-20, 03:30   #9
lavalamp
 
lavalamp's Avatar
 
Oct 2007
London, UK

51C16 Posts
Default

Jason, I don't really understand why a sieve would miss factors just because the numbers involved are small. But even if sieves do miss factors sometimes, it doesn't explain why srsieve would only miss certain factors when resuming from a previous file. In any case, geoff has replied and said that he'll check into it.

To backtrack though, I didn't get an answer to my second question about checking the removed candidates. Is there some utility I could run on NewPGen.del or factors.txt to make sure that all candidates removed do actually divide by the primes the sieve found them to? Compared to the sieving programs, I would imagine that a checking utility would be a relatively simple matter, but I have been unable to turn up anything with google.

I know that NewPGen includes a function that will double check any factors that it finds, and I do use that, however it would be nice to have a separate utility to check the numbers than the one that finds them.
lavalamp is online now   Reply With Quote
Old 2007-10-20, 21:24   #10
geoff
 
geoff's Avatar
 
Mar 2003
New Zealand

13·89 Posts
Default

Quote:
Originally Posted by lavalamp View Post
I found something that I'm not quite sure what to make of though. First up I tested the range 27*2^n-1 for n=0 - 100 and p up to 10,000,000. Then I tested the outputted .npg file for p up to 20,000,000.

Then I sieved the same range again with a new command on the command line, but for p up to 1,000,000,000, and it found a factor for n=50 p=16,043,893 that it didn't find before.
This is a bug, it is fixed in srsieve version 0.6.7. Thanks for the clear report, it made the bug easy to track down. More details here.
geoff is offline   Reply With Quote
Old 2007-10-20, 23:07   #11
geoff
 
geoff's Avatar
 
Mar 2003
New Zealand

13×89 Posts
Default

Quote:
Originally Posted by lavalamp View Post
Is there some utility I could run on NewPGen.del or factors.txt to make sure that all candidates removed do actually divide by the primes the sieve found them to? Compared to the sieving programs, I would imagine that a checking utility would be a relatively simple matter, but I have been unable to turn up anything with google.
Here is a simple utility to do this: http://www.geocities.com/g_w_reynold...ck-factors.zip

If your factors are in factors.txt, running it from the shell using redirection:

check-factors < factors.txt

will print the non-trivial correct factors only. The pre-compiled executable only works for factors up to 2^62, but there is source for a GMP version that can check factors of any size.

You can also use a general-purpose calculator program to check individual factors: to check that p divides k*b^n+c, check that (k*b^n+c)%p is equal to zero.

Quote:
I know that NewPGen includes a function that will double check any factors that it finds, and I do use that, however it would be nice to have a separate utility to check the numbers than the one that finds them.
srsieve will also double-check found factors if you use the -c switch. sr1sieve and sr2sieve always double-check all factors.
geoff is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Generalized Cullen/Woodall Sieving Software rogue And now for something completely different 13 2014-12-29 19:11
Suggestion for new sieving software ATH Factoring 3 2012-04-04 13:03
new sieving software Thomas11 Riesel Prime Search 81 2011-11-05 19:16
Line sieving vs. lattice sieving JHansen NFSNET Discussion 9 2010-06-09 19:25
Software TTn PSearch 0 2004-05-04 13:16

All times are UTC. The time now is 19:15.

Sun Oct 25 19:15:23 UTC 2020 up 45 days, 16:26, 0 users, load averages: 2.27, 2.00, 1.79

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.