
mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Proth Prime Search (https://www.mersenneforum.org/forumdisplay.php?f=109)
-   -   Question about sieving (https://www.mersenneforum.org/showthread.php?t=25083)

David703 2020-01-06 13:36

Question about sieving
 
Hi, I'm currently using newpgen to sieve a range for Proth primes, but I suspect the software that PrimeGrid uses (sr2ppsieve; in my case I would use the CUDA version) would be much faster. Am I right? If so, where can I download that program?

rogue 2020-01-06 14:02

[QUOTE=David703;534401]Hi, I'm currently using newpgen to sieve a range for Proth primes, but I suspect the software that PrimeGrid uses (sr2ppsieve; in my case I would use the CUDA version) would be much faster. Am I right? If so, where can I download that program?[/QUOTE]

What range are you sieving? Are you sieving fixed k or fixed n? I just want to see whether you are using the optimal software, as newpgen is much slower than many other sieving programs, depending on what you are sieving.

David703 2020-01-06 14:18

Hi! I was thinking of getting into prime testing outside the big projects (GIMPS and PrimeGrid), so I was just experimenting a little with random values.

I thought that the only possible sieving was with fixed n, so I was doing n=4156789 and k between 75000 and 95000 (as I said, completely random). What ranges should I sieve, and with which program? I'm completely new to this and had no idea that the optimal sieving program depended on the range.
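For intuition, fixed-n sieving boils down to this: for each small prime p, solve k*2^n + 1 ≡ 0 (mod p) for k, then strike out that entire residue class of k. A toy Python sketch of the idea (illustration only; newpgen and the sr*sieve family use far better algorithms and sieve to much deeper p):

```python
# Toy fixed-n sieve: remove every k in [kmin, kmax] for which k*2^n + 1
# has a prime factor <= pmax.  Requires Python 3.8+ for pow(2, -n, p).
def small_primes(limit):
    s = bytearray([1]) * (limit + 1)
    s[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if s[i]:
            s[i * i :: i] = bytearray(len(s[i * i :: i]))
    return [i for i in range(limit + 1) if s[i]]

def toy_sieve(n, kmin, kmax, pmax):
    survivors = set(range(kmin, kmax + 1))
    for p in small_primes(pmax):
        if p == 2:
            continue                      # k*2^n + 1 is always odd
        k0 = (-pow(2, -n, p)) % p         # solve k ≡ -(2^n)^(-1) (mod p)
        first = kmin + (k0 - kmin) % p    # first k >= kmin in that class
        for k in range(first, kmax + 1, p):
            if k * 2 ** n + 1 != p:       # never remove a candidate equal to p
                survivors.discard(k)
    return sorted(survivors)
```

For example, toy_sieve(4, 1, 20, 10) keeps only the k for which k*2^4 + 1 has no prime factor below 10; everything divisible by 3, 5, or 7 is crossed out in whole residue classes.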

rogue 2020-01-06 21:20

[QUOTE=David703;534405]Hi! I was thinking of getting into prime testing outside the big projects (GIMPS and PrimeGrid), so I was just experimenting a little with random values.

I thought that the only possible sieving was with fixed n, so I was doing n=4156789 and k between 75000 and 95000 (as I said, completely random). What ranges should I sieve, and with which program? I'm completely new to this and had no idea that the optimal sieving program depended on the range.[/QUOTE]

My recommendation is to get your feet wet with a smaller project, i.e. one with fewer participants. You will have less software to choose from and can communicate directly with the person maintaining that project. The project maintainer might also give you some ideas for a small range to test until you are comfortable with the software.

The simplest ones that quickly come to mind are the [URL="http://guenter.loeh.name/gc/status.html"]Generalized Cullen[/URL] and [URL="http://harvey563.tripod.com/"]Generalized Woodall[/URL] searches. I call these "simplest" because the form of the prime is easy to identify and has no special symbols. The search spaces are also fairly small compared to other projects, so one quickly gets into Top 5000 prime territory.

Next up would be [URL="http://www.noprimeleftbehind.net/Carol-Kynea-prime-search.htm"]Carol/Kynea[/URL] or [URL="http://mfprimes.cba.pl/"]Multifactorial[/URL]. These forms are a little harder to understand, as they use symbols beyond the +, -, ^, and * operators.

From there you find projects with more participants, which means you are more at risk of poaching or of searching ranges that have already been searched. These include the Proth Prime Search, the Riesel Prime Search, and CRUS. The Riesel and Sierpinski forms have many searchers and multiple projects, depending on their focus. The search space is much larger than in most other projects.

For most of the smaller projects you are likely to use a program built upon the mtsieve framework. For full disclosure, I wrote mtsieve and have assisted more prime-searching projects here and at PrimeGrid than most others. I have not participated in GIMPS, and my participation in PrimeGrid projects occurred before PrimeGrid grew to its current size.

Once you have spent some time on smaller projects and have hopefully made some contributions, you should have more confidence to contribute to the larger projects without anyone denigrating any of your work.

bur 2020-08-30 16:21

Just for fun I wanted to do some manual hunting. To keep my CPUs on PrimeGrid, I want to do it with the GPU only. It appears Proth primes are good candidates, since both sieving and primality-testing software is available for GPUs.


I want to have a fixed k (in 10^6 range far from PG's range) and sieve for n values around 10^6. Is sr2sieve the best choice?


For primality testing the sieved candidates I want to use proth20, anyone used that before?

rogue 2020-08-31 00:23

[QUOTE=bur;555462]Just for fun I wanted to do some manual hunting. To keep my CPUs on PrimeGrid, I want to do it with the GPU only. It appears Proth primes are good candidates, since both sieving and primality-testing software is available for GPUs.

I want to have a fixed k (in 10^6 range far from PG's range) and sieve for n values around 10^6. Is sr2sieve the best choice?

For primality testing the sieved candidates I want to use proth20, anyone used that before?[/QUOTE]

You would have to test with proth20 to see how well it works. Then you would have to run some of the same tests with LLR to see which one is faster.

bur 2020-08-31 06:59

[QUOTE]Then you would have to run some of the same tests with LLR to see which one is faster.[/QUOTE]Is there anything worthwhile other than LLRCuda?


Do you have any recommendation regarding newpgen vs sr2sieve? Or should I just try both?


Also most Proth searches seem to focus on small k. Not just k < 2^n, but k <<< 2^n. Is there a reason for it, like higher efficiency per number of digits?


And is there anything "special" about k values that are prime? I know there was some activity to find a Cullen/Woodall with prime k, but nothing similar for Proth primes with prime k.

JeppeSN 2020-09-01 08:44

[QUOTE=bur;555496]Also most Proth searches seem to focus on small k. Not just k < 2^n, but k <<< 2^n. Is there a reason for it, like higher efficiency per number of digits?


And is there anything "special" about k values that are prime? I know there was some activity to find a Cullen/Woodall with prime k, but nothing similar for Proth primes with prime k.[/QUOTE]

One reason for doing very tiny k is that they are aesthetically more pleasing. Another reason is that the probability that each prime you find will divide a Fermat number or a generalized Fermat number (with small bases) is higher for low k: heuristically, the probability is 1/k for each base.
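That Fermat-divisor property can be checked directly: an odd prime p divides the Fermat number F_m = 2^(2^m) + 1 exactly when repeated squaring of 2 mod p reaches -1, the classic example being 641 = 5*2^7 + 1 dividing F_5. A small sketch:

```python
def fermat_index(p):
    # Return m if the odd prime p divides F_m = 2^(2^m) + 1, else None.
    # p | F_m exactly when 2^(2^m) ≡ -1 (mod p), so square 2 repeatedly.
    r = 2 % p
    m = 0
    while r != p - 1:
        r = r * r % p
        m += 1
        if m > p.bit_length():  # the order of 2 is not a power of 2: p divides no F_m
            return None
    return m
```

fermat_index(641) returns 5 (Euler's famous factor of F_5), while for most primes, like 7, it returns None.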

But PrimeGrid is already doing all the low k in its different PPS (Proth Prime Search) flavors (PPS-DIV, PPS, PPSE, PPS-MEGA), and k=3 in its "321" subproject. So, to avoid conflicting with PrimeGrid, in my opinion you must pick a k over 10,000.

For Proth candidates, there is nothing special about k that are prime (other than, again, aesthetics; many people's favorite numbers are primes).

There is something special about k that are perfect cubes, fourth powers and so on; they admit algebraic factorizations of some candidates. So if your k happens to be of this type, be sure that the sieving software, or other software, removes candidates that have special factorizations.
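As a concrete case of the cube situation: with k = b^3 and 3 | n, k*2^n + 1 = x^3 + 1 where x = b*2^(n/3), and x^3 + 1 = (x + 1)(x^2 - x + 1), so x + 1 is a guaranteed algebraic factor. A quick check:

```python
def cube_algebraic_factor(b, n):
    # For k = b^3 and n divisible by 3, k*2^n + 1 = x^3 + 1 with
    # x = b * 2^(n // 3), which factors as (x + 1)(x^2 - x + 1).
    if n % 3 != 0:
        return None                  # no cube-based factorization for this n
    x = b * 2 ** (n // 3)
    assert (b ** 3 * 2 ** n + 1) % (x + 1) == 0
    return x + 1
```

For example, cube_algebraic_factor(3, 6) gives 13, since 27*2^6 + 1 = 1729 = 13 * 133; a sieve that doesn't know this would carry such candidates along pointlessly.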

/JeppeSN

bur 2020-09-01 13:40

Thanks, I was planning k ~ 10^6 and maybe also k that are prime.



I have some questions about sieving:



I tried sr2sieve and realized I don't really understand the output. It created a factors.txt listing the factors found during sieving and the numbers they divide. But don't I need the opposite, a file listing all remaining candidates, which I can then feed to the primality-testing software? How do I tell sr2sieve to create that file?



Also, the readme said sr2sieve only handles factors larger than k; for smaller factors one should use newpgen or srsieve. I think this isn't optional but highly recommended? Otherwise all numbers divisible by 3, 5, or other small primes will remain as candidates.


So I would use srsieve to sieve for factors <= k and then sr2sieve for factors > k?

VBCurtis 2020-09-01 15:20

[QUOTE=bur;555650]
So I would use srsieve to sieve for factors <= k and then sr2sieve for factors > k?[/QUOTE]

Yep! I usually use srsieve to 100M or 500M, and then sr2sieve. At really tiny factor sizes, writing to screen makes sr2sieve slower than srsieve.

One uses srfile to manipulate the files. srsieve outputs an srsieve.out file. The command "srfile -a srsieve.out" converts this file to the abcd format that sr2sieve uses.

sr2sieve creates factor files only because it is designed to be used across multiple machines: in that use case, one wants to collect all the factor files from the various instances and apply them to a single input file to remove all the factored candidates. So, again, srfile is used to do so:
srfile -k factors.txt sr_{base}.abcd -a

Or, if you're ready for LLR, use -G at the end rather than -a to write a file in the format LLR prefers. srfile -h will show you all the available formats.
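Conceptually, applying a factor file to a candidate list just means: read each factor line, verify the claimed factor, and drop that candidate. A toy Python version of the idea (the line format here is simplified and hypothetical, not the exact srsieve/srfile layout):

```python
def apply_factors(candidates, factor_lines):
    # candidates: set of (k, n) pairs standing for k*2^n + 1.
    # factor_lines: strings shaped like "3 | 5*2^4+1" (simplified format).
    removed = set()
    for line in factor_lines:
        p_str, expr = line.split("|")
        k_str, rest = expr.strip().split("*2^")
        k, n, p = int(k_str), int(rest.split("+")[0]), int(p_str)
        if (k * 2 ** n + 1) % p == 0:  # verify the claimed factor before removing
            removed.add((k, n))
    return candidates - removed
```

This is why collecting factor files from several machines and applying them all at the end works: removal is just set subtraction, and the order the factors arrive in doesn't matter.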

rogue 2020-09-01 17:19

If using sr2sieve with a k in that range, make sure that you generate the Legendre tables. That should give it a nice speed boost.
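The idea behind the Legendre tables, in miniature: taking Legendre symbols on both sides of k*2^n ≡ -1 (mod p) gives (2|p)^n = (-k|p), so whenever 2 is a quadratic residue mod p but -k is not, no exponent n can work and the sieve may skip p entirely. A sketch of that one filter (the real tables precompute considerably more than this):

```python
def legendre(a, p):
    # Legendre symbol (a|p) for an odd prime p, via Euler's criterion.
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def p_can_divide(k, p):
    # k*2^n ≡ -1 (mod p) forces (2|p)^n == (-k|p).  If (2|p) == 1 but
    # (-k|p) == -1, that equation is unsatisfiable for every n.
    return not (legendre(2, p) == 1 and legendre(-k, p) == -1)
```

For instance, p_can_divide(1, 7) is False: 7 never divides 2^n + 1, so a sieve for k=1 never needs to test p=7 at all. Skipping roughly half of all primes this way is where the speed boost comes from.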

bur 2020-09-01 17:31

Thanks, I managed to use both srsieve and sr2sieve. Once you know what to do it's really easy to transfer work from one to the other.


However, I couldn't figure out how to continue sieving with sr2sieve at a later time with a higher pmax. srsieve notes in the abcd file up to which value sieving was performed, so when I specify that file as input, pmin is adjusted automatically. As far as I can see, sr2sieve does no such thing.


[QUOTE=rogue;555673]If using sr2sieve with a k in that range, make sure that you generate the Legendre tables. That should give it a nice speed boost.[/QUOTE]Does the -c switch do that?


Also I noticed the --threads option is not supported in my version 1.8.11, is that normal? It only runs on one core now...

VBCurtis 2020-09-01 19:33

-p tells sr2sieve where to start sieving, -P where to end. The input file is untouched, so you could also edit the top line manually to update pmin.

-t is used for threads, only up to 8. I'm not sure what happens when you specify more than 8.

bur 2020-09-01 19:36

OK, manually updating the abcd file is what I do now. I thought it was a neat feature of srsieve to include this info in the abcd file; why doesn't sr2sieve?


The -t option isn't accepted. I just get the message: "unknown option -- t".

VBCurtis 2020-09-01 20:20

[QUOTE=bur;555686]OK, manually updating the abcd file is what I do now. I thought it was a neat feature of srsieve to include this info in the abcd file; why doesn't sr2sieve?
The -t option isn't accepted. I just get the message: "unknown option -- t".[/QUOTE]

Again, sr2sieve is designed to work over multiple machines and some sneakernet. If machine #1 sieves 10T to 30T and machine #2 sieves 30T to 50T, machine #2 would be wrong to mark the input file as sieved to 50T, for two reasons:
1. The factors haven't been removed yet
2. The factors from 10-30T aren't on machine #2.

It is more consistent and leads to fewer mistakes in software usage for sr2sieve to not change the input file at all.

Edit: I think threads only work on Linux. If you're using Windows, you're pretty much out of luck. You could try the Linux shell within Win10?

rogue 2020-09-01 21:42

Correct, -t is not enabled on Windows.

Use -h to find the option to use Legendre lookups.

bur 2020-09-02 09:26

VBCurtis, you're right; it makes sense not to change the abcd file when it's the input file. But after sieving, when I convert the sr2sieve output with srfile, it would be helpful if that information were written into the abcd output file.


rogue, OK, so the best way for fixed k is to split the n-range and run several sr2sieve instances simultaneously?


[QUOTE]Use -h to find the option to use Legendre lookups.[/QUOTE]That's where I got -c from, but I'm not sure if that's all that's required. It only says "save symbols" while -C says "load/save".

rogue 2020-09-02 13:04

[QUOTE=bur;555748]VBCurtis, you're right; it makes sense not to change the abcd file when it's the input file. But after sieving, when I convert the sr2sieve output with srfile, it would be helpful if that information were written into the abcd output file.

rogue, OK, so the best way for fixed k is to split the n-range and run several sr2sieve instances simultaneously?

That's where I got -c from, but I'm not sure if that's all that's required. It only says "save symbols" while -C says "load/save".[/QUOTE]

Do not split the range of n. That will not speed things up.

By default it will generate the tables; -x disables that. Using -c/-C will save you time at startup, as you can generate the tables once and then re-use them.

VBCurtis 2020-09-02 15:42

[QUOTE=bur;555748]rogue, OK, so the best way for fixed k is to split the n-range and run several sr2sieve instances simultaneously?[/QUOTE]

Like I said a few posts ago, sr2sieve is designed to split over many clients, each with its own range of p to sieve. Give each client (or core) its own -p and -P. If you run them from the same folder, I suggest also giving each one its own -f factorfilename.

Or, get linux working in a virtual machine; that's how many of us got started in the conversion out of windows.

bur 2020-09-03 17:11

[QUOTE]Do not split the range of n. That will not speed things up[/QUOTE]I realized it's apparently not like that... ([I]My intention was to utilize all four physical cores, since the -t option doesn't work under Windows. Won't it speed up overall progress (not progress on that specific n-range) if I run another instance of sr2sieve on one of the idle cores? I'm not really splitting the n-range but rather extending it.[/I])

[QUOTE]Like I said a few posts ago, sr2sieve is designed to split over many clients, each with its own range of p to sieve.[/QUOTE]I thought about that, and apparently I don't really get how sieving works. I thought that BOTH the size of the p-range and the n-range determine the speed. If that were the case, it would make sense to work in small increments of -P to gradually decrease the n-range.


Now I think it only depends on the p-range? In that case my approach was indeed not good, and it also explains how sr2sieve works with the factors.txt.


So ideally each physical core would run one instance of sr2sieve on the same abcd input file but with a different p-range. And removing candidates from the abcd file is only done at the end of the whole process, because the number of candidates doesn't really influence sieving speed?




edit: If the n-range really has no noticeable impact on sieving speed, what bounds should I use? The lower bound is obviously determined by the size of prime I want to find, but the upper bound? The largest number I can see myself LLRing in the future? :)


And I used -u # to give each instance its own factors, checkpoint, and other files. However, with -c they still all use the same sr2cache.bin; is that intentional, or should I specify different files using -C?

VBCurtis 2020-09-03 18:42

[QUOTE=bur;555900]Now I think it only depends on the p-range? In that case my approach was indeed not good, and it also explains how sr2sieve works with the factors.txt.
So ideally each physical core would run one instance of sr2sieve on the same abcd input file but with a different p-range. And removing candidates from the abcd file is only done at the end of the whole process, because the number of candidates doesn't really influence sieving speed?

edit: If the n-range really has no noticeable impact on sieving speed, what bounds should I use? The lower bound is obviously determined by the size of prime I want to find, but the upper bound? The largest number I can see myself LLRing in the future? :)

And I used -u # to give each instance its own factors, checkpoint, and other files. However, with -c they still all use the same sr2cache.bin; is that intentional, or should I specify different files using -C?[/QUOTE]

The cost of sieving rises theoretically with the square root of the n-range, so a 4x-larger range of exponents will cut sieve speed in half (but will find factors twice as fast, since there are 4x as many candidates in the sieve!). In practice it's not exactly sqrt(n-range); you can experiment, or you can accept that rough guidance as good 'nuff.
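The arithmetic in that rule of thumb, as a toy model (the sqrt scaling is the post's own rough guidance, not an exact law):

```python
import math

def relative_sieve_speed(range_factor):
    # p/s relative to a baseline n-range, under the sqrt(n-range) model.
    return 1 / math.sqrt(range_factor)

def relative_factor_rate(range_factor):
    # factors/s: range_factor times more candidates sit in the sieve,
    # but each p is swept only 1/sqrt(range_factor) as fast.
    return range_factor / math.sqrt(range_factor)
```

A 4x-larger n-range gives relative_sieve_speed(4) = 0.5 and relative_factor_rate(4) = 2.0, matching the "half the speed, twice the factors" claim above.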

When I'm sieving for myself, I select an upper bound a bit higher than the largest exponent I can see myself testing; if I think I might get to 3M, I'll sieve to 4M. You are correct that the number of candidates in the input file has little effect on sieve speed once the really tiny candidates are taken out. That is, a 50% reduction in the candidate pool will improve sieve speed noticeably, but a 5% reduction makes no difference.

I don't know what -c and -C do, sorry.

If you're sieving a single k-value, you should be using sr1sieve rather than sr2sieve. sr2 is faster for a file of multiple (more than 2) k's, while 1 or 2 k-values should be done individually with sr1sieve; in fact, srsieve2 (new software from rogue) may be faster still, and my advice may be stale. sr2sieve expects a file in a format usable by LLR (-g flag from srfile), not abcd.

rogue 2020-09-03 21:39

You can use the same cache for each instance of sr2sieve that is sieving the same k, b, c, and n range, but different p range.

bur 2020-09-04 07:12

Thanks a lot for all the help so far!

I read about srsieve2, but wasn't sure if it's ready yet. Maybe rogue can say which of sr1sieve, sr2sieve, or srsieve2 is fastest for fixed k and n in the 3M-4M range?

[QUOTE]sr2sieve expects a file in format usable by LLR (-g flag from srfile), not abcd[/QUOTE]sr2sieve accepts abcd as input, I'm currently using it.

rogue 2020-09-04 12:12

[QUOTE=bur;555988]Thanks a lot for all the help so far!

I read about srsieve2, but wasn't sure if it's ready yet. Maybe rogue can say if for fixed k and n in the 3M-4M range which of sr1sieve, sr2sieve or srsieve2 is fastest?

sr2sieve accepts abcd as input, I'm currently using it.[/QUOTE]

If k is so large that you cannot build Legendre tables, then srsieve2 is probably faster. The Legendre logic is not yet fully coded and tested in srsieve2.

VBCurtis 2020-09-04 15:22

[QUOTE=bur;555988]sr2sieve accepts abcd as input, I'm currently using it.[/QUOTE]

Sorry for the typo, that was intended to be sr1sieve not sr2sieve in the part you quoted.

You should test the 3 programs yourself to see which is faster!

bur 2020-09-06 17:44

I will do a comparison later; first I want to finish sieving... I'm at 5e12 now and factors are coming in considerably more slowly, but still at about 3 minutes per factor. The range 3320000 <= n <= 4100000 is down to 14550 candidates.



Somewhere I read that proth20 did one test on a GTX 1660 in 6-7 minutes (I forget the specifics of the candidate). So I will continue sieving for a while.

bur 2020-10-18 15:42

On the PG forum it was said that sr1sieve is supposed to be faster than sr2sieve for fixed k, so I wanted to give it a try, but I cannot find a Windows binary. Does anyone have a link? Thanks.


BTW, I tried srsieve2 on a fixed k of magnitude 1E6 and n < 4E6 and only got 36,000 P/s with 70E12 < P < 71E13! With sr2sieve I get 10,000,000 and more. Is it really that slow for these parameters, or is there an error somewhere?

rogue 2020-10-18 16:01

The latest Windows build of sr1sieve can be found in [URL="https://www.mersenneforum.org/showthread.php?t=15833"]this thread[/URL].

sr2sieve supports Legendre tables and srsieve2 does not have that feature yet. That is likely why sr2sieve is faster here. For k that are too large for Legendre tables, srsieve2 might be faster.

