![]() |
Faster sieving with NewPGen and srsieve
Here is a way to increase sieve throughput that works with NewPGen (for bases other than 2) and srsieve. Take this base 5 sequence that I am currently sieving as an example:
52922*5^n-1, 50,000 < n < 131072 It happens that all of the remaining terms in the sieve have n that are multiples of 4, and so I could instead sieve the following sequence and find exactly the same factors (since 5^4=625): 52922*625^n-1, 12500 < n < 32768 The difference is that the sieve range is 1/4 the size and so is quite a bit faster, roughly twice as fast with NewPGen, a bit less with srsieve because the benefits of the base 5 mulmod are lost. In general, if it is possible to write n in k*b^n+c as n=q*m+r for fixed q,r then I think it will be possible to increase sieve throughput by a factor of about sqrt(q) and reduce bitmap memory use by a factor of (q-1)/q. Version 0.2.0 of srsieve implements the memory reduction, but I am still working on the increased throughput part :-). For the remaining sequences of the base 5 project it looks like q=4 is a common case, which might mean throughput can be doubled and memory use reduced to 25% of the current requirements. axn1, could you run 'srfile -c' (version 0.2.0) on the current sieve file? It will print the values n=q*m+r for each sequence and give an effective q value for the file as a whole. If this pans out, it might be good to list the q values along with the Nash weights of the remaining sequences, as finding a prime for a sequence with a small q value would increase the speed of the sieve more than one with a large q. |
1 Attachment(s)
See the attached file for the stats.
Is it possible to split the sieve files into multiple pieces to maximise efficiency? |
Can you take this idea further e.g. if you find that for a particular k all the n values are either 1 or 2 module 12 then you can split it into the sequences (k*5)*(5^12)^n-1 and (k*25)*(5^12)^n-1. I think this is how proth-sieve does things.
|
[QUOTE=axn1]See the attached file for the stats.[/QUOTE]
Thanks, so the q values are 1,2,4,6,8,12,16,24, with most being 4 or larger. [QUOTE]Is it possible to split the sieve files into multiple pieces to maximise efficiency?[/QUOTE] Only for a limited number of the sequences. The q values are too large for most of them, using a base of 5^24 would need changes to the srsieve code anyway. [QUOTE=konrad127123]Can you take this idea further e.g. if you find that for a particular k all the n values are either 1 or 2 module 12 then you can split it into the sequences (k*5)*(5^12)^n-1 and (k*25)*(5^12)^n-1. I think this is how proth-sieve does things.[/QUOTE] Yes that is what I hope to do. All of the remaining base 5 sequences can be broken up into equivalent sequences with base 5^48 since lcm(1,2,4,6,8,12,16,24)=48. According to the data from axn1 the theoretical gain will be by a factor of about sqrt(4.79), but I am not sure how much overhead will be added, it may end up less than that. |
[QUOTE=konrad127123]Can you take this idea further e.g. if you find that for a particular k all the n values are either 1 or 2 module 12 then you can split it into the sequences (k*5)*(5^12)^n-1 and (k*25)*(5^12)^n-1. I think this is how proth-sieve does things.[/QUOTE]
I have finished a prototype that does something like this. A sequence k*b^n+c with n=qm+r for fixed q,r is split into sequences k*b^(iq+r)*(b^tq)^m+c where 0 <= i < t. For the base 5 sieve all the sequences end up in the form k*b^d*(b^48)^n+c where 0 <=d < 48. It works out quite a bit faster when there are only a couple of sequences in the sieve, but the more sequences I add the slower it gets, and when there are 350 sequences it actually runs at only 1/4 the speed it did before I made any changes :-( The problem seems to be that my modular inverse function is extremely slow. I will need to find a faster way to calculate -c/(k*b^d) (mod p) before this idea will work out. |
[QUOTE=geoff]I have finished a prototype that does something like this. A sequence k*b^n+c with n=qm+r for fixed q,r is split into sequences k*b^(iq+r)*(b^tq)^m+c where 0 <= i < t.
For the base 5 sieve all the sequences end up in the form k*b^d*(b^48)^n+c where 0 <=d < 48.[/QUOTE] I am not sure I understand the logic -- are you saying that for each k, you'll sieve 48/q (q=24,16,12,etc..) separate sequences?! Wouldn't that lead to an explosion of sequences, especially for the lower q's? In that case 5^12 (or even 5^4) might be better than 5^48. |
[QUOTE=axn1]I am not sure I understand the logic -- are you saying that for each k, you'll sieve 48/q (q=24,16,12,etc..) separate sequences?! Wouldn't that lead to an explosion of sequences, especially for the lower q's? In that case 5^12 (or even 5^4) might be better than 5^48.[/QUOTE]
You may be right :-) I thought that in principle it shouldn't matter if you end up with 48/q sequences since the sieve range has been reduced to 1/48th of the original size. In practice is does seem to matter, although I can't quite work out why. The modular inverse was not a problem after all, instead of calculating 1/(b^d) for each d, it is much faster to calculate 1/b once then calculate (1/b)^d for each d. |
Just a note that the reason that a base less than 5^48 was faster was due to inefficiencies in my code that affected the larger bases more than the small ones. Now that I have fixed the worst of these, the speed increase seems to be converging to the expected sqrt(4.79) figure.
|
Have you tried using a value of Q larger than 48? It could possibly increase the sieve speed further.
|
[QUOTE=konrad127123]Have you tried using a value of Q larger than 48? It could possibly increase the sieve speed further.[/QUOTE]
I hadn't until you asked the question, then I thought I had better try it before I reply :-) The method srsieve uses to split up the base 5 sequences into base 5^Q subsequences is to express k*5^n+/-1 as k*5^(qm+r)+/-1 for fixed q,r. If q is maximised then for the Base 5 project sieve that leads to a maximum Q=48 since every remaining term happens to have a q that divides 48. However it is possible to apply that same method to the base 5^48 sequences and get subsequences in even larger bases. I have added this capability to srsieve 0.3.3 for experiment. If you specify --limit-base=X with a value of X greater than the Q selected by srsieve in the first pass, it will continue splitting up the sequences until either the base reaches b^X or the base cannot be increased further. For example with the Base 5 project sieve you could use multiples of 48 for X to get higher bases. I haven't tested this much, but none of the few cases I tried resulted in a speedup. If you find a case where using --limit-base=X results in better throughput than the default, I would be interested to hear. There is one good reason why increasing Q will eventually result in reduced throughput: The work in the sieve has to be spread evenly between the baby steps and the giant steps to get good performance, but as the number of sequences increases the number of giant steps has to be reduced to keep that balance. Eventually only one giant step can be done, and it is not possible to re-balance by reducing the giant steps further. With the Base 5 project sieve we are already down to 4 giant steps with Q=48. |
[QUOTE=geoff]I hadn't until you asked the question, then I thought I had better try it before I reply :-)
The method srsieve uses to split up the base 5 sequences into base 5^Q subsequences is to express k*5^n+/-1 as k*5^(qm+r)+/-1 for fixed q,r. If q is maximised then for the Base 5 project sieve that leads to a maximum Q=48 since every remaining term happens to have a q that divides 48.[/QUOTE] This is probably a silly question, but in your code, say you have a k value such that the corresponding n values are all 2 or 18 modulo 48. The corresponding q value would then be 16. Suppose we use Q=48 as normal, your code will then sieve the 2 and 18 modulo 48 subsequences. Will it also sieve the 34 modulo 48 subsequence since 34 % 16 =2? I'm guessing the answer is no (as it should be :)), but I couldn't work it out from the quick glance at the code that I had. I'm away for a while now, so won't have a chance to play with the variable Q option for a the next fortnight or so. |
| All times are UTC. The time now is 09:23. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.