2007-07-08, 00:25  #45 
Jun 2003
11^{2}·13 Posts 
Geoff, I found a bug in primes.c. I don't think it affects the main sr2sieve version, since we do no bound checking there. (I think it occurs if a person is sieving beyond 2^48, i.e. range_size^2.) This is probably what was slowing things down.
Attached is primes.c; look at the comments. (You can hold off on implementing the inverse method, since we can use the factorization of X to speed things up. Look at my code in the comments.) Last fiddled with by Citrix on 2007-07-08 at 00:27 
2007-07-08, 00:49  #46 
Jun 2003
625_{16} Posts 
Also, a question about what your program does: are we missing a possible prime by doing composite_table[i] = last_composite + max_prime;? Shouldn't we first check whether last_composite is already greater than low_end_of_range or not? Code:
for (i = save_used_in_range; i < primes_used_in_range; i++)
{
  max_prime = prime_table[i];
  last_composite = (low_end_of_range / max_prime) * max_prime;
  composite_table[i] = last_composite + max_prime;

  /* We only care about odd composites since the sieve range only
     refers to odd values. */
  if (!(composite_table[i] & 1))
    composite_table[i] += max_prime;
}
2007-07-08, 01:07  #47 
Jun 2003
11^{2}·13 Posts 
Another thing slowing down the sieve: if gcd(x, max_prime) > 1, then you would do hundreds of additions just to find the factor, since it can never have a solution.
Extra additions per factor = range_size*x/max_prime. For example, if max_prime = 3 and x = 720, then 240*2^23 = 2013265920 extra additions per factor. So we should avoid these additions. Last fiddled with by Citrix on 2007-07-09 at 00:51 
2007-07-10, 00:36  #48  
Mar 2003
New Zealand
13·89 Posts 
Quote:
I think the original sieve works fine when x is a power of 2, say 2^y, because last_composite is odd, so the inverse of x mod last_composite is always found after at most y additions, and overflow doesn't have to be checked as long as you don't expect to sieve beyond 2^(64-y).

However, I think the prime sieve is not the only problem you face. Sr2sieve doesn't do SPH and it doesn't use the Chinese Remainder Theorem; it just knocks out some congruence classes and sends what remains to BSGS. The fastest part of sr2sieve is the BSGS; most of the rest of the code is not particularly flash. But by sieving with smooth p-1 you end up minimising the time in BSGS and maximising the time spent in the slow code.

So this is the problem: if you restrict sieving to very smooth p-1, then you get bogged down in the slow code. But if you allow less smooth p-1, then the gains over normal sieving are not going to be great enough to justify the effort. The solution might seem to be to speed up the slow code, but that means a lot of work in all sorts of different areas that is only going to have a small benefit to the normal sieve. 

2007-07-10, 07:08  #49  
Jun 2003
11000100101_{2} Posts 
2007-08-01, 03:11  #50 
Jun 2003
1573_{10} Posts 
Geoff,
I have looked into a couple of things regarding making srsieve faster.

1) I have found a way to implement Pollard rho with only ~2*sqrt(x) multiplications for each prime. There are two problems though:
a) We will not be able to combine the steps, so each k will have to be done separately.
b) There is no way to limit the sieve to only values under 50M, unless you use SPH.

So basically, if the time saved from writing things to memory and reading them later is more than the cost of doing all k's separately, then this method will be useful; otherwise there is no point implementing it. It might help on machines that have low memory, but from what I understand the BSGS loop takes little memory and the initial sieve takes the most. The other place where this could be useful is in sr1sieve, but we will have to use SPH.

2) The only other way we can speed things up is using SPH with BSGS. If I write the code for the Chinese Remainder Theorem and some of the other things that the program will need, do you think we can make the program faster? Last fiddled with by Citrix on 2007-08-01 at 03:45 
2007-08-03, 03:05  #51  
Mar 2003
New Zealand
13×89 Posts 
Quote:
Probably the quickest way to get a boost in sieve productivity in the short term would be to figure out how to compile srsieve or jjsieve for 64-bit Windows machines. 

2007-08-03, 07:23  #52  
Jun 2003
11^{2}·13 Posts 
Quote:
Maybe you can create an .obj, .lib, or .dll file that I can use with my program, containing precompiled functions for modular multiplication. Basically you would be writing a fast 64-bit library. I think several projects could benefit from this. (This way I won't have to deal with any low-level stuff.)

Edit: On second thought, there is no point rewriting the code your program uses. I will just rewrite the initial Sieve of Eratosthenes and the corresponding SPH/CRT code needed. I think we can keep the rest of the code the same; there is no need to write the rest from scratch. Last fiddled with by Citrix on 2007-08-04 at 20:49 
