#45
Jun 2003
2·7·113 Posts
Geoff, I found a bug in primes.c. I don't think it affects the main sr2sieve version, since we do not do any bound checking there. (I think it occurs when a person is sieving beyond 2^48, i.e. range_size^2.) This is probably what was slowing things down.
Attached is primes.c; look at the comments. (You can hold off on implementing the inverse method, since we can use the factoring of X to speed things up. Look at my code in the comments.)
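The kind of bound check under discussion could be sketched like this. The function name and the placement of the guard are assumptions for illustration, not taken from primes.c:

```c
#include <stdint.h>

/* Hypothetical guard: repeatedly advancing last_composite by max_prime can
   wrap a 64-bit value once the sieve approaches the top of the uint64_t
   range. A portable check that avoids the wraparound itself: */
static int would_overflow(uint64_t last_composite, uint64_t max_prime)
{
    return last_composite > UINT64_MAX - max_prime;
}
```

A caller would test `would_overflow()` before each addition and bail out (or switch to a wider type) when it returns nonzero.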
Last fiddled with by Citrix on 2007-07-08 at 00:27
#46
Jun 2003
62E16 Posts
Also, about what your program does here:
Are we missing a possible prime by doing composite_table[i] = last_composite + max_prime? Shouldn't we first check whether last_composite is already at low_end_of_range or not? Code:
for (i = save_used_in_range; i < primes_used_in_range; i++)
{
max_prime = prime_table[i];
last_composite = (low_end_of_range / max_prime) * max_prime;
composite_table[i] = last_composite + max_prime;
/* We only care about odd composites since the sieve range only refers
to odd values. */
if (!(composite_table[i] & 1))
composite_table[i] += max_prime;
}
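A sketch of the check being asked about: when low_end_of_range is itself a multiple of max_prime, the unconditional `+ max_prime` skips it. The function name below is illustrative, not from primes.c:

```c
#include <stdint.h>

/* Return the smallest ODD multiple of p that is >= low (the sieve range
   only refers to odd values). Unlike the quoted loop, this only advances
   past the rounded-down multiple when it lies strictly below low. */
static uint64_t first_odd_multiple_at_or_above(uint64_t low, uint64_t p)
{
    uint64_t m = (low / p) * p;  /* largest multiple of p that is <= low */
    if (m < low)                 /* advance only if strictly below low   */
        m += p;
    if (!(m & 1))                /* make it odd; p is an odd prime here  */
        m += p;
    return m;
}
```

For example, with low = 15 and p = 5 the quoted loop would start at 25, while this version starts at 15 itself.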
#47
Jun 2003
2·7·113 Posts
Another thing slowing the sieve down: if gcd(x, max_prime) > 1, then you would do hundreds of additions just for that factor, even though it can never have a solution.
Extra additions for each such factor = range_size·x/max_prime. If max_prime = 3 and x = 720, then 240·2^23 = 2013265920 extra additions per factor. So we should avoid these additions.
Last fiddled with by Citrix on 2007-07-09 at 00:51
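The suggested shortcut can be sketched as a simple filter applied before entering the addition loop; the function names are illustrative, not from the sr2sieve source:

```c
#include <stdint.h>

/* Euclid's algorithm on 64-bit unsigned integers. */
static uint64_t gcd64(uint64_t a, uint64_t b)
{
    while (b) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* If gcd(x, max_prime) > 1 the prime divides x, so the congruence can
   never have a solution; skip it instead of performing the
   range_size*x/max_prime wasted additions described above. */
static int prime_is_useful(uint64_t x, uint64_t max_prime)
{
    return gcd64(x, max_prime) == 1;
}
```

With x = 720 as in the example, `prime_is_useful(720, 3)` is false (3 divides 720), so the entire addition loop for that prime is skipped.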
#48
Mar 2003
New Zealand
13×89 Posts
Quote:
I think the original sieve works fine when x is a power of 2, say 2^y, because last_composite is odd, and so the inverse of x mod last_composite is always found after at most y additions; so overflow doesn't have to be checked as long as you don't expect to sieve beyond 2^(64-y).

However, I think the prime sieve is not the only problem you face. Sr2sieve doesn't do SPH and it doesn't use the Chinese Remainder Theorem; it just knocks out some congruence classes and sends what remains to BSGS. The fastest part of sr2sieve is the BSGS; most of the rest of the code is not particularly flash. But by sieving with smooth p-1 you end up minimising the time in BSGS and maximising the time spent in the slow code.

So this is the problem: if you restrict sieving to very smooth p-1, then you get bogged down in the slow code. But if you allow less smooth p-1, then the gains over normal sieving are not going to be great enough to justify the effort. The solution might seem to be to speed up the slow code, but that means a lot of work in all sorts of different areas that is only going to have a small benefit for the normal sieve.
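For readers unfamiliar with the BSGS step mentioned above, here is a minimal baby-step giant-step discrete-log sketch. This is not sr2sieve's implementation; it assumes p is prime, small enough that 64-bit products do not overflow, and m <= 64:

```c
#include <stdint.h>

/* Right-to-left binary exponentiation: b^e mod p. */
static uint64_t powmod(uint64_t b, uint64_t e, uint64_t p)
{
    uint64_t r = 1 % p;
    while (e) {
        if (e & 1) r = r * b % p;
        b = b * b % p;
        e >>= 1;
    }
    return r;
}

/* Find e in [0, m*m) with b^e == t (mod p), or -1 if none.
   Uses a linear scan of the baby-step table for brevity; a real
   implementation would use a hash table. Assumes m <= 64. */
static int64_t bsgs(uint64_t b, uint64_t t, uint64_t p, uint64_t m)
{
    uint64_t baby[64];
    uint64_t x = 1 % p;

    /* baby steps: baby[j] = b^j mod p for j < m */
    for (uint64_t j = 0; j < m; j++) { baby[j] = x; x = x * b % p; }

    /* giant steps: compare t * (b^-m)^i against the table,
       using b^-m = b^(p-1-m) by Fermat's little theorem */
    uint64_t binv_m = powmod(b, p - 1 - (m % (p - 1)), p);
    uint64_t g = t % p;
    for (uint64_t i = 0; i < m; i++) {
        for (uint64_t j = 0; j < m; j++)
            if (baby[j] == g) return (int64_t)(i * m + j);
        g = g * binv_m % p;
    }
    return -1;
}
```

The point of Geoff's remark is that this loop is cheap per prime; it is everything around it (the setup and the congruence-class elimination) that dominates when p-1 is forced to be smooth.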
#49
Jun 2003
2·7·113 Posts
Quote:
#50
Jun 2003
2·7·113 Posts
Geoff,
I have looked into a couple of things regarding making srsieve faster.

1) I have found a way to implement Pollard rho with only ~2·sqrt(x) multiplications for each prime. There are two problems, though: a) we will not be able to combine the steps, so each k will be done separately; b) there is no way to limit the sieve to only values under 50M unless you use SPH. So basically, if the time saved from not writing things to memory and reading them later is more than the cost of doing all k's separately, then this method will be useful; otherwise there is no point implementing it. It might help on machines that have low memory, but from what I understand the BSGS loop takes little memory and the initial sieve takes the most.

2) The only other way we can speed things up is using SPH with BSGS. If I write the code for the Chinese Remainder Theorem and some of the other things that the program will need, do you think we can make the program faster?
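The CRT step being offered here can be sketched for the two-modulus case as follows. The function names are illustrative, and the moduli are assumed coprime and small enough that 64-bit products do not overflow:

```c
#include <stdint.h>

/* Modular inverse of a mod m via the extended Euclidean algorithm.
   Assumes gcd(a, m) == 1 and m fits comfortably in int64_t. */
static uint64_t inv_mod(uint64_t a, uint64_t m)
{
    int64_t t = 0, newt = 1, r = (int64_t)m, newr = (int64_t)(a % m);
    while (newr) {
        int64_t q = r / newr, tmp;
        tmp = t - q * newt; t = newt; newt = tmp;
        tmp = r - q * newr; r = newr; newr = tmp;
    }
    return (uint64_t)(t < 0 ? t + (int64_t)m : t);
}

/* Combine x == r1 (mod m1) and x == r2 (mod m2), gcd(m1, m2) == 1,
   into the unique residue mod m1*m2 (Garner-style formulation). */
static uint64_t crt2(uint64_t r1, uint64_t m1, uint64_t r2, uint64_t m2)
{
    uint64_t m = m1 * m2;
    uint64_t d = ((r2 + m2) - (r1 % m2)) % m2;   /* (r2 - r1) mod m2 */
    return (r1 + m1 * (d * inv_mod(m1, m2) % m2)) % m;
}
```

For example, x == 2 (mod 3) and x == 3 (mod 5) combine to x == 8 (mod 15). Combining more than two moduli just folds `crt2` over the list.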
Last fiddled with by Citrix on 2007-08-01 at 03:45 |
#51
Mar 2003
New Zealand
13·89 Posts
Quote:
Probably the quickest way to get a boost in sieve productivity in the short term would be to figure out how to compile srsieve or jjsieve for 64-bit Windows machines.
#52
Jun 2003
2×7×113 Posts
Quote:
Maybe you can create an .obj, .lib, or .dll file that I can use with my program, containing pre-compiled functions for modular multiplication. Basically you would be writing a fast 64-bit library; I think several projects could benefit from this. (This way I won't have to deal with any low-level stuff.)

Edit: On second thought, there is no point re-writing the code your program uses. I will just re-write the initial Sieve of Eratosthenes and the corresponding SPH/CRT code needed. I think we can keep the rest of the code the same; there is no need to write the rest from scratch.
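The core primitive such a library would export is a full 64-bit modular multiplication. One common way to write it on 64-bit targets is via the GCC/Clang `unsigned __int128` extension; this is a sketch of the idea, not what srsieve actually ships:

```c
#include <stdint.h>

/* a*b mod m for full 64-bit operands: the 128-bit intermediate product
   cannot overflow, and the compiler lowers this to a few instructions
   on x86-64. Requires GCC or Clang (unsigned __int128 is an extension). */
static uint64_t mulmod64(uint64_t a, uint64_t b, uint64_t m)
{
    return (uint64_t)((unsigned __int128)a * b % m);
}
```

On compilers without `__int128` (e.g. 32-bit MSVC, as in the 64-bit Windows build problem Geoff mentions), the same operation needs hand-written assembly or a shift-and-add fallback, which is exactly why packaging it as a library would help.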
Last fiddled with by Citrix on 2007-08-04 at 20:49 |