 mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Sierpinski/Riesel Base 5 (https://www.mersenneforum.org/forumdisplay.php?f=54)
-   -   A multiple k/c sieve for Sierpinski/Riesel problems (https://www.mersenneforum.org/showthread.php?t=5785)

 geoff 2006-04-27 02:10

A multiple k/c sieve for Sierpinski/Riesel problems

I have started writing a program to sieve for these projects, it is far from complete but the core sieving algorithm is working and seems to be producing correct results. If you want to try it out I will put the source code here: [url=http://www.geocities.com/g_w_reynolds/srsieve/]srsieve[/url]

I haven't implemented extended arithmetic yet so it is limited to sieving up to about 4 billion on 32 bit machines.

It uses a baby steps giant steps algorithm to sieve any number of candidate sequences at once, both Sierpinski and Riesel.

It is already much faster than NewPGen when sieving with 4 or more candidates, although some of that is probably because of the 32bit arithmetic. Here are a couple of timings for sieving on my P3/450 with base 5 Sierpinski candidates from 1 < p < 100,000,000 and 50,000 < n < 100,000:

1 candidate: 14 minutes (NewPGen 18 minutes)
4 candidates: 28 minutes (NewPGen 55 minutes)

 masser 2006-04-27 02:47

1 Attachment(s)
Wow! That siever sounds awesome. I can't wait to try it.

On the topic of sieving, I noticed you reserved a k value I had previously done some sieving on. Here's the sieve file - hopefully it's useful to you.

 Citrix 2006-04-27 05:37

why not get proth_sieve modified. changing base 2 to base 5 won't take long.:popcorn:

 Greenbank 2006-04-27 09:15

Good work Geoff and thank you for making the source available. I'll take a look at it and see how it compares to proth_sieve (hopefully we can improve both programs!).

 Greenbank 2006-04-27 13:15

[QUOTE=Citrix]why not get proth_sieve modified. changing base 2 to base 5 won't take long.:popcorn:[/QUOTE]

It would take quite a while, an awful lot of proth_sieve was written assuming that it was sieving base 2 numbers. Plus proth_sieve can't sieve anything below p=10^9.

 axn 2006-04-27 20:08

[QUOTE=Greenbank]It would take quite a while, an awful lot of proth_sieve was written assuming that it was sieving base 2 numbers.[/QUOTE]
That's a problem. But it might be worth it.
[QUOTE=Greenbank]Plus proth_sieve can't sieve anything below p=10^9.[/QUOTE]
Not so much of a problem. We can always get ourselves to 10^9 one k at a time (NewPGen or some such program).

REQUEST#1:- Could someone make a Win32 executable for Geoff's pgm (Athlon 32 build, if possible)? I have no way of compiling it.

REQUEST#2:- Geoff, can your program handle both + & - sieve simultaneously. If not, can you modify it to do so?

 rogue 2006-04-28 12:18

Good work on your sieve. If you can do your invmod entirely to the FPU, you should be able to double the speed of the sieve (at the minimum). I'll play with something this weekend.

Another point I have is that you will have to modify it to handle p > 32 bit in order for it to be truly useful on this project. Consider using Montgomery to handle your mods.

 rogue 2006-04-28 23:18

[QUOTE=rogue]Good work on your sieve. If you can do your invmod entirely to the FPU, you should be able to double the speed of the sieve (at the minimum). I'll play with something this weekend.

Another point I have is that you will have to modify it to handle p > 32 bit in order for it to be truly useful on this project. Consider using Montgomery to handle your mods.[/QUOTE]

Strike that. I am mistaken. The time in invmod is negligable to If it is possible to have hsize set to a power of 2, then do so, because then you can use & instead of / when building your hash table. You also should consider an ASM routine to handle mulmods and powmods. This combination almost triples the speed. You should also consider using 64-bit p as it won't take long before p > 2^32.

My timings on
-n 5000 -N 10000 -P 10000000 "1396*5^n-1" "5114*5^n+1"
are:
13.5 seconds (original code) to 5.9 seconds (with these changes).

 axn 2006-04-29 11:46

@geoff: Outstanding! I am right now sieveing all 94 Sierpinski k's for n <= 2,000,000. Getting about 8kp/s. At this rate I'll be done with p < 2^32 in under a week.

@rogue: Can you post the modified source code and/or post an executable? That sound's interesting.

 rogue 2006-04-29 23:41

e-mail sent to axn1. geoff, get in contact with him if you don't mind.

BTW, I found a bug in next_bit(). This loop:

for (i++; B[i] == 0; i++)
;

can lead to a SEGFAULT (as it did with me). I modified it to:

for (i++; B[i] == 0; i++)
if (i > bit_range)
break;

where bit_range = sieve_range >> 5. sieve_range is passed on the call.

I also suggest that you use shifts instead of % and / in bitmap.h. This is what I have:

[code]
#define UINT_BIT 32
#define UINT_SHIFT 5 // because 2^5 = 32

extern inline int set_bit(unsigned int *B, int bit)
{

B += (bit >> UINT_SHIFT);
// B += bit / UINT_BIT;
mask = 1 << (bit & (UINT_BIT - 1));
//mask = 1 << (bit % UINT_BIT);
ret = (mask & *B) != 0;

return ret;
}

extern inline int clear_bit(unsigned int *B, int bit)
{

B += (bit >> UINT_SHIFT);
// B += bit / UINT_BIT;
mask = 1 << (bit & (UINT_BIT - 1));
//mask = 1 << (bit % UINT_BIT);
ret = (mask & *B) != 0;

return ret;
}

extern inline int test_bit(const unsigned int *B, int bit)
{

B += (bit >> UINT_SHIFT);
// B += bit / UINT_BIT;
mask = 1 << (bit & (UINT_BIT - 1));
//mask = 1 << (bit % UINT_BIT);

return ((mask & *B) != 0);
}

extern inline unsigned int first_bit(const unsigned int *B)
{
int i;

for (i = 0; B[i] == 0; i++)
;

return i*UINT_BIT + ffs(B[i]) - 1;
}

extern inline unsigned int next_bit(const unsigned int *B, int bit, int sieve_range)
{

bit_range = sieve_range >> UINT_SHIFT;
i = (bit >> UINT_SHIFT);
//i = bit / UINT_BIT;
mask = ~0U << (bit & (UINT_BIT - 1));
//mask = ~0U << (bit % UINT_BIT);

return i*UINT_BIT + ffs(B[i] & mask) - 1;

for (i++; B[i] == 0; i++)
if (i > bit_range)
break;

return i*UINT_BIT + ffs(B[i]) - 1;
}
[/code]

In bsgs.c, calculate hsize in init_hashtable with
[code]
int tsize;
tsize = firstprime(size*HASH_MAIN_FACTOR);
hsize = 0x01;
while (hsize < tsize)
hsize <<= 1;
[/code]

Then in lookup() and insert(), use
[code]
int slot = bj & (hsize - 1);
[/code]

This will improve performance for small p more than anything else. FYI, with the changes I've made to date, I get around 650 primes/sec across for n from 5000 to 2000000 for the 94 Sierpinski base 5 candidates, which is close to twice as fast of the original code on my box.

One thing that is misleading is that my code is 64-bit, so the limit is 2^63, not 2^32, so it is a little harder to compare. My sieving rate would be much higher if I wrote separate 32-bit only code, but the way I view it is that so little time is spent in the 32-bit code that it isn't worth coding special routines for it. I have not fully tested thr 64-bitness of the code. My additions are fine, but I can't speak for the original code. I would have to look at any references to "int" and replace some of them with "long", or "uint64_t".

 masser 2006-04-30 01:20

1 Attachment(s)
[QUOTE=axn1]@geoff: Outstanding! I am right now sieveing all 94 Sierpinski k's for n <= 2,000,000. Getting about 8kp/s. At this rate I'll be done with p < 2^32 in under a week.

@rogue: Can you post the modified source code and/or post an executable? That sound's interesting.[/QUOTE]

I worked a little on something similar this week. For all of the Sierpinski k's that haven't been tested to n=100000 (35 sequences), I ran Geoff's siever up to the current sieve limit. I'm including the file here - I'm going to test from 56000 to 60000. There are only 15000 candidates; as a group we could probably test these values fairly quickly. Maybe we should start a new thread to do n-range reservations....

All times are UTC. The time now is 11:31.