Efficient storage of all primes up to some n (https://www.mersenneforum.org/showthread.php?t=24417)

 LaurV 2019-05-10 06:37

[QUOTE=hansl;516362]Is there a formula to get the modular classes given the nth primorial?[/QUOTE]
eulerphi()
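
To make the one-word answer concrete: the number of residue classes coprime to the k-th primorial is eulerphi(primorial), which for a squarefree modulus is just the product of (p - 1) over its prime factors. Here is a minimal Python sketch (Python rather than PARI, and the helper names `first_primes` and `primorial_classes` are made up for illustration) that lists the classes by brute force and checks the count:

```python
from math import gcd

def first_primes(k):
    """Return the first k primes by trial division (fine for tiny k)."""
    primes = []
    n = 2
    while len(primes) < k:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def primorial_classes(k):
    """Return (primorial, phi(primorial), residues coprime to the primorial)."""
    modulus, phi = 1, 1
    for p in first_primes(k):
        modulus *= p
        phi *= p - 1          # phi is multiplicative; phi(p) = p - 1
    residues = [r for r in range(1, modulus + 1) if gcd(r, modulus) == 1]
    return modulus, phi, residues

modulus, phi, residues = primorial_classes(3)
print(modulus, phi, residues)
```

For k = 3 this gives the familiar mod-30 wheel with 8 classes. The brute-force listing is only practical for small k, since the primorial grows very quickly.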

 hansl 2019-05-10 06:56

[QUOTE=LaurV;516363]eulerphi()[/QUOTE]

Excellent! I just updated my code and results.

 xilman 2019-05-10 08:21

[QUOTE=hansl;516309]So, what if we just store the gap between adjacent primes?

Anyways, we could extend the prime gap representation further, by splitting into ranges based on the bitsize of the maximum gaps in the range. I haven't yet attempted to estimate the size for such a scheme, but it might save a few more percent in the total size.[/QUOTE]
Huffman coding of gaps.
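
A rough way to estimate what Huffman coding of the gaps would buy is to build the code from the observed gap frequencies and add up the bits. The Python sketch below is only an estimate under stated assumptions: the function names are invented for illustration, it ignores the cost of storing the code table itself, and it codes the gaps between consecutive odd primes only.

```python
import heapq
from collections import Counter

def primes_up_to(limit):
    """Plain sieve of Eratosthenes; assumes limit >= 2."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, syms1 = heapq.heappop(heap)
        c2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:   # merging pushes these symbols one level deeper
            lengths[s] += 1
        heapq.heappush(heap, (c1 + c2, tie, syms1 + syms2))
        tie += 1
    return lengths

def huffman_gap_bits(limit):
    """Return (total Huffman-coded bits, number of gaps) for odd primes <= limit."""
    primes = primes_up_to(limit)
    gaps = [q - p for p, q in zip(primes[1:], primes[2:])]
    freqs = Counter(gaps)
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[g] * lengths[g] for g in freqs), len(gaps)

bits, count = huffman_gap_bits(100000)
print(bits / count, "bits per gap on average")
```

Comparing `bits / count` against a fixed-width gap encoding (say 8 bits per gap) gives a feel for the saving at a given limit.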

 ewmayer 2019-05-10 19:39

If one's aim is to support e.g. get_nth_prime(n) or "quickly enumerate all primes up to some limit"-style functionality, the way my code does this is:

1. Use a simple sieve to generate candidates not divisible by really small primes;
2. Feed surviving candidates - batches of 4-8 here tend to allow for good hiding of integer MUL latencies - to a base-2 Fermat PRP routine;
3. Do a fast lookup of the surviving candidates in a precomputed table of known base-2 Fermat pseudoprimes;
4. If a candidate does not appear in said table, it's prime.

So there's still a table involved, but it's *much* smaller than any needed for explicit prime storage. Here is the associated commentary in my code:
[code]/* There are 10403 ( = 101*103) base-2 Fermat pseudoprimes < 2^32.
[... other tables related to the pseudoprimes of various types].
We split these into 2 subsets - those divisible by 3 or 5 and those not.

Some General Observations:

The largest base-2 Fermat pseudoprime < 2^32 is 4294901761 = 193*22253377 = 2^32 - 2^16 + 1, which is reminiscent
of the 64-bit (genuine) prime 2^64 - 2^32 + 1 used by Nick Craig-Wood in his pure-integer Mersenne-mod DWT-based
convolution code.

The only twin pseudoprimes < 2^32 are 4369 (which = 17*257, the product of the Fermat primes F2 and F3) and 4371.
Similarly to F2*F3, the product F3*F4 = 16843009 is a base-2 Fermat pseudoprime. Other products of Fermat primes
are "near pseudoprimes" in the sense that their base-2 Fermat residue is a power of 2, e.g. for n=17*65537,
2^(n-1)%n = 2^16.

The largest (gap/2) between adjacent pseudoprimes (either in the merged length-10403 table or the f2psp[] one below;
the number is set by two adjacent elements of the latter) is 4199775, requiring 3 bytes to store, so no significant
compactification via a difference table is possible, as there is for the primes.
*/[/code]
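
The four steps above can be sketched in a few lines of Python. This is a toy illustration, not the code the comment comes from: the names are invented, there is no batching of candidates, and the pseudoprime table is brute-forced for a small limit as a stand-in for a precomputed table such as the one below 2^32. Because the sieve step already discards multiples of 3 and 5, only pseudoprimes coprime to 30 need to be in the lookup table.

```python
from math import gcd

def is_prime_trial(n):
    """Slow reference primality test, used only to build the toy table."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def build_psp2_table(limit):
    """Base-2 Fermat pseudoprimes below limit that are coprime to 30."""
    return frozenset(
        n for n in range(7, limit, 2)
        if gcd(n, 30) == 1 and pow(2, n - 1, n) == 1 and not is_prime_trial(n)
    )

def primes_below(limit, psp2_table):
    """Enumerate primes below limit via sieve + Fermat PRP + table lookup."""
    primes = [p for p in (2, 3, 5) if p < limit]
    for n in range(7, limit, 2):
        if n % 3 == 0 or n % 5 == 0:   # step 1: tiny-prime sieve
            continue
        if pow(2, n - 1, n) != 1:      # step 2: base-2 Fermat PRP test
            continue
        if n in psp2_table:            # step 3: known pseudoprime, so composite
            continue
        primes.append(n)               # step 4: not in the table, so prime
    return primes

table = build_psp2_table(10000)
print(len(primes_below(10000, table)))
```

The result is only valid for limits covered by the table, which is why the table has to be complete up to the largest candidate tested.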

 ldesnogu 2019-05-10 21:47

Given the speed computers have reached, I have stopped using any list, and I just run Kim Walisch's [URL="https://github.com/kimwalisch/primesieve"]primesieve[/URL]. It can sieve up to 10^10 in [URL="http://ntheory.org/sieves/benchmarks.html"]less than 2s[/URL].

 ewmayer 2019-05-10 23:16

[QUOTE=ldesnogu;516428]Given the speed computers have reached, I have stopped using any list, and I just run Kim Walisch's [URL="https://github.com/kimwalisch/primesieve"]primesieve[/URL]. It can sieve up to 10^10 in [URL="http://ntheory.org/sieves/benchmarks.html"]less than 2s[/URL].[/QUOTE]

Perhaps posting some of the key algorithmic methods said code uses to achieve these speeds would be useful to the OP (and the rest of us.)

 ldesnogu 2019-05-11 06:53

[QUOTE=ewmayer;516435]Perhaps posting some of the key algorithmic methods said code uses to achieve these speeds would be useful to the OP (and the rest of us.)[/QUOTE]
Kim wrote a page that will hopefully help:

[url]https://github.com/kimwalisch/primesieve/blob/master/doc/ALGORITHMS.md[/url]

 hansl 2019-05-11 09:00

[QUOTE=ewmayer;516419]If one's aim is to support e.g. get_nth_prime(n) or "quickly enumerate all primes up to some limit"-style functionality, the way my code does this is:
<snip>[/QUOTE]
Thanks for the info. Which program is this for? Sorry, I'm still a bit new here and haven't memorized the who's who of program authors yet. (There are so many programs!)

It seems like a Fermat test would slow things down a bit, though? What does the complexity look like for a good powm algorithm?
I've mostly just played a little with GMP so far, and started picking up a bit of PARI lately.
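
On the powm question, the standard baseline is binary (square-and-multiply) exponentiation: one modular squaring per bit of the exponent plus one modular multiplication per set bit, so at most about 2*log2(e) modular multiplications in total. A minimal Python sketch follows; the function name and the multiplication counter are just for illustration, and real libraries add refinements such as windowing and Montgomery reduction.

```python
def powm(base, exponent, modulus):
    """Left-to-right binary exponentiation.

    Returns (base**exponent % modulus, number of modular multiplications used).
    """
    result = 1 % modulus
    base %= modulus
    mults = 0
    for bit in bin(exponent)[2:]:          # most significant bit first
        result = result * result % modulus  # always square
        mults += 1
        if bit == "1":
            result = result * base % modulus  # multiply only on set bits
            mults += 1
    return result, mults

value, mults = powm(2, 340, 341)
print(value, mults)
```

For a Fermat test the exponent is n - 1, so the multiplication count grows with the bit length of n, and each multiplication is on operands of that same size.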

[quote=ldesnogu]
Given the speed computers have reached, I have stopped using any list, and I just run Kim Walisch's primesieve. It can sieve up to 10^10 in less than 2s.[/quote]
That program is pretty impressive! Honestly, I don't have a specific reason for trying to store all primes as in my original post; I'm just trying to get a rough idea of how much data we are talking about, and to learn how various algorithms like this work. I should try actually implementing some form of wheel factorization/sieve and see how it compares.

One thing I have started wondering about is whether there is some way to further compact a very large wheel. Maybe some sort of recursive implementation could use smaller wheels to save space in the larger one? Not sure if that makes any sense, still thinking it over...
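
As a starting point for experimenting with wheel-based storage, here is a minimal Python sketch of the mod-30 case: every prime above 5 falls in one of the 8 residue classes coprime to 30, so one byte can record the primality of a whole block of 30 integers. The names `pack_primes` and `is_prime_packed` are made up for illustration, and the sieve is a plain one rather than a wheel sieve.

```python
RESIDUES = (1, 7, 11, 13, 17, 19, 23, 29)      # classes coprime to 30
BIT_OF = {r: i for i, r in enumerate(RESIDUES)}

def pack_primes(limit):
    """Sieve [0, limit) and pack the result into one byte per 30 integers.

    Assumes limit >= 2. The primes 2, 3 and 5 are not stored.
    """
    sieve = bytearray([1]) * limit
    sieve[:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    packed = bytearray((limit + 29) // 30)
    for n in range(7, limit):
        if sieve[n]:                            # primes > 5 are coprime to 30
            packed[n // 30] |= 1 << BIT_OF[n % 30]
    return packed

def is_prime_packed(n, packed):
    """Look up primality of n; n must be below the limit used for packing."""
    if n in (2, 3, 5):
        return True
    bit = BIT_OF.get(n % 30)
    if bit is None or n < 7:
        return False
    return bool(packed[n // 30] >> bit & 1)

packed = pack_primes(1000)
print(len(packed), "bytes for all primes below 1000")
```

That works out to 8 bits per 30 integers regardless of range, which is a useful baseline to compare gap-based schemes against.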

 hansl 2019-05-11 09:40

[QUOTE=ewmayer;516419][code]/* There are 10403 ( = 101*103) base-2 Fermat pseudoprimes < 2^32.
[... other tables related to the pseudoprimes of various types].[/code][/QUOTE]
The link is dead now, btw. I was able to view it in the Wayback Machine though.

Playing around some more in PARI, I decided to write a pseudoprime counting function:
[code]
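\\ Fermat test: true when b^(p-1) == 1 (mod p)
fermat_test(b,p)=Mod(b,p)^(p-1)==1

\\ Count odd composites n with b^(n-1) == 1 (mod n), walking the gaps between consecutive primes.
\\ Composites above the largest prime <= limit are not tested.
pseudoprimepi(b,limit)={
my(prev=4,count=0);
forprime(p=3,limit,forstep(n=prev,p-1,2,if(fermat_test(b,n),count++));prev=p+2);
count
}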
[/code]

Results:
[code]
? pseudoprimepi(2,2^32)
%70 = 10403
? ##
*** last result computed in 32min, 5,043 ms.
[/code]
Well, at least I got the same value as above. But man, PARI is slow for enumerating primes like this.

Note: it skips even numbers, so the count is not technically correct when counting in base 3, for example. Here's a version that should count all of them, but is slightly slower:
[code]
pseudoprimepi(b,limit)={
my(prev=4,count=0);
forprime(p=3,limit,for(n=prev,p-1,if(gcd(b,n)==1&&fermat_test(b,n),count++));prev=p+1);
count
}
[/code]
The "b" in the gcd call could be replaced with a wheel primorial (like 30) to keep only the pseudoprimes relevant to the method that ewmayer describes.
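
For anyone who wants to try the same count outside PARI, here is a rough Python equivalent. It is a sketch, not a port: it uses a plain sieve to mark composites instead of walking prime gaps, and the `coprime_to` parameter is an invented name for the wheel-primorial filter mentioned above.

```python
from math import gcd

def pseudoprimes(base, limit, coprime_to=1):
    """Composite n < limit with base^(n-1) == 1 (mod n).

    If coprime_to > 1, keep only those n with gcd(n, coprime_to) == 1.
    """
    composite = bytearray(limit)               # composite[n] == 1 marks composite n
    for i in range(2, int(limit ** 0.5) + 1):
        if not composite[i]:
            composite[i * i::i] = bytearray([1]) * len(range(i * i, limit, i))
    return [
        n for n in range(4, limit)
        if composite[n] and gcd(n, coprime_to) == 1 and pow(base, n - 1, n) == 1
    ]

print(pseudoprimes(2, 1000))
print(len(pseudoprimes(2, 10000)), len(pseudoprimes(2, 10000, coprime_to=30)))
```

Below 10^4 the wheel filter drops 7 of the 22 base-2 pseudoprimes, and with base 3 the even pseudoprime 286 shows up, which is the case the odd-only version misses.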

 Dr Sardonicus 2019-05-12 13:25

[QUOTE=ewmayer;516419]If one's aim is to support e.g. get_nth_prime(n) or "quickly enumerate all primes up to some limit"-style functionality, the way my code does this is:

1. Use a simple sieve to generate candidates not divisible by really small primes;
2. Feed surviving candidates - batches of 4-8 here tend to allow for good hiding of integer MUL latencies - to a base-2 Fermat PRP routine;
3. Do a fast lookup of the surviving candidates in a precomputed table of known base-2 Fermat pseudoprimes;
4. If a candidate does not appear in said table, it's prime.

So there's still a table involved, but it's *much* smaller than any needed for explicit prime storage. Here is the associated commentary in my code:
[code]/* There are 10403 ( = 101*103) base-2 Fermat pseudoprimes < 2^32.
[... other tables related to the pseudoprimes of various types].
The only twin pseudoprimes < 2^32 are 4369 (which = 17*257, the product of the Fermat primes F2 and F3) and 4371.
<snip>
*/[/code][/QUOTE]

The paper [url=https://www.maths.lancs.ac.uk/jameson/carpsp.pdf]Carmichael numbers and pseudoprimes[/url] may be instructive.

In particular, by the result labelled 2.1, both 17 and 257 are == 1 (mod 16) and both divide 2^16 - 1, so their product is automatically 2-psp.
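
Spelled out for n = 17*257 = 4369, the argument runs as follows (this is a worked check of the statement above, not a quotation of the paper's result 2.1):

```latex
\begin{align*}
2^{16} - 1 &= 65535 = 3 \cdot 5 \cdot 17 \cdot 257
  &&\Rightarrow\quad 2^{16} \equiv 1 \pmod{17 \cdot 257},\\
n - 1 &= 17 \cdot 257 - 1 = 4368 = 16 \cdot 273
  &&\Rightarrow\quad 2^{\,n-1} = \left(2^{16}\right)^{273} \equiv 1 \pmod{n}.
\end{align*}
```

The same reasoning covers F3*F4 = 257*65537 = 16843009: both factors are == 1 (mod 32) and both divide 2^32 - 1.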

Statistics for various types of pseudoprimes up to 10[sup]k[/sup] for k = 9 to 14 are compiled [url=http://ntheory.org/pseudoprimes.html]here[/url].

Tables of base-2 pseudoprimes up to 2[sup]64[/sup] (list of numbers, list of factorizations, annotated list with factorizations and indicating Carmichael numbers, all compressed, .txt.bz2) may be downloaded from [url=http://www.cecm.sfu.ca/Pseudoprimes/index-2-to-64.html]here[/url].

 ewmayer 2019-05-12 19:26

FYI, it seems the pseudoprime page I linked in my code comment has moved to [url]http://www.numericana.com/answer/pseudo.htm#pseudoprime[/url] .

All times are UTC.