![]() |
![]() |
#1 |
Apr 2012
5 Posts |
![]()
From http://en.wikipedia.org/wiki/Prime_gap:
The prime gap between two succesive prime is always even (except for 2-3) and is always <512 for small-medium number (up to 304G). We can therefore have a table of primes in which each prime occupy 1 unsigned byte. The following trial division code generates a table of the first 203 280 219 prime-halfgap (there are 203280221 primes up to 2^32), starting from the halfgap(5-3)=1. It is not optimized (it will take up to 1hour). The .dat generated file has MD5=004ECB69C49B63D3A6CEEA0E1B94317B Code:
#include "stdio.h" #include "time.h" int main(void) { time_t start,end; time(&start); unsigned int* smallprime=new unsigned int [7000];//about 7000 primes<65536 unsigned int* pt; unsigned int* pt2; unsigned char* gap=new unsigned char [210000000];//about 210000000 primes<2^32 unsigned int max0; unsigned int max=4294967293;//4294967293 = 2^32 - 3 ; change this to obtain smaller dat file int sq=0;//to stop checking if prime^2>number unsigned int x,flag,t; int smallprimecount=1; unsigned int currprime; smallprime[0]=3;//counting only odd primes; FILE* datfile; unsigned int number; pt=smallprime; t=*pt;t*=t; for (number=5;number<65535;number+=2) { if (t==number) { pt++; t=*pt;t*=t; } else { flag=0; for(pt2=smallprime;pt2<=pt;pt2++) { if ((number%*pt2)==0) { flag=1; break; } } if(flag==0) { smallprime[smallprimecount]=number; smallprimecount++; } } } time(&end); printf("\n%d primes up to %d (t=%d sec)\n",smallprimecount+1,65536,end-start); currprime=smallprime[smallprimecount-1]; max0=currprime;max0*=max0;//we need this to avoid overflow with t (see later) if (max0>max) max0=max; flag=smallprimecount-1;//flag=temp var for(x=0;x<flag;x++) gap[x]=(unsigned char)((smallprime[x+1]-smallprime[x])/2); for(number=65537;number<max0;number+=2) { if (t==number) { pt++; t=*pt;t*=t; } else { flag=0; for(pt2=smallprime;pt2<=pt;pt2++) { if ((number%*pt2)==0) { flag=1; break; } } if(flag==0) { gap[x]=(unsigned char)((number-currprime)/2); currprime=number; x++; } } } //check for primes near 2^32 for(number=max0+2;number<=max;number+=2) { flag=0; for(pt2=smallprime;pt2<=pt;pt2++) { if ((number%*pt2)==0) { flag=1; break; } } if(flag==0) { gap[x]=(unsigned char)((number-currprime)/2); currprime=number; x++; } } time(&end); printf("\n%d primes up to %u (t=%d sec)\n\nWriting %d half-gaps (wait)\n",x+2,max,end-start,x); datfile=fopen("gapprime.dat","wb"); fwrite(gap,1,x,datfile); fclose(datfile); delete [] smallprime; delete [] gap; time(&end); printf("\nDone! (t=%d sec)\n", end-start); getchar(); return 1; } Code:
#include "stdio.h" #include "time.h" int main(void) { time_t start,end; int x; printf("Loading primes... (t=0sec)");time(&start); FILE* datfile=fopen("gapprime.dat","rb"); unsigned char* hgap=new unsigned char [203280219]; fread(hgap,1,203280219,datfile); fclose(datfile); unsigned int* prime=new unsigned int [203280221]; prime[0]=2;prime[1]=3; for(x=0;x<203280219;x++) prime[x+2]=prime[x+1]+2*((unsigned int)(hgap[x])); delete [] hgap;//we no longer need gaps time(&end); printf("\nPrimes loaded (t=%dsec, n.primes=%d)\n\n",end-start,x+2); printf("%u\n%u\n%u\n%u\n%u\n%u\n", prime[0],prime[2],prime[203280220],prime[99999999],prime[50847533],prime[50847534]); delete [] prime; getchar(); return 1; } |
![]() |
![]() |
![]() |
#2 | |
"Bob Silverman"
Nov 2003
North of Boston
22×5×373 Posts |
![]() Quote:
|
|
![]() |
![]() |
![]() |
#3 |
"Forget I exist"
Jul 2009
Dumbassville
26·131 Posts |
![]()
my best idea so far is a forstep loop only hitting numbers that are 1 or 5 mod 6 and writing out only the status of if they are prime as 1 or 0, which if my math is correct should store it in under 171 MB, and that's without RLE etc.
Last fiddled with by science_man_88 on 2012-05-12 at 14:46 |
![]() |
![]() |
![]() |
#4 |
Aug 2006
3×1,993 Posts |
![]()
In the extreme, they can be packed into 0 bytes by recomputing as needed. Different levels of compression are good for different things. But the OP's method is actually in use in at least one major system, so it's certainly useful for some tasks.
|
![]() |
![]() |
![]() |
#5 |
Einyen
Dec 2003
Denmark
2×1,669 Posts |
![]()
http://www.rsok.com/~jrm/printprimes.html
Using this technique you can store primes for every 210 numbers in 6 bytes. I have a file taking 285,714,288 bytes with the first 455052515 primes up to 210*285714288/6 = 10,000,000,080. The first 4 primes 2,3,5,7 is not stored in the file. You can read the file like this (long long is 64bit integer): Code:
#include <stdio.h> #include <stdlib.h> void main () { long long a,b,c,d,e,f,i,prime; long long * primes; FILE *file; int pri[] = {0,1,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,121,127,131,137,139, 143,149,151,157,163,167,169,173,179,181,187,191,193,197,199,209}; primes = (long long*) malloc (455052600*sizeof(long long)); f=0; file = fopen("e:\\primes.dat","rb"); primes[1]=2; primes[2]=3; primes[3]=5; primes[4]=7; prime=4; for(a=0;a<(10000000080/210);a++) { for(b=0;b<6;b++) { d=fgetc(file); e=256; for(c=1;c<=8;c++) { e=e/2; if (d>=e) { d=d-e; prime++; primes[prime]=f+pri[b*8+c]; } } } f=f+210; } } |
![]() |
![]() |
![]() |
#6 |
Apr 2012
5 Posts |
![]()
I was indeed looking for the 'best' trade-off between loading speed and storage size.
My method as the advantage to do very few ops (some pointer increasing, an addition and a times2 operation) to calculate primes. For primes up to 2^32, the only two ways to do less ops is 1)Full store the primes (this require 800M instead of 200M) 2)Remove the *2 operation and store unsigned short gaps (400M) The ATH approach is also interesting, i will take a look. It requires lot of more ops, but i think that the weak point of loading primes is to open the file and read, so some more ops are welcome. There is also some issue on compressing: if i .7z my .dat I can compress to about 50%, since I usually do not have high gaps. I think this can be done also for the ATH .dat file (you will have lots of 0 in your .dat file). I realize that i have used the same ATH idea, but only for mod%2. Moreover ATH does not use gaps, so we can combine the two methods: here some ideas: 1)We can use a larger sieve: up to 13# (30030 is still a short). We can store this sieve in a sieve.dat file 2)We can store each prime with an average of 4bits (half byte). We will use the value 0-14 for small gaps, and the value 15 for higher gaps (we will read another (or more) half-byte in this case). I will take a look next week. PS: @ATH: you should remove the first 0 from the sieve; also the first prime in c/c++ is primes[0]. Also i suggest you to use some more memory and load the .dat file at once, using the fread function once rather than 450MTimes fgetc. You can delete [] this array once you have finished to calculate primes. |
![]() |
![]() |
![]() |
#7 |
Mar 2010
26·3 Posts |
![]()
This is exactly what I did to pack prime numbers from 2 to 2^32. The received file, I called if "dtst", has a size 203316597 bytes. I used it for my "Number Theory Calculator", which can be downloaded from http://www.literka.addr.com/rchelp/installthcalc.exe. It can be used for factorizations, but with this respect, program is not competitive with existing programs.
I have updates of this program, but I did not upload them. Last fiddled with by literka on 2012-05-12 at 21:20 |
![]() |
![]() |
![]() |
#8 | |
Einyen
Dec 2003
Denmark
1101000010102 Posts |
![]() Quote:
I know arrays start at index 0 but if I put primes in an array I prefer to start at 1 so the n'th prime is at index n. Alternately you can create a RAM disk and put the primes.dat on it (it's just 285 Mb) and just access the primes you need directly with fseek. If you just need primes up to 2^32 it will take only round(2^32*6/210) + 1 = 122,713,352 bytes. Last fiddled with by ATH on 2012-05-13 at 04:10 |
|
![]() |
![]() |
![]() |
#9 |
Einyen
Dec 2003
Denmark
2·1,669 Posts |
![]()
This will take (2-1)*(3-1)*(5-1)*(7-1)*(11-1)*(13-1) = 5760 bits = 720 bytes for each chunk of 30030 numbers. It will save 1-(720/30030)/(6/210) ~ 16% space compared to the 210 in 6 bytes method but it will require a sieve up to 30030 instead of 210 to calculate the primes, not sure that is worth it for 16% less space but yes we can store the sieve in a file as well.
Last fiddled with by ATH on 2012-05-13 at 04:27 |
![]() |
![]() |
![]() |
#10 |
Jan 2008
France
56810 Posts |
![]()
I'm wondering if it is really worth the pain storing primes in a file given how fast a carefully implemented sieve can be.
|
![]() |
![]() |
![]() |
#11 |
Apr 2012
5 Posts |
![]()
The issue is not only about hd storage. Suppose you want to find all primes (say) from 10000000000G To 10000000050G (unsigned int64). You need somewhere all primes up to 2^32. And you may want to have a good trade-off between memory sotrage and computation efficency.
|
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Prime gaps | Terence Schraut | Miscellaneous Math | 10 | 2020-09-01 23:49 |
Dr Nicely's 1st occurrence Prime Gaps | rudy235 | Prime Gap Searches | 192 | 2020-02-06 09:48 |
Welcome to the Prime Gaps Search Forum | robert44444uk | Prime Gap Searches | 2 | 2019-09-23 01:00 |
Gaps and more gaps on <300 site | gd_barnes | Riesel Prime Search | 11 | 2007-06-27 04:12 |
A Lot Of Storage | moo | Hardware | 0 | 2005-11-23 03:16 |