mersenneforum.org Prime representing constant program for calculation of the digits
 Register FAQ Search Today's Posts Mark Forums Read

 2021-01-28, 23:57 #1 Viliam Furik     "Viliam Furík" Jul 2018 Martin, Slovakia 10111010012 Posts Prime representing constant program for calculation of the digits Hello, not so long ago I have posted in the y-cruncher subforum about a program for the Prime representing constant (PRC). Since then I have developed quite a fast program, thanks to the help of casevh, author of gmpy2 library, and successfully computed 1 billion (109) digits of PRC. I have also finally managed to figure out how to parallelize the algorithm, but that led me to another problem... Next big milestone after 109 digits is 1012. That requires 1 TB of disk space to contain all digits in a text file (or less if any compression is used), computing through primes up to about 2,5 * 1012, basically the same amount of RAM (that problem has a solution - writing the intermediary numbers to disk, and reading them from there, as y-cruncher does, or at least how I understand it does it), but most importantly, it needs to be able to sieve out the primes themselves. Storing 2,5 * 1012 bits in RAM would require about 300 GB of RAM space, which is too much for what I seek. Sieving them in a file would bring up a lot of hassle with making a procedure that could do such thing and another that can read the actual values from there. These might not be extremely hard problems, but there may be easier ways to do this. So I would like to ask, whether anyone can help with some idea on how to effectively handle such gigantic sized numbers and amounts of data. BTW, I am also thinking about making it into a GPU program, it could play nicely with TF high-performing GPUs (basically INT32), as the primes are not that big, compared to the blob of prime jelly that is being computed. If somebody has some ideas about this too, I will be grateful for any advice. BTW2: Are there any volunteers who would like to test the upcoming C version of the code? It will have the multi-core computation implemented in it. Last fiddled with by Viliam Furik on 2021-01-29 at 12:49
2021-01-29, 09:53   #2
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2×5×83 Posts

What are the requirements for the way you want to store the primes ((either) in RAM or on disk?). If none, you should be able to cut the memory requirement roughly in half, maybe even more. There are nearly 91 billion primes up to 2.5$$\cdot$$1012. If you would store the differences between primes like this instead of their exact values, it will save a lot of memory, but you can only access a single item by reading all values before it, so you would have random access in O(n). One could optimize this a lot by having intermediate values every so often, which might slightly increase memory demands, but will improve random access greatly.

Quote:
 Originally Posted by Viliam Furik BTW2: Are there any volunteers who would like to test the upcoming C version of the code? It will have the multi-core computation implemented in it.
Yes.

2021-01-29, 11:16   #3
Batalov

"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

22×32×269 Posts

Quote:
 Originally Posted by Viliam Furik Hello, not so long ago I have posted in the y-cruncher subforum about a program for the Prime generating constant (PGC). Since then I have developed quite a fast program, thanks to the help of casevh, author of gmpy2 library, and successfully computed 1 billion (109) digits of PGC.
...but then you can generate a 1 billion digits prime, can you?

Which 'Prime generating constant' are you calculating? Mills'?

Or are you calculating Euler–Mascheroni gamma?

 2021-01-29, 11:44 #4 kruoli     "Oliver" Sep 2017 Porta Westfalica, DE 33E16 Posts In the last thread he mentioned (this one), he wanted to compute the constant described in this paper.
 2021-01-29, 12:24 #5 Dr Sardonicus     Feb 2017 Nowhere 19·281 Posts Yes, I believe the original "Prime representing constant" was a better word choice.
2021-01-29, 12:48   #6
Viliam Furik

"Viliam Furík"
Jul 2018
Martin, Slovakia

10111010012 Posts

Quote:
 Originally Posted by Dr Sardonicus Yes, I believe the original "Prime representing constant" was a better word choice.
Oopsie... My bad, I confused the name. Will correct it.

 2021-01-29, 13:13 #7 Viliam Furik     "Viliam Furík" Jul 2018 Martin, Slovakia 5×149 Posts @kruoli: As of now, the program uses a bit array for the primes, which it iterates. The parallelized algorithm relies on some way to store the primes, and then it chops the list of primes into about equal pieces (whether by the length of the range, or prime count in that range - about the same for these quantities of primes), saves the actual value of the last prime in each range (one range per worker), and when the actual values are needed in the computation, it calculates the from the position in the range, range number (used for finding the last prime in previous range) and if running from a savefile, also the last prime used in the computation of the savefile. BTW, I forgot to mention, that I need a way to efficiently sieve them, too. Sieving is done quite easily in RAM, but sieving on disk is a bit more complicated, albeit not too complicated. To be clear, I am not thinking about using other methods than Eratosthenes (unless they are more efficient), but the storage of bits representing the integers is the tricky part when you need a lot of them. Storing only the prime gaps seems to be a lot better, but is there a better way to store the sieving data? @Batalov: As kruoli kindly explained, I am calculating the Prime representing constant, which can only generate (thus my confusion in the name) primes which were used in its computation, using the recursive formula. So 1 billion digits of the constant only suffice for primes under about 2 302 586 000. Using the formula to generate the primes back is perhaps the only way to confirm the correctness of the digits, apart from computing them with some different method (there is another formula, but it is equivalent to the one mentioned in the paper - it is basically the same, only the one in the paper has its terms grouped, thus has fewer terms for the same amount of digits, but the terms take longer to compute, so it equals out).
 2021-01-29, 13:57 #8 kruoli     "Oliver" Sep 2017 Porta Westfalica, DE 11001111102 Posts Sieving itself should not be a problem. You should use primesieve if you are not eager to implement everything yourself: Code: $/usr/bin/time -v primesieve 2500000000000 Sieve size = 256 KiB Threads = 16 100% Seconds: 62.344 Primes: 90882915772 Command being timed: "primesieve 2500000000000" User time (seconds): 992.74 System time (seconds): 0.12 Percent of CPU this job got: 1592% Elapsed (wall clock) time (h:mm:ss or m:ss): 1:02.35 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 47436 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 6 Minor (reclaiming a frame) page faults: 15683 Voluntary context switches: 210 Involuntary context switches: 4092 Swaps: 0 File system inputs: 928 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 That's not even 50 MB. The space requirement for that command was in $$\mathcal{O}(\sqrt n)$$. If you use the API of primesieve and do execute a command that is not only counting, it will of course take a bit longer, but not in a way that would cause delays. So my next question would be: Do you really need to store the primes? According to your last thread and the paper, you will only use one prime at a time, in order. Only in the division step in the end, you use the primorial up to your largest used prime, and you do not need all primes stored for that. For an efficient implementation, you'd have to store in the order of [$]\mathcal{O}(\log p_\text{count})[/\$] primes at any given time. The problem would be the result (divisor) getting large quickly. Last fiddled with by kruoli on 2021-01-29 at 14:44 Reason: Being pedantic. Removed something I'm not yet sure about.
2021-01-29, 15:55   #9
Viliam Furik

"Viliam Furík"
Jul 2018
Martin, Slovakia

5×149 Posts

Quote:
 Originally Posted by kruoli So my next question would be: Do you really need to store the primes?
I don't think I can do it without storing them. Primorial is the product of all the primes used. In the computation of the numerator, each prime is used once, that's true. I have looked at the primesieve_next_prime function, but I don't think it can be faster to compute the primes instead of storing some sort of a list of them.

2021-01-29, 16:18   #10
Dr Sardonicus

Feb 2017
Nowhere

10100110110112 Posts

Quote:
 Originally Posted by Viliam Furik Oopsie... My bad, I confused the name. Will correct it.

Blogging, as it should be!

 2021-02-12, 19:22 #11 Viliam Furik     "Viliam Furík" Jul 2018 Martin, Slovakia 13518 Posts Quick question, does GMP, when using C on Windows, have to be compiled or something? It seems to me it does, as there is only a source code available on the website. I assume the answer will be the same for the primesieve library.

 Similar Threads Thread Thread Starter Forum Replies Last Post Viliam Furik y-cruncher 13 2021-01-14 09:43 petrw1 Marin's Mersenne-aries 8 2020-09-13 16:07 Citrix Open Projects 18 2013-08-24 05:24 Stargate38 Factoring 24 2011-11-03 00:34 Orgasmic Troll Math 28 2004-12-02 16:37

All times are UTC. The time now is 23:15.

Mon Jan 17 23:15:23 UTC 2022 up 178 days, 17:44, 0 users, load averages: 1.11, 1.24, 1.32