mersenneforum.org PRP for F33?
 Register FAQ Search Today's Posts Mark Forums Read

 2020-10-02, 18:26 #1 gLauss   Nov 2014 2×7 Posts PRP for F33? Hi, first of all, sorry if this is a noob question. I am wondering why noone has proven the (non-)primality of F33 with PRP testing. It will be a huge task, but should be doable, shouldn't it? F33 is as large as M8.589.934.591, or in the 8G range. Someone has done a PRP Test for a 1G exponent mersenne.ca reports 62.000 GHzd for F30 So a F33 PRP test will be around 8 times this, i.e. around half a million GHz Days. An 8G number should easily fit in the RAM of a powerful computer, too. So why is noone attempting this task? Of course it would require something like an Amazon c5.metal instance, but if I would be Ben Delo and pushing 50 million GHz Days in the last year, then I would use 1% of this in order to prove the compositeness of F33... Last fiddled with by gLauss on 2020-10-02 at 18:38
 2020-10-02, 19:05 #2 ATH Einyen     Dec 2003 Denmark 3×5×199 Posts It is ~8.6 times as many iterations as the 1G exponent, but it would require a much much larger FFT size, so each iteration would take much much longer. Not sure how much longer but total running time is >50 times longer than the 1G exponent and probably a lot more. Another huge problem is that there are currently no programs that can do a large enough FFT size for a this number. I think the largest currently is CUDALucas 65536K (64M) FFT which is "only" enough for ~1.14G exponents. Last fiddled with by ATH on 2020-10-02 at 19:08
2020-10-02, 21:05   #3
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

119716 Posts

Quote:
 Originally Posted by gLauss [*]So a F33 PRP test will be around 8 times this, i.e. around half a million GHz Days.[/LIST]
This line is where you fail. FFT size rises with the size of the exponent, mostly linearly. So, you have a quadratic increase in testing time: ~65 times as long a task as F30.

Once Ernst or Mihai or George extend the software to handle it, someone may well take it on- but at 8-9x the time estimate you gave, you should see that it's not obvious that it needs doing on current hardware.

2020-10-02, 23:28   #4
mathwiz

Mar 2019

3·43 Posts

Quote:
 Originally Posted by VBCurtis This line is where you fail. FFT size rises with the size of the exponent, mostly linearly. So, you have a quadratic increase in testing time: ~65 times as long a task as F30. Once Ernst or Mihai or George extend the software to handle it, someone may well take it on- but at 8-9x the time estimate you gave, you should see that it's not obvious that it needs doing on current hardware.
Is there a way we could do a back-of-the-envelope calculation: assuming GpuOwl supported Fermat numbers, what FFT size would be required, and how long would a run take on V100/A100/RTX3090 etc?

 2020-10-03, 00:26 #5 VBCurtis     "Curtis" Feb 2005 Riverside, CA 3·19·79 Posts Sure- double the exponent, double the FFT size. Then add a little bit more, say 10% extra per doubling. forumite ewmeyer, the developer of cudalucas and mlucas, has done the largest Fermat test that I know of. See https://mersenneforum.org/showthread.php?t=21544 for details of his efforts to run F30. I believe he used 32768K FFT size for F30, so something like 262144K or whatever the next step larger would be sufficient to test F33. F30 took a really long time, though- I'm not seeing F33 as a reasonable proposition until GPUs are an order of magnitude faster.
2020-10-03, 03:37   #6
ATH
Einyen

Dec 2003
Denmark

3·5·199 Posts

Quote:
 Originally Posted by ATH I think the largest currently is CUDALucas 65536K (64M) FFT which is "only" enough for ~1.14G exponents.
Actually Prime95 also have 64M AVX512 FFT, and it can probably go a bit higher than 1.14G exponents I would guess, have not tested.
I have not checked FMA FFT recently in Prime95, but it goes to at least 50M.

2020-10-03, 05:07   #7
LaurV
Romulan Interpreter

Jun 2011
Thailand

230316 Posts

Quote:
 Originally Posted by VBCurtis forumite ewmeyer, the developer of cudalucas and mlucas, has done the largest Fermat test that I know of.
Small correction, forumite msft made cudaLucas, the first GPU-enabled LL test ever, using Nvidia cuFFT libraries, albeit at the time was not called cudaLucas. Other people contributed to it and improved it a lot, later, as well as renamed it to the actual name, but Ernst had not much of a hand in it. We however agree about the other two things - Ernst being the "daddy" of mlucas, which brings LL test to other non-x86 architectures, and he being the one testing the largest fermat number "pepinned to the ground" today. (Ha! I think I coined a new term! )

Last fiddled with by LaurV on 2020-10-03 at 05:08

 2020-10-03, 09:39 #8 gLauss   Nov 2014 1410 Posts Thanks for the answers. I now understand why it wasn't done before: It is actually a few million Ghz-Days of work. There is no software which can deal with this number yet. While a few million GHz-Days of computation are done by large GIMPS contributors, the problem is that all of this is a single big computation and you cannot simply run it on N different computers in parallel. The best iteration times back in 2017 were around 440ms, see this post from the thread linked above. So the calculation would actually take around 8G*0.44s or around 120 years which is longer than the lifetime of a human. Even if we factor in that one might switch the computation to the newest generation of CPU/GPU every few years, it still would require at least a few decades of effort and quite a big investment.
2020-10-03, 14:01   #9
ATH
Einyen

Dec 2003
Denmark

3·5·199 Posts

Quote:
 Originally Posted by gLauss The best iteration times back in 2017 were around 440ms, see this post from the thread linked above.
Ok, so #2 on the list is not valid then, there is a software that can handle it: Mlucas.

Since Ernst got an iteration time on F33, Mlucas must have big enough FFT to handle it, and now that I check I can see it goes to at least 240M FFT (245760K) in an old self-test I did on Mlucas.

But the 2 other points are still valid, it takes way too long and it cannot be distributed to many computers.

Last fiddled with by ATH on 2020-10-03 at 14:01

 2020-10-03, 18:15 #10 alpertron     Aug 2002 Buenos Aires, Argentina 53A16 Posts Before even thinking on performing the primality check of F33, a lot more trial factoring is needed and then someone has to run P-1.
 2020-10-03, 18:38 #11 firejuggler     Apr 2010 Over the rainbow 13×191 Posts recently , I factored (PM1)a 535 M exponent with PM1, the FFT lenght was "fft-length":31457280 aka 30720k. Code: [Tue Sep 1 02:03:45 2020] FFTlen=30720K, Type=3, Arch=4, Pass1=1536, Pass2=20480, clm=4 (3 cores, 1 worker): 67.53 ms. Throughput: 14.81 iter/sec. FFTlen=30720K, Type=3, Arch=4, Pass1=1536, Pass2=20480, clm=2 (3 cores, 1 worker): 66.36 ms. Throughput: 15.07 iter/sec. FFTlen=30720K, Type=3, Arch=4, Pass1=1536, Pass2=20480, clm=1 (3 cores, 1 worker): 75.09 ms. Throughput: 13.32 iter/sec. FFTlen=30720K, Type=3, Arch=4, Pass1=2048, Pass2=15360, clm=4 (3 cores, 1 worker): 64.08 ms. Throughput: 15.61 iter/sec. FFTlen=30720K, Type=3, Arch=4, Pass1=2048, Pass2=15360, clm=2 (3 cores, 1 worker): 64.12 ms. Throughput: 15.60 iter/sec. FFTlen=30720K, Type=3, Arch=4, Pass1=2048, Pass2=15360, clm=1 (3 cores, 1 worker): 66.83 ms. Throughput: 14.96 iter/sec. so, if the exponent is 16 time higher, I guess... 1 sec by iteration? (that was on Prime95)