mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   FermatSearch (https://www.mersenneforum.org/forumdisplay.php?f=133)
-   -   PRP for F33? (https://www.mersenneforum.org/showthread.php?t=26030)

gLauss 2020-10-02 18:26

PRP for F33?
 
Hi,
first of all, sorry if this is a noob question. I am wondering why noone has proven the (non-)primality of F33 with PRP testing. It will be a huge task, but should be doable, shouldn't it?
[LIST][*]F33 is as large as M8.589.934.591, or in the 8G range.[*]Someone has done a PRP Test for [URL="https://www.mersenne.ca/exponent/999999937"]a 1G exponent[/URL][*]mersenne.ca reports [URL="https://www.mersenne.ca/credit.php?f_exponent=30&worktype=LL"]62.000 GHzd[/URL] for F30[*]So a F33 PRP test will be around 8 times this, i.e. around half a million GHz Days.[*]An 8G number should easily fit in the RAM of a powerful computer, too.[/LIST]So why is noone attempting this task? Of course it would require something like an Amazon c5.metal instance, but if I would be Ben Delo and pushing 50 million GHz Days in the last year, then I would use 1% of this in order to prove the compositeness of F33...

ATH 2020-10-02 19:05

It is ~8.6 times as many iterations as the 1G exponent, but it would require a much much larger FFT size, so each iteration would take much much longer. Not sure how much longer but total running time is >50 times longer than the 1G exponent and probably a lot more.

Another huge problem is that there are currently no programs that can do a large enough FFT size for a this number. I think the largest currently is CUDALucas 65536K (64M) FFT which is "only" enough for ~1.14G exponents.

VBCurtis 2020-10-02 21:05

[QUOTE=gLauss;558648][*]So a F33 PRP test will be around 8 times this, i.e. around half a million GHz Days.[/LIST][/QUOTE]
This line is where you fail. FFT size rises with the size of the exponent, mostly linearly. So, you have a quadratic increase in testing time: ~65 times as long a task as F30.

Once Ernst or Mihai or George extend the software to handle it, someone may well take it on- but at 8-9x the time estimate you gave, you should see that it's not obvious that it needs doing on current hardware.

mathwiz 2020-10-02 23:28

[QUOTE=VBCurtis;558665]This line is where you fail. FFT size rises with the size of the exponent, mostly linearly. So, you have a quadratic increase in testing time: ~65 times as long a task as F30.

Once Ernst or Mihai or George extend the software to handle it, someone may well take it on- but at 8-9x the time estimate you gave, you should see that it's not obvious that it needs doing on current hardware.[/QUOTE]

Is there a way we could do a back-of-the-envelope calculation: assuming GpuOwl supported Fermat numbers, what FFT size would be required, and how long would a run take on V100/A100/RTX3090 etc?

VBCurtis 2020-10-03 00:26

Sure- double the exponent, double the FFT size. Then add a little bit more, say 10% extra per doubling.

forumite ewmeyer, the developer of cudalucas and mlucas, has done the largest Fermat test that I know of. See [url]https://mersenneforum.org/showthread.php?t=21544[/url] for details of his efforts to run F30.

I believe he used 32768K FFT size for F30, so something like 262144K or whatever the next step larger would be sufficient to test F33.

F30 took a really long time, though- I'm not seeing F33 as a reasonable proposition until GPUs are an order of magnitude faster.

ATH 2020-10-03 03:37

[QUOTE=ATH;558656]I think the largest currently is CUDALucas 65536K (64M) FFT which is "only" enough for ~1.14G exponents.[/QUOTE]

Actually Prime95 also have 64M AVX512 FFT, and it can probably go a bit higher than 1.14G exponents I would guess, have not tested.
I have not checked FMA FFT recently in Prime95, but it goes to at least 50M.

LaurV 2020-10-03 05:07

[QUOTE=VBCurtis;558685]forumite ewmeyer, the developer of cudalucas and mlucas, has done the largest Fermat test that I know of.[/QUOTE]
Small correction, forumite msft made cudaLucas, the first GPU-enabled LL test ever, using Nvidia cuFFT libraries, albeit at the time was not called cudaLucas. Other people contributed to it and improved it a lot, later, as well as renamed it to the actual name, but Ernst had not much of a hand in it. We however agree about the other two things - Ernst being the "daddy" of mlucas, which brings LL test to other non-x86 architectures, and he being the one testing the largest fermat number "[URL="https://en.wikipedia.org/wiki/P%C3%A9pin%27s_test"]pepinned [/URL]to the ground" today. (Ha! I think I coined a new term! :razz:)

gLauss 2020-10-03 09:39

Thanks for the answers. I now understand why it wasn't done before:
[LIST=1][*]It is actually a few million Ghz-Days of work.[*]There is no software which can deal with this number yet.[*]While a few million GHz-Days of computation are done by large GIMPS contributors, the problem is that all of this is a single big computation and you cannot simply run it on N different computers in parallel. The best iteration times back in 2017 were around 440ms, see [URL="https://mersenneforum.org/showpost.php?p=453841&postcount=287"]this post from the thread linked above.[/URL] So the calculation would actually take around 8G*0.44s or around 120 years which is longer than the lifetime of a human. Even if we factor in that one might switch the computation to the newest generation of CPU/GPU every few years, it still would require at least a few decades of effort and quite a big investment.[/LIST]

ATH 2020-10-03 14:01

[QUOTE=gLauss;558726]The best iteration times back in 2017 were around 440ms, see [URL="https://mersenneforum.org/showpost.php?p=453841&postcount=287"]this post from the thread linked above.[/URL][/QUOTE]

Ok, so #2 on the list is not valid then, there is a software that can handle it: Mlucas.

Since Ernst got an iteration time on F33, Mlucas must have big enough FFT to handle it, and now that I check I can see it goes to at least 240M FFT (245760K) in an old self-test I did on Mlucas.

But the 2 other points are still valid, it takes way too long and it cannot be distributed to many computers.

alpertron 2020-10-03 18:15

Before even thinking on performing the primality check of F33, a lot more trial factoring is needed and then someone has to run P-1.

firejuggler 2020-10-03 18:38

recently , I factored (PM1)a 535 M exponent with PM1, the FFT lenght was "fft-length":31457280 aka 30720k.
[code]
[Tue Sep 1 02:03:45 2020]
FFTlen=30720K, Type=3, Arch=4, Pass1=1536, Pass2=20480, clm=4 (3 cores, 1 worker): 67.53 ms. Throughput: 14.81 iter/sec.
FFTlen=30720K, Type=3, Arch=4, Pass1=1536, Pass2=20480, clm=2 (3 cores, 1 worker): 66.36 ms. Throughput: 15.07 iter/sec.
FFTlen=30720K, Type=3, Arch=4, Pass1=1536, Pass2=20480, clm=1 (3 cores, 1 worker): 75.09 ms. Throughput: 13.32 iter/sec.
FFTlen=30720K, Type=3, Arch=4, Pass1=2048, Pass2=15360, clm=4 (3 cores, 1 worker): 64.08 ms. Throughput: 15.61 iter/sec.
FFTlen=30720K, Type=3, Arch=4, Pass1=2048, Pass2=15360, clm=2 (3 cores, 1 worker): 64.12 ms. Throughput: 15.60 iter/sec.
FFTlen=30720K, Type=3, Arch=4, Pass1=2048, Pass2=15360, clm=1 (3 cores, 1 worker): 66.83 ms. Throughput: 14.96 iter/sec.
[/code]
so, if the exponent is 16 time higher, I guess... 1 sec by iteration? (that was on Prime95)


All times are UTC. The time now is 07:15.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.