mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

prime7989 2012-09-01 03:37

How to modify CUDALucas slightly
 
[QUOTE=flashjh;309768][INDENT]Go [URL="http://sourceforge.net/projects/cudalucas/files/2.04%20Beta/"]here[/URL]. The lastest build is 28 Aug 2012
[/INDENT]Agree, and many others! I just make it compile on Windows :smile:[/QUOTE]
How and where in code can one modify CUDALucas sources to accept fermat numbers
with Fn=2^2^n+1 and x[0]=5.0 with everything else being the same as the original CUDALucas?
Change input: from 2^p-1 to 2^2^n+1
Change the start of S_0 = 5.0 keeping the recursive formula the same S_(k+1)=(S_k)^2-2
And change the number of iterations from p-2 to 2^n-2
Changing the modulus from 2^p-1 to Fn=2^2^n+1
Again as before checking that S_(2^n-2)==0(mod Fn)
instead of S_(p-2)==0(mod Mp) where Mp=2^p-1
Can anyone post the please modify the CUDALucas code and post it?
I can try to check the code.

Dubslow 2012-09-01 04:15

[QUOTE=prime7989;309895]How and where in code can one modify CUDALucas sources to accept fermat numbers
with Fn=2^2^n+1 and x[0]=5.0 with everything else being the same as the original CUDALucas?
Change input: from 2^p-1 to 2^2^n+1
Change the start of S_0 = 5.0 keeping the recursive formula the same S_(k+1)=(S_k)^2-2
And change the number of iterations from p-2 to 2^n-2
Changing the modulus from 2^p-1 to Fn=2^2^n+1
Again as before checking that S_(2^n-2)==0(mod Fn)
instead of S_(p-2)==0(mod Mp) where Mp=2^p-1
Can anyone post the please modify the CUDALucas code and post it?
I can try to check the code.[/QUOTE]

I can do everything except the modulus part. I don't understand [URL="http://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen_algorithm"]SS[/URL] and its implementation enough to do that. Try sending a private message to [URL="http://www.mersenneforum.org/private.php?do=newpm&u=9446"]msft[/URL], who does all the mathy things (he [URL="http://www.mersenneforum.org/showthread.php?t=12576"]created[/URL] CUDALucas).

prime7989 2012-09-01 05:13

[QUOTE=Dubslow;309899]I can do everything except the modulus part. I don't understand [URL="http://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen_algorithm"]SS[/URL] and its implementation enough to do that. Try sending a private message to [URL="http://www.mersenneforum.org/private.php?do=newpm&u=9446"]msft[/URL], who does all the mathy things (he [URL="http://www.mersenneforum.org/showthread.php?t=12576"]created[/URL] CUDALucas).[/QUOTE]
Dear Dubslow,
SS is just the implementation of the FFT for modulus of forms 2^m+1 where m=2^n
Please do not worry about the SS and if you can , can you do implement what I suggested without SS and post it? I will check it and post rresults
Thanks!!!

LaurV 2012-09-01 07:23

That would be completely futile (sorry for playing RDS, sometimes I really miss him :razz:), from the memory limitation point of view. The Fermat numbers we don't know their primality status are already too big to be tested by this method, even with a Tesla K20 (assuming it will hit the market next year, or at the end of this year) you will need many of them...

msft 2012-09-01 08:45

Hi ,prime7989
You can use genefer.c , Fermat number is a subset of GFN.

ET_ 2012-09-01 14:22

[QUOTE=prime7989;309895]How and where in code can one modify CUDALucas sources to accept fermat numbers
with Fn=2^2^n+1 and x[0]=5.0 with everything else being the same as the original CUDALucas?
Change input: from 2^p-1 to 2^2^n+1
Change the start of S_0 = 5.0 keeping the recursive formula the same S_(k+1)=(S_k)^2-2
And change the number of iterations from p-2 to 2^n-2
Changing the modulus from 2^p-1 to Fn=2^2^n+1
Again as before checking that S_(2^n-2)==0(mod Fn)
instead of S_(p-2)==0(mod Mp) where Mp=2^p-1
Can anyone post the please modify the CUDALucas code and post it?
I can try to check the code.[/QUOTE]

I don't like to play the devil, but did you evaluate that F26 would be bigger in size than the actual exponents released by Primenet, and F30 would require more than one year of computation? Nearly all of them (those from F4 to F30) are already known as composites...

Luigi

prime7989 2012-09-01 16:39

[QUOTE=msft;309916]Hi ,prime7989
You can use genefer.c , Fermat number is a subset of GFN.[/QUOTE]
Hi, msft,
genefer.c is a probabilistic test and not deterministic.
Do you know of any Pepins of LLV type deterministic software?

LaurV 2012-09-01 16:39

@ET_: That was what I said in post #1566, but you put it into numbers :D

prime7989 2012-09-01 17:19

Faster Algorithms
 
[QUOTE=ET_;309934]I don't like to play the devil, but did you evaluate that F26 would be bigger in size than the actual exponents released by Primenet, and F30 would require more than one year of computation? Nearly all of them (those from F4 to F30) are already known as composites...

Luigi[/QUOTE]
Hi,msft,Luigi,Dubslow,
The goal is to find faster and faster mathematically correct deterministic algorithms for fermat and Mersennne numbers and other types of numbers so the existing software would be used as a benchmark for already known large fermart or mersenne numbers/primes. So if any one could point out in the cudalucas code where Mp=2^p-1 is calculated and modules by Mp is done i would be gratefull. Code snippets would be useful. Ofcourse I do not know much about the IBDWT
Thank you! Ofcourse the lower bound being Super log time and upper bound a poly based AKS algorithm.

Dubslow 2012-09-01 17:47

[QUOTE=prime7989;309949]Hi,msft,Luigi,Dubslow,
The goal is to find faster and faster mathematically correct deterministic algorithms for fermat and Mersennne numbers and other types of numbers so the existing software would be used as a benchmark for already known large fermart or mersenne numbers/primes. So if any one could point out in the cudalucas code where Mp=2^p-1 is calculated and modules by Mp is done i would be gratefull. Code snippets would be useful. Ofcourse I do not know much about the IBDWT
Thank you! Ofcourse the lower bound being Super log time and upper bound a poly based AKS algorithm.[/QUOTE]

The FFT is done by the CuFFT library, meaning none of the code is actually for the FFTs. If you go [URL="http://sourceforge.net/p/cudalucas/code/38/tree/trunk/CUDALucas.cu?force=True"]here[/URL] and Ctrl+F for "lucas_square (d" you can find the code that runs the test, and see the GPU kernel calls. rdft() does the FFT, and the nomarlize* kernels are the part of the SS algorithm that happens after the FFTs. (Yes, this does use the SS algorithm. It's the fastest practical multiplication algorithm.)

ET_ 2012-09-01 18:55

[QUOTE=prime7989;309949]Hi,msft,Luigi,Dubslow,
The goal is to find faster and faster mathematically correct deterministic algorithms for fermat and Mersennne numbers and other types of numbers so the existing software would be used as a benchmark for already known large fermart or mersenne numbers/primes. So if any one could point out in the cudalucas code where Mp=2^p-1 is calculated and modules by Mp is done i would be gratefull. Code snippets would be useful. Ofcourse I do not know much about the IBDWT
Thank you! Ofcourse the lower bound being Super log time and upper bound a poly based AKS algorithm.[/QUOTE]

To help you, we need to know how you plan to work.
Are you going to look for a better algorithm, or just trying to optimise the existent one?
In the first case, there is no known CODED algorithm that actually compares to LL.
In the second case, you have no chance (as LaurV pointed out), because the kernel of LL algorithm is already coded into the library.

Finally, if you just want to exchange the Mersenne test with a Fermat test, be prepared to fail, because as soon as the test approaches F31, the program will burst into pieces for lack of physical memory...

Luigi


All times are UTC. The time now is 23:15.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.