![]() |
llrp, a portable version of LLR using FFTW
Hi All,
I just released llrp Version 3.8.0, a portable version of LLR 3.8.4 using FFTW. It uses a portable adaptation of George Woltman's gwnum library, but, at the contrary of llrpi which is stand alone, it uses Matteo Frigo and Steven G. Johnson's FFTW library, Version 3.2.2 The main advantage of using this library is that a power of two FFT length is no more required, which may sometimes gain a factor of two in speed! You may download the source and the x86 binaries from my development directory : [url]http://jpenne.free.fr/Development/llrp380devsrc.zip[/url] is the compressed source directory. [url]http://jpenne.free.fr/Development/llrp380dev.zip[/url] is the Windows (console) executable. [url]http://jpenne.free.fr/Development/llrp380devlinux.zip[/url] is the Linux one. [url]http://jpenne.free.fr/Development/llrp380devslinux.zip[/url] is the static Linux one. Indeed, this program is much slower than LLR 3.8.4 when run on x86 platforms, but be not disappointed! George's assembler code is so optimized that I think it could hardly be surpassed on these machines... But I think that this work can help for several other tasks : - It allows to verify positive results on non-x86 machines. - It would not be too difficult (I hope) to implement it on new hardwares such as GPU's, and I think it could help CUDA developpers. Best Regards, Jean |
How difficult would it be to modify to use YEAFFT? YEAFFT is faster than FFTW.
|
YEAFFT usage for LLR
If I can pass an array of doubles to this library, do a real or complex FFT transform on it, get the result in one or two array(s) of doubles... If I can do the same things for the inverse FFT transforms, it would not be difficult to use YEAFFT.
Unfortunately it seems to me that it is not so easy... Am I wrong ? Jean |
[QUOTE=Jean Penné;260846]If I can pass an array of doubles to this library, do a real or complex FFT transform on it, get the result in one or two array(s) of doubles... If I can do the same things for the inverse FFT transforms, it would not be difficult to use YEAFFT.
Unfortunately it seems to me that it is not so easy... Am I wrong ? Jean[/QUOTE] That is a question for Phil Carmody, the originator of phrot, which uses YEAFFT. |
[QUOTE=rogue;260853]That is a question for Phil Carmody, the originator of phrot, which uses YEAFFT.[/QUOTE]
The main advantage of phrot is that it can efficiently test non base two numbers ; I wish to add this feature in my code.... AFAIK, YEAFFT is not really an FFT library, but rather, a part of Glucas internals... It does essentially convolutions products of arrays of doubles, using real direct and inverse FFT transforms, and benefits of IBDWT only in the case of squarings (multiplications ?) modulo a Mersenne number. In short, it is a portable version of George Woltman's gwnum library versions written before year 2004, without the restriction on base two numbers, but with multiplications intricated in its code... What I need is a library that do real and complex direct and inverse FFT transforms, and let me do the dyadic multiplications! Mark, Phil, let me know if I am wrong! Jean |
[QUOTE=Jean Penné;260885]AFAIK, YEAFFT is not really an FFT library, but rather, a part of Glucas internals...
[/quote] Yes. Unfortunately it has not been kept up to date. I don't know what happened to Guillermo. I presume that he lost interest in it years ago. It does support multiplications in addition to squarings. AFAIU, it does not support any complex FFTs. If it could do some of these things (and a few more), I would have attempted to build PFGW with it. |
[QUOTE=rogue;260904]Yes.
Unfortunately it has not been kept up to date. I don't know what happened to Guillermo. I presume that he lost interest in it years ago. It does support multiplications in addition to squarings. AFAIU, it does not support any complex FFTs. If it could do some of these things (and a few more), I would have attempted to build PFGW with it.[/QUOTE] Indeed! Such updates would be really good news, and I am fully agreeing with you now! Jean |
Bug: segfaults when testing 1525*2^700391-1, before starting LL loop.
Seems to fail for every large-k/large-n combination. Nugget:smile: |
[QUOTE=nuggetprime;261552]Bug: segfaults when testing 1525*2^700391-1, before starting LL loop.
Seems to fail for every large-k/large-n combination. Nugget:smile:[/QUOTE] What system are you using? I tested your candidate on my Linux Kubuntu system and had no problem : ~/llrp380devsrc/linuxllr $ ./llrp -a2 -oVerbose=1 -d -q"1525*2^700391-1" Starting Lucas Lehmer Riesel prime test of 1525*2^700391-1 Using zero-padded rational base DWT, FFT length = 86016 V1 = 4 ; Computing U0...done. 1525*2^700391-1 is prime! Time : 10193.227 sec. I am running the same test on Windows 2000 and got no problems until now, but it is three times slower, so only 27% done at this time... Now completed, an no more problem : I:\D\Jean\llrp380dev>llrp -a4 -oVerbose=1 -d -q"1525*2^700391-1" Starting Lucas Lehmer Riesel prime test of 1525*2^700391-1 Using zero-padded rational base DWT, FFT length = 86016 V1 = 4 ; Computing U0...done. 1525*2^700391-1 is prime! Time : 32613.687 sec. Regards, Jean |
[QUOTE]
./gwpnum/gwpnum.c: fwpx = fftw_plan_r2r_1d (n, xin, xout, FFTW_R2HC, FFTW_MEASURE); [/QUOTE] CUFFT with CUDA 4.0 Not support Halfcomplex-format DFT. |
performance comments
A couple of suggestions regarding performance:
1) Try the r2c interface rather than the r2r interface for real-input FFTs. The r2c interface usually gives better performance. In particular, r2c exploits SIMD instructions (e.g. SSE2) whereas r2r does not. (For example, on my x86-64 Xeon for a size-16k transform, r2c is about 60% faster than r2r. I'm assuming you've compiled FFTW with --enable-sse2. The difference gets even larger with FFTW 3.3 on new machines with AVX instructions.) 2) Try FFTW_PATIENT rather than FFTW_MEASURE. For smallish transforms, it doesn't make much difference, but for > 32k it can make a difference, and may be worth it if you are doing lots of of transforms of the same size. Regards, Steven G. Johnson |
| All times are UTC. The time now is 19:38. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.