mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Software

Reply
 
Thread Tools
Old 2011-05-08, 13:15   #1
Jean Penné
 
Jean Penné's Avatar
 
May 2004
FRANCE

13·43 Posts
Default llrp, a portable version of LLR using FFTW

Hi All,

I just released llrp Version 3.8.0, a portable version of LLR 3.8.4 using FFTW.
It uses a portable adaptation of George Woltman's gwnum library, but, at
the contrary of llrpi which is stand alone, it uses Matteo Frigo and Steven G. Johnson's FFTW library, Version 3.2.2
The main advantage of using this library is that a power of two FFT length
is no more required, which may sometimes gain a factor of two in speed!

You may download the source and the x86 binaries from my development directory :

http://jpenne.free.fr/Development/llrp380devsrc.zip is the compressed source directory.
http://jpenne.free.fr/Development/llrp380dev.zip is the Windows (console) executable.
http://jpenne.free.fr/Development/llrp380devlinux.zip is the Linux one.
http://jpenne.free.fr/Development/llrp380devslinux.zip is the static Linux one.

Indeed, this program is much slower than LLR 3.8.4 when run on x86 platforms, but be not disappointed! George's assembler code is so optimized that I think it could hardly be surpassed on these machines...
But I think that this work can help for several other tasks :

- It allows to verify positive results on non-x86 machines.
- It would not be too difficult (I hope) to implement it on new hardwares such as GPU's, and I think it could help CUDA developpers.

Best Regards,
Jean
Jean Penné is offline   Reply With Quote
Old 2011-05-08, 15:48   #2
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

10110101110002 Posts
Default

How difficult would it be to modify to use YEAFFT? YEAFFT is faster than FFTW.
rogue is offline   Reply With Quote
Old 2011-05-08, 19:12   #3
Jean Penné
 
Jean Penné's Avatar
 
May 2004
FRANCE

13·43 Posts
Default YEAFFT usage for LLR

If I can pass an array of doubles to this library, do a real or complex FFT transform on it, get the result in one or two array(s) of doubles... If I can do the same things for the inverse FFT transforms, it would not be difficult to use YEAFFT.
Unfortunately it seems to me that it is not so easy...
Am I wrong ?
Jean
Jean Penné is offline   Reply With Quote
Old 2011-05-08, 20:16   #4
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

23×727 Posts
Default

Quote:
Originally Posted by Jean Penné View Post
If I can pass an array of doubles to this library, do a real or complex FFT transform on it, get the result in one or two array(s) of doubles... If I can do the same things for the inverse FFT transforms, it would not be difficult to use YEAFFT.
Unfortunately it seems to me that it is not so easy...
Am I wrong ?
Jean
That is a question for Phil Carmody, the originator of phrot, which uses YEAFFT.
rogue is offline   Reply With Quote
Old 2011-05-09, 06:33   #5
Jean Penné
 
Jean Penné's Avatar
 
May 2004
FRANCE

22F16 Posts
Default

Quote:
Originally Posted by rogue View Post
That is a question for Phil Carmody, the originator of phrot, which uses YEAFFT.
The main advantage of phrot is that it can efficiently test non base two numbers ; I wish to add this feature in my code....

AFAIK, YEAFFT is not really an FFT library, but rather, a part of Glucas internals...
It does essentially convolutions products of arrays of doubles, using real direct and inverse FFT transforms, and benefits of IBDWT only in the case of squarings (multiplications ?) modulo a Mersenne number.
In short, it is a portable version of George Woltman's gwnum library versions written before year 2004, without the restriction on base two numbers, but with multiplications intricated in its code...

What I need is a library that do real and complex direct and inverse FFT transforms, and let me do the dyadic multiplications!

Mark, Phil, let me know if I am wrong!

Jean
Jean Penné is offline   Reply With Quote
Old 2011-05-09, 12:30   #6
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

23×727 Posts
Default

Quote:
Originally Posted by Jean Penné View Post
AFAIK, YEAFFT is not really an FFT library, but rather, a part of Glucas internals...
Yes.

Unfortunately it has not been kept up to date. I don't know what happened to Guillermo. I presume that he lost interest in it years ago. It does support multiplications in addition to squarings. AFAIU, it does not support any complex FFTs.

If it could do some of these things (and a few more), I would have attempted to build PFGW with it.

Last fiddled with by rogue on 2011-05-09 at 12:31
rogue is offline   Reply With Quote
Old 2011-05-09, 12:58   #7
Jean Penné
 
Jean Penné's Avatar
 
May 2004
FRANCE

22F16 Posts
Default

Quote:
Originally Posted by rogue View Post
Yes.

Unfortunately it has not been kept up to date. I don't know what happened to Guillermo. I presume that he lost interest in it years ago. It does support multiplications in addition to squarings. AFAIU, it does not support any complex FFTs.

If it could do some of these things (and a few more), I would have attempted to build PFGW with it.
Indeed! Such updates would be really good news, and I am fully agreeing with you now!

Jean
Jean Penné is offline   Reply With Quote
Old 2011-05-16, 18:52   #8
nuggetprime
 
nuggetprime's Avatar
 
Mar 2007
Austria

2·151 Posts
Default

Bug: segfaults when testing 1525*2^700391-1, before starting LL loop.
Seems to fail for every large-k/large-n combination.

Nugget
nuggetprime is offline   Reply With Quote
Old 2011-05-18, 05:21   #9
Jean Penné
 
Jean Penné's Avatar
 
May 2004
FRANCE

22F16 Posts
Default

Quote:
Originally Posted by nuggetprime View Post
Bug: segfaults when testing 1525*2^700391-1, before starting LL loop.
Seems to fail for every large-k/large-n combination.

Nugget
What system are you using?

I tested your candidate on my Linux Kubuntu system and had no problem :

~/llrp380devsrc/linuxllr $ ./llrp -a2 -oVerbose=1 -d -q"1525*2^700391-1"
Starting Lucas Lehmer Riesel prime test of 1525*2^700391-1
Using zero-padded rational base DWT, FFT length = 86016
V1 = 4 ; Computing U0...done.
1525*2^700391-1 is prime! Time : 10193.227 sec.

I am running the same test on Windows 2000 and got no problems until now, but it is three times slower, so only 27% done at this time...

Now completed, an no more problem :
I:\D\Jean\llrp380dev>llrp -a4 -oVerbose=1 -d -q"1525*2^700391-1"
Starting Lucas Lehmer Riesel prime test of 1525*2^700391-1
Using zero-padded rational base DWT, FFT length = 86016
V1 = 4 ; Computing U0...done.
1525*2^700391-1 is prime! Time : 32613.687 sec.

Regards,
Jean
Jean Penné is offline   Reply With Quote
Old 2011-06-30, 08:53   #10
msft
 
msft's Avatar
 
Jul 2009
Tokyo

2·5·61 Posts
Default

Quote:
./gwpnum/gwpnum.c: fwpx = fftw_plan_r2r_1d (n, xin, xout, FFTW_R2HC, FFTW_MEASURE);
CUFFT with CUDA 4.0 Not support Halfcomplex-format DFT.
msft is offline   Reply With Quote
Old 2011-07-01, 23:32   #11
stevenj
 

936310 Posts
Default performance comments

A couple of suggestions regarding performance:

1) Try the r2c interface rather than the r2r interface for real-input FFTs. The r2c interface usually gives better performance. In particular, r2c exploits SIMD instructions (e.g. SSE2) whereas r2r does not. (For example, on my x86-64 Xeon for a size-16k transform, r2c is about 60% faster than r2r. I'm assuming you've compiled FFTW with --enable-sse2. The difference gets even larger with FFTW 3.3 on new machines with AVX instructions.)

2) Try FFTW_PATIENT rather than FFTW_MEASURE. For smallish transforms, it doesn't make much difference, but for > 32k it can make a difference, and may be worth it if you are doing lots of of transforms of the same size.

Regards,
Steven G. Johnson
  Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Question about FFTW vs. gwnum? hangxanh Software 2 2014-10-24 15:48
FFTW vs. gwnum? ixfd64 Software 7 2011-09-20 16:05
llrpi : a full portable version of LLR Jean Penné Software 8 2011-01-29 21:44
idea: a portable version of Prime95 ixfd64 Software 3 2010-10-07 14:18
Prime95 on Playstation Portable jasong Software 4 2007-03-12 07:15

All times are UTC. The time now is 18:00.

Thu Jul 16 18:00:04 UTC 2020 up 113 days, 15:33, 2 users, load averages: 2.58, 1.78, 1.56

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.