mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > PrimeNet

Reply
 
Thread Tools
Old 2011-05-02, 00:14   #1
Christenson
 
Christenson's Avatar
 
Dec 2010
Monticello

5×359 Posts
Default LL and FFT question

I read under the math header on LL testing that after each and every LL iteration, the transformed result of squaring is inverse transformed back to a number, and the rounding errors from the FFT reconciled.

Question: Can the FFT be made exact? Obviously, this comes at a price, but bear with me for a moment -- the question arises because if the FFT can be made exact, then the inverse transform and re-transform at each step could be avoided, with an attendant, significant advantage, which might outweigh the price.

Corollary: IF the FFT can be made more exact, then fewer inverse transforms and retransforms will be required to perform an LL test. I suspect that the reduced number of transforms will offset the increased cost of the more precise FFT.

Can I get some help from experts to put numbers to this, and determine whether present practice as described above is indeed optimal?
Christenson is offline   Reply With Quote
Old 2011-05-02, 02:26   #2
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

2D7F16 Posts
Default

This question of whether one can do multiple dyadic squaring steps before taking the inverse transform and doing a normalization step has been much-discussed. It is certainly possible, either with a pure-integer transform or extended-precision floating-point arithmetic, but because each squaring roughly doubles the number of bits which must be preserved, the short answer is that the cost of the software emulation needed for the scheme makes it not cost-effective compared to properly tuned transform-based math which uses operands having fast hardware-supported arithmetic and fills each word of the input vector with close to the maximum number of bits allowed by the resulting scheme.
ewmayer is online now   Reply With Quote
Old 2011-05-02, 10:33   #3
Christenson
 
Christenson's Avatar
 
Dec 2010
Monticello

5×359 Posts
Default

Wonder if it would be possible to do the normalization without doing the inverse transform? Can you point me at the literature?
Christenson is offline   Reply With Quote
Old 2011-05-02, 10:45   #4
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

185416 Posts
Default

Quote:
Originally Posted by Christenson View Post
Wonder if it would be possible to do the normalization without doing the inverse transform? Can you point me at the literature?
I would imagine that it is possible, but it would be mighty tricky and probably a lot slower than just the usual inverse and normalise.
retina is offline   Reply With Quote
Old 2011-05-03, 18:34   #5
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

265778 Posts
Default

Quote:
Originally Posted by Christenson View Post
Wonder if it would be possible to do the normalization without doing the inverse transform? Can you point me at the literature?
AFAIK it is impossible, or at least grossly impracticable.

On a somewhat-related note (this specifically has to do with the dyadic-square step in an LL-test setting): We can do the subtract-2 step on the forward-transformed vector, because the length-N FFT of a scalar is just an N-vector containing N copies of the scalar. But that would only be useful if staying in "FFT space" for multiple dyadic squarings were in some way advantageous. (Which it isn't, to the best of my knowledge).
ewmayer is online now   Reply With Quote
Old 2011-05-03, 19:30   #6
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

7×292 Posts
Default

Are you basically saying that it is possible to do the squaring and -2 without an inverse fft but not the modulus, meaning a doubling in the number of bits each iteration? How come modulus isn't possible when -2 is?
henryzz is offline   Reply With Quote
Old 2011-05-03, 20:11   #7
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

19·613 Posts
Default

I'm saying the subtract-constant step could be combined with the *dyadic* (pointwise) squaring, without requiring an iFFT. But we need the iFFT to be able to normalize and keep operands within exact-arithmetic-permitting size bounds, so it's moot.

The reason you can do the -2 step in FFT space is due to the simplicity of the data being FFTed here - a simple scalar constant.
ewmayer is online now   Reply With Quote
Old 2011-05-03, 23:20   #8
Christenson
 
Christenson's Avatar
 
Dec 2010
Monticello

5·359 Posts
Default

I'd say this differently:
The cheapest normalisation operation we know is to do the iFFT followed by bitwise normalization followed by the FFT.

Being able to add a constant in FFT space is a function of the linearity of the FFT -- just do the pointwise addition of FFT(-2).

And the much-discussed claim, which I'm still looking for the pointer to, is that the additional cost of being able to stay in FFT space for two steps, in the FFT stage, exceeds the cost of doing the less-precise iFFT after the multiply and re-doing the less precise FFT for the next multiply.
Christenson is offline   Reply With Quote
Old 2011-05-04, 00:28   #9
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

101101011111112 Posts
Default

Quote:
Originally Posted by Christenson View Post
And the much-discussed claim, which I'm still looking for the pointer to, is that the additional cost of being able to stay in FFT space for two steps, in the FFT stage, exceeds the cost of doing the less-precise iFFT after the multiply and re-doing the less precise FFT for the next multiply.

Doing a second dyadic squaring doubles the size (bitset count) of your convolution outputs, which means (roughly) halving the allowable number of input bits to your forward FFT, which means doubling the length of the forward FFT. So you do half as many FFTs but each is double the length, which means slightly more than twice the per-FFT runtime. QED.
ewmayer is online now   Reply With Quote
Old 2011-05-04, 00:49   #10
Christenson
 
Christenson's Avatar
 
Dec 2010
Monticello

34038 Posts
Default

OK, what happens to FFT size if I go four steps instead of two?

Also, does a GPU change the cost ratio, since doubling the FFT size might not double the time required?
Christenson is offline   Reply With Quote
Old 2011-05-04, 01:41   #11
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

19·613 Posts
Default

Quote:
Originally Posted by Christenson View Post
OK, what happens to FFT size if I go four steps instead of two?
Your convo outputs are 4x as many bits in size, so you trade 4 length-N FFT/iFFTs for one length-4N transform pair. Repeat after me: There is no free lunch. :)

Quote:
Also, does a GPU change the cost ratio, since doubling the FFT size might not double the time required?
Alternate hardware only changes the calculus if it affects one scheme differently than the other. Don't see how a GPU does that.
ewmayer is online now   Reply With Quote
Reply

Thread Tools


All times are UTC. The time now is 22:41.


Fri Aug 6 22:41:06 UTC 2021 up 14 days, 17:10, 1 user, load averages: 4.25, 4.00, 3.64

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.