20080113, 13:36  #1 
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17·251 Posts 
Variable FFT sizes
Why are the FFT sizes only a few fixed sizes? Why not have them variable, only being as large as they need to to do the number without errors. As mentioned in The Happy Me(al) thread, I barely got one that was just outside of an FFT size to use the smaller one, and I have one that's on the lower end of its FFT size. Surely it doesn't need all 2560K FFT size precision to do something 39.78M? Even if 2048K is indeed too inaccurate for the number, there must be something in between those that will give sufficient accuracy and higher speed.
To see what FFT size it needs, perhaps a ~1 hour selftest running the number at different FFT sizes, picking the smallest one that's within acceptable accuracy limits. If not variable, perhaps many sizes, such as 1 per ~12M instead of 1 per ~510M. In something only somewhatrelated, is there a way I can get Prime95 to check if my 39.78M will run within acceptable limits on 2048K? It did this automatically, and passed, with my 39.509M number, and I just wanted to make sure I'm not running my other one at a larger FFT size if it's not needed. 
20080113, 15:08  #2 
"Lucan"
Dec 2006
England
2×3×13×83 Posts 
If you ask Primenet for an exponent NOW, you should
get one <39.5M. A cursory study of how a FFT works should explain why the optimum size is a power of 2, as will a glance at the iteration times for more awkward sizes. I assume that this trend continues for sizes less smooth than (2^19)*5. Not to mention that I suspect the programming of each size has been customized by George! David It's similar to the reason why tennis tournaments start with 64 or 128 players. Last fiddled with by davieddy on 20080113 at 15:34 
20080113, 18:44  #3  
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17×251 Posts 
Quote:
Quote:
I've never noticed how the iteration times are with powers of two before, but now I see that. I guess variable sizes really would be worse. Anyone know about forcing the selftest to see if 2048K can run the number? 

20080114, 12:51  #4  
"Lucan"
Dec 2006
England
2·3·13·83 Posts 
Quote:
"Fast" enters into the Fast (Discrete) Fourier Transform. These diagrams are what resemble the draw for a tennis tournament. I agree that a full understanding of the subject cannot be gained by "cursory study". 

20080114, 16:37  #5  
∂^{2}ω=0
Sep 2002
República de California
5·7·331 Posts 
Quote:
1) The relative efficiency of an FFT is [roughly] related to the smoothness of the factorization of its length  thus powerof2 radices are most efficient, and largeoddprime radices are worst; 2) Larger radices lead to higher register pressure inhardware, thus coding them to run efficiently becomes quite difficult. The above is just a bare outline: feel free to do some homework on the DFT if you wish to understand this more fully. For example, a good advancedtopics question is: "For odd composite radices of similar size, why might one with no repeated factors be preferable to one with repeated factors? [E.g. radix 21 vs 25]." Last fiddled with by ewmayer on 20080114 at 16:42 

20080114, 17:16  #6  
(loop (#_fork))
Feb 2006
Cambridge, England
6379_{10} Posts 
Quote:
Is there some obvious reason that I'm missing why fast FFT code tends to be {small}*2^N rather than {small}*2^N*3^M? The threeway decimationintime stage isn't too hideous, the cube roots of unity are conveniently 1 and +/ omega. 

20080114, 17:33  #7 
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17×251 Posts 
Thanks for all the explanations, I understand now that it takes a very smooth FFT size to be efficient.
I got this error on my 39.509M number and was wondering if it's serious or a software bug or anything, and if this significantly raises the chance that this will return an incorrect result (copied from results.txt, with the selftest checking if 2048K is acceptable): Code:
Trying 1000 iterations for exponent 39509597 using 2048K FFT. If average roundoff error is above 0.2425, then a larger FFT will be used. Final average roundoff error is 0.24118, using 2048K FFT for exponent 39509597. [Mon Jan 14 01:17:53 2008] Iteration: 1420185/39509597, ERROR: ROUND OFF (0.40625) > 0.40 Continuing from last save file. [Mon Jan 14 01:42:17 2008] Disregard last error. Result is reproducible and thus not a hardware problem. For added safety, redoing iteration using a slower, more reliable method. Continuing from last save file. Last fiddled with by MiniGeek on 20080114 at 17:43 Reason: clarifying part of second sentence 
20080114, 17:33  #8  
∂^{2}ω=0
Sep 2002
República de California
5×7×331 Posts 
Quote:
Quote:
EDIT: By way of example, I can do a complex radix8 DFT [sans any multipass twiddles] with just 52 [scalar] add and 4 mul. A radix9 DFT, on the other hand, needs 84 add and 20 mul. That is significantly more opsperinput than the power of 2. OTOOH, if I follow that radix9 with a large power of 2, the extra work done in doing the one radix9 pass gets offset [and more] by the shorter vector length compared to [say] going to the nextlarger power of 2, or even to 10*[power of 2]. The radix10 DFT needs 88 add and 20 mul and thus is slightly betterperinput, but that work gets multiplied by [10/9] in comparing vs 9*[same power of 2] to reflect the overall larger vector length. Of course in the real world there is much more to it than mere arithmetic opcount  cache misses and memory bandwidth are much more important at larger vector lengyhs  but all other things being equal [which the same largepowerof2 ensures] using a shorter vector length will usually give better cache behavior, as well, since proportionally more of the data array can fit in cache. Last fiddled with by ewmayer on 20080114 at 19:08 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Let y^2=xzx^2+1, and if value of x is known, can y and z be directly calculated? Given all variable  MARTHA  Number Theory Discussion Group  11  20180302 15:55 
Exponentiation w/ independent variable  Unregistered  Homework Help  4  20100804 06:38 
FFT sizes  Cruelty  Riesel Prime Search  3  20060712 15:15 
How Much Memory at Various Sizes?  wblipp  GMPECM  5  20050424 20:04 
Cache Sizes  Unregistered  Hardware  4  20031206 13:43 