![]() |
|
|
#1 |
|
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17·251 Posts |
Why are the FFT sizes only a few fixed sizes? Why not have them variable, only being as large as they need to to do the number without errors. As mentioned in The Happy Me(al) thread, I barely got one that was just outside of an FFT size to use the smaller one, and I have one that's on the lower end of its FFT size. Surely it doesn't need all 2560K FFT size precision to do something 39.78M? Even if 2048K is indeed too inaccurate for the number, there must be something in between those that will give sufficient accuracy and higher speed.
To see what FFT size it needs, perhaps a ~1 hour self-test running the number at different FFT sizes, picking the smallest one that's within acceptable accuracy limits. If not variable, perhaps many sizes, such as 1 per ~1-2M instead of 1 per ~5-10M. In something only somewhat-related, is there a way I can get Prime95 to check if my 39.78M will run within acceptable limits on 2048K? It did this automatically, and passed, with my 39.509M number, and I just wanted to make sure I'm not running my other one at a larger FFT size if it's not needed. |
|
|
|
|
|
#2 |
|
"Lucan"
Dec 2006
England
2×3×13×83 Posts |
If you ask Primenet for an exponent NOW, you should
get one <39.5M. A cursory study of how a FFT works should explain why the optimum size is a power of 2, as will a glance at the iteration times for more awkward sizes. I assume that this trend continues for sizes less smooth than (2^19)*5. Not to mention that I suspect the programming of each size has been customized by George! David It's similar to the reason why tennis tournaments start with 64 or 128 players. Last fiddled with by davieddy on 2008-01-13 at 15:34 |
|
|
|
|
|
#3 | ||
|
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17·251 Posts |
Quote:
Quote:
I've never noticed how the iteration times are with powers of two before, but now I see that. I guess variable sizes really would be worse. Anyone know about forcing the self-test to see if 2048K can run the number? |
||
|
|
|
|
|
#4 | |
|
"Lucan"
Dec 2006
England
2×3×13×83 Posts |
Quote:
"Fast" enters into the Fast (Discrete) Fourier Transform. These diagrams are what resemble the draw for a tennis tournament. I agree that a full understanding of the subject cannot be gained by "cursory study". |
|
|
|
|
|
|
#5 | |
|
∂2ω=0
Sep 2002
República de California
265678 Posts |
Quote:
1) The relative efficiency of an FFT is [roughly] related to the smoothness of the factorization of its length - thus power-of-2 radices are most efficient, and large-odd-prime radices are worst; 2) Larger radices lead to higher register pressure in-hardware, thus coding them to run efficiently becomes quite difficult. The above is just a bare outline: feel free to do some homework on the DFT if you wish to understand this more fully. For example, a good advanced-topics question is: "For odd composite radices of similar size, why might one with no repeated factors be preferable to one with repeated factors? [E.g. radix 21 vs 25]." Last fiddled with by ewmayer on 2008-01-14 at 16:42 |
|
|
|
|
|
|
#6 | |
|
(loop (#_fork))
Feb 2006
Cambridge, England
72×131 Posts |
Quote:
Is there some obvious reason that I'm missing why fast FFT code tends to be {small}*2^N rather than {small}*2^N*3^M? The three-way decimation-in-time stage isn't too hideous, the cube roots of unity are conveniently 1 and +/- omega. |
|
|
|
|
|
|
#7 |
|
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
10000101010112 Posts |
Thanks for all the explanations, I understand now that it takes a very smooth FFT size to be efficient.
I got this error on my 39.509M number and was wondering if it's serious or a software bug or anything, and if this significantly raises the chance that this will return an incorrect result (copied from results.txt, with the self-test checking if 2048K is acceptable): Code:
Trying 1000 iterations for exponent 39509597 using 2048K FFT. If average roundoff error is above 0.2425, then a larger FFT will be used. Final average roundoff error is 0.24118, using 2048K FFT for exponent 39509597. [Mon Jan 14 01:17:53 2008] Iteration: 1420185/39509597, ERROR: ROUND OFF (0.40625) > 0.40 Continuing from last save file. [Mon Jan 14 01:42:17 2008] Disregard last error. Result is reproducible and thus not a hardware problem. For added safety, redoing iteration using a slower, more reliable method. Continuing from last save file. Last fiddled with by Mini-Geek on 2008-01-14 at 17:43 Reason: clarifying part of second sentence |
|
|
|
|
|
#8 | ||
|
∂2ω=0
Sep 2002
República de California
2D7716 Posts |
Quote:
Quote:
EDIT: By way of example, I can do a complex radix-8 DFT [sans any multipass twiddles] with just 52 [scalar] add and 4 mul. A radix-9 DFT, on the other hand, needs 84 add and 20 mul. That is significantly more ops-per-input than the power of 2. OTOOH, if I follow that radix-9 with a large power of 2, the extra work done in doing the one radix-9 pass gets offset [and more] by the shorter vector length compared to [say] going to the next-larger power of 2, or even to 10*[power of 2]. The radix-10 DFT needs 88 add and 20 mul and thus is slightly better-per-input, but that work gets multiplied by [10/9] in comparing vs 9*[same power of 2] to reflect the overall larger vector length. Of course in the real world there is much more to it than mere arithmetic opcount - cache misses and memory bandwidth are much more important at larger vector lengyhs - but all other things being equal [which the same large-power-of-2 ensures] using a shorter vector length will usually give better cache behavior, as well, since proportionally more of the data array can fit in cache. Last fiddled with by ewmayer on 2008-01-14 at 19:08 |
||
|
|
|
![]() |
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Let y^2=xz-x^2+1, and if value of x is known, can y and z be directly calculated? Given all variable | MARTHA | Number Theory Discussion Group | 11 | 2018-03-02 15:55 |
| Exponentiation w/ independent variable | Unregistered | Homework Help | 4 | 2010-08-04 06:38 |
| FFT sizes | Cruelty | Riesel Prime Search | 3 | 2006-07-12 15:15 |
| How Much Memory at Various Sizes? | wblipp | GMP-ECM | 5 | 2005-04-24 20:04 |
| Cache Sizes | Unregistered | Hardware | 4 | 2003-12-06 13:43 |