mersenneforum.org Weird LLR FFT timings
 Register FAQ Search Today's Posts Mark Forums Read

 2007-10-10, 19:48 #1 MooooMoo Apprentice Crank     Mar 2006 2·227 Posts Weird LLR FFT timings A while ago, I tested k=19 for 601800 < n < 748200. The FFT length that LLR was using was 40960, and it took 0.983 ms per iteration. Later, I tested k=19 for 748200 < n < 891100. The FFT length that LLR was using was 49152, which was a 20% jump from the earlier FFT length of 40960. As expected, the time per iteration increased 20% to 1.182 ms per iteration. The testing of k=19 for 891100 < n < 1036200 increased the FFT length to 57344, a 16.7% jump from the earlier FFT length of 49152. Once again, the time per iteration increased about 16.7% to 1.427 ms per iteration. A few days ago, I tested n > 1036200 for this same k. The FFT length increased to 65536, a 14.3% jump from the earlier FFT length of 57344. However, the time per iteration increased less than 8%, and the time per iteration was only 1.540 ms instead of the expected 1.631ms. Does anyone know why?
2007-10-11, 04:25   #2
axn

Jun 2003

5,387 Posts

Quote:
 Originally Posted by MooooMoo Does anyone know why?
Power-of-two FFTs are more efficient than the others. It's the nature of FFT algorithm.

 2007-10-11, 08:56 #3 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2×7×461 Posts This is certainly accepted wisdom. I'm working in crystallography, and at one point I tested a large number of X*Y*Z three-dimensional FFTs using FFTW and found that 128*128*128, which I'd expected to be obviously the fastest, was 30% slower per voxel than 128*126*125. Code: There are fairly clearly four speed buckets; if you group by 'number of length-127 axes' that gives a lower bound on the speed (IE there are no fastest-buckets with any length 127, no second-fastest with two lengths 127). Actually, the fastest bucket looks to be just 'lengths all 125, 126 or 128'. It is bad if the short axis is 128, which is presumably the cache issue. It looks as if it's best for the axes to be in increasing order with speed, but that's not completely certain from the data. I suspect this is more an issue with the transpositions required for working on 3D FFTs than with the 1D FFT algorithm itself; large transpositions can be horribly cache-unfriendly.

 Similar Threads Thread Thread Starter Forum Replies Last Post nuggetprime Software 5 2011-02-21 08:28 axn Operation Billion Digits 1 2009-02-06 16:14 ewmayer Factoring 7 2008-12-11 22:12 paulunderwood 3*2^n-1 Search 14 2008-04-17 22:27 Prime95 Software 16 2005-03-04 17:48

All times are UTC. The time now is 08:33.

Fri Aug 12 08:33:44 UTC 2022 up 36 days, 3:21, 2 users, load averages: 1.40, 1.25, 1.20