#1 |
Apprentice Crank
Mar 2006
2·227 Posts |
A while ago, I tested k=19 for 601800 < n < 748200. The FFT length that LLR was using was 40960, and it took 0.983 ms per iteration.
Later, I tested k=19 for 748200 < n < 891100. LLR used an FFT length of 49152, a 20% jump from the earlier 40960. As expected, the time per iteration increased by about 20%, to 1.182 ms. Testing k=19 for 891100 < n < 1036200 increased the FFT length to 57344, a 16.7% jump from 49152. The time per iteration rose to 1.427 ms, an increase of about 20.7%, so slightly more than the FFT length alone would predict. A few days ago, I tested n > 1036200 for this same k. The FFT length increased to 65536, a 14.3% jump from 57344. However, the time per iteration increased by less than 8%, to only 1.540 ms instead of the expected 1.631 ms. Does anyone know why?
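For anyone who wants to check the arithmetic, here is a rough Python sketch using only the figures quoted above. It assumes the time per iteration scales linearly with FFT length, which is only a crude model (FFT cost is closer to N log N, and cache effects matter). The name `step_changes` is made up for illustration and is not part of LLR.

```python
# (FFT length, ms per iteration) as quoted in the post
runs = [
    (40960, 0.983),
    (49152, 1.182),
    (57344, 1.427),
    (65536, 1.540),
]


def step_changes(runs):
    """For each step to a larger FFT, return
    (fft_len, length_growth_pct, time_growth_pct, predicted_ms).

    predicted_ms assumes time grows in direct proportion to FFT length,
    which is an assumption, not a property of LLR.
    """
    out = []
    for (prev_len, prev_ms), (cur_len, cur_ms) in zip(runs, runs[1:]):
        length_growth = (cur_len / prev_len - 1) * 100
        time_growth = (cur_ms / prev_ms - 1) * 100
        predicted_ms = prev_ms * cur_len / prev_len
        out.append((cur_len, length_growth, time_growth, predicted_ms))
    return out


changes = step_changes(runs)
for fft_len, length_growth, time_growth, predicted_ms in changes:
    print(
        f"FFT {fft_len}: length +{length_growth:.1f}%, "
        f"time +{time_growth:.1f}%, linear prediction {predicted_ms:.3f} ms"
    )
```

The last row is the interesting one: 65536 is the only power of two in the list, and it is the only step where the measured time comes in well under the linear prediction.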
#2 |
Jun 2003
5,387 Posts |
#3 |
(loop (#_fork))
Feb 2006
Cambridge, England
2×7×461 Posts |
This is certainly accepted wisdom. I'm working in crystallography, and at one point I tested a large number of X*Y*Z three-dimensional FFTs using FFTW and found that 128*128*128, which I'd expected to be obviously the fastest, was 30% slower per voxel than 128*126*125.
Code:
There are fairly clearly four speed buckets; grouping by 'number of length-127 axes' gives a lower bound on the speed (i.e. nothing in the fastest bucket has any axis of length 127, and nothing in the second-fastest has two axes of length 127). Actually, the fastest bucket looks to be just 'lengths all 125, 126 or 128'. It is bad if the short axis is 128, which is presumably the cache issue. It looks as if having the axes in increasing order is best for speed, but that's not completely certain from the data.
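The grouping described above can be sketched in Python. This only builds the buckets for axis lengths 125 to 128 and prints each length's prime factorization; it does not run or time any FFT, and the original timing table is not reproduced here. The helper `prime_factors` and the constant `AXIS_LENGTHS` are illustrative names, not anything from FFTW.

```python
from collections import defaultdict
from itertools import product


def prime_factors(n):
    """Return the prime factors of n in ascending order, with multiplicity."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


# Axis lengths mentioned in the post
AXIS_LENGTHS = (125, 126, 127, 128)

factorizations = {n: prime_factors(n) for n in AXIS_LENGTHS}

# Group every X*Y*Z shape by how many of its axes have length 127
buckets = defaultdict(list)
for shape in product(AXIS_LENGTHS, repeat=3):
    buckets[shape.count(127)].append(shape)

for n, factors in factorizations.items():
    print(n, "=", " * ".join(map(str, factors)))
for count in sorted(buckets):
    print(f"{count} axes of length 127: {len(buckets[count])} shapes")
```

127 is the only prime among the four lengths, while 125, 126 and 128 all factor into small primes. FFT libraries generally handle small-prime-factor lengths more quickly, which fits with the number of length-127 axes setting the speed bucket.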