2010-05-10, 09:12  #166


 
Very sorry, wavelet3000.
Please change MaclucasFFTW.cu. Quote:


2010-05-10, 09:40  #167


 

2010-05-14, 23:32  #168


 
Hi,
Version "Q" runs at 0.0106 sec/iter for the 2048K FFT, 0.0214 sec/iter for the 4096K FFT, 0.0432 sec/iter for the 8192K FFT, and 0.0895 sec/iter for the 16384K FFT on a GTX 260.
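For readers who want to see how these timings scale, here is a small sketch that computes the slowdown each time the FFT length doubles. The numbers are the ones quoted in this post; the function name `doubling_ratios` is made up for illustration.

```python
# Per-iteration timings quoted above for version "Q" on the GTX 260.
# Key: FFT length in K elements, value: seconds per iteration.
timings = {2048: 0.0106, 4096: 0.0214, 8192: 0.0432, 16384: 0.0895}


def doubling_ratios(t):
    """Return the time ratio between each FFT length and the next larger one."""
    sizes = sorted(t)
    return {f"{a}K->{b}K": t[b] / t[a] for a, b in zip(sizes, sizes[1:])}


ratios = doubling_ratios(timings)
for step, ratio in ratios.items():
    print(f"{step}: x{ratio:.2f}")
```

Each doubling costs only slightly more than a factor of two, which is about what one would hope for from an FFT-based multiply.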
2010-05-15, 20:08  #169

 
With version Q, 64-bit works fine; no problems with 44497, 110203, or other numbers.
Thanks very much. 
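For anyone who wants an independent check on small exponents, below is a minimal sketch of the Lucas-Lehmer test using Python's built-in big integers instead of an FFT. It is not the code in MaclucasFFTW.cu, and it is only practical for small exponents. The function name `lucas_lehmer` is hypothetical, and reporting the low 64 bits of the final value as the residue is an assumption based on the usual convention, not something confirmed for this program.

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test for M(p) = 2**p - 1, where p is an odd prime.

    Returns (is_prime, residue64). residue64 is the low 64 bits of the
    final value (assumed convention, see the note above).
    """
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m  # plain big-integer squaring, no FFT
    return s == 0, s & 0xFFFFFFFFFFFFFFFF


for p in (7, 11, 127, 521):
    is_prime, residue = lucas_lehmer(p)
    flag = "P" if is_prime else "C"
    print(f"M( {p} ){flag}, 0x{residue:016x}")
```

Exponents in the tens of thousands still finish with this approach, just slowly; the GPU code is what makes the multi-million range feasible.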
2010-05-15, 23:15  #170


 
As expected,
M( 42643801 )P, n = 4194304, MacLucasFFTW v8.1 Ballester
in just over 5 days. Although unintended, this run also tested the restart code. I had the program running in a terminal (and not using screen) on a Windows machine. Windows Update decided to reboot the computer, closing the terminal and stopping the program. The restart worked just as it should.
2010-05-16, 01:12  #171


 

2010-05-16, 12:56  #172




 

2010-05-16, 15:14  #173



 
At least close to.
Glucas 2.9.2-20080916 + dual-socket Xeon X5680 (3.33GHz hexa-core): 2048k FFT: 4.7ms per iteration.
With a Tesla-branded Fermi (all DP units enabled) you should beat this easily. It looks like Glucas doesn't scale as well as your code with increasing FFT sizes (at least on this system), so perhaps you're already faster for bigger FFTs. On the other hand, Glucas supports many more FFT sizes. Good job, msft!
2010-05-18, 10:38  #174


 
CUDA 3.1 beta is now out. Among the highlights is this little gem:
* Significant improvements in double-precision FFT performance on Fermi-architecture GPUs for 2^n transform sizes
Sure enough, the GTX 480 now runs at 4.66 ms/iter for the 2048K FFT and 9.37 ms/iter for the 4096K FFT.
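To put these per-iteration figures in perspective, here is a rough sketch that turns a timing into a wall-clock estimate for a full Lucas-Lehmer test, which needs p - 2 squarings for exponent p. The helper name `ll_runtime_days` is made up, and the estimate assumes a constant iteration time with no checkpoint or restart overhead. The exponent 42643801 is the one reported earlier in the thread with n = 4194304 (the 4096K FFT).

```python
SECONDS_PER_DAY = 86400.0


def ll_runtime_days(p, sec_per_iter):
    """Estimate days for a full Lucas-Lehmer test of M(p): p - 2 iterations."""
    return (p - 2) * sec_per_iter / SECONDS_PER_DAY


# 4096K FFT at 9.37 ms/iter, the GTX 480 figure quoted above.
estimate = ll_runtime_days(42643801, 0.00937)
print(f"M( 42643801 ): about {estimate:.2f} days")
```

Under those assumptions, an exponent of that size would come in a little under five days at the new timing.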
2010-06-02, 04:15  #176


 
I've been busy testing the 2M FFT on Fermi:
Code:
M( 30000037 )C, 0x307be1a2dc2bca38, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 31000003 )C, 0x9bed7651387bd02a, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 32000057 )C, 0x60bbddb7958f85e3, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 33000001 )C, 0xe54b0c721739183f, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 34000081 )C, 0x64415a7a626f0e34, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 35000443 )C, 0xbf2fb6ccbc3f8780, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 36000143 )C, 0xb0be92372eeab565, n = 2097152, MacLucasFFTW v8.1 Ballester
