mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

msft 2010-05-10 09:12

Very Sorry, wavelet3000

Please change MaclucasFFTW.cu.
[quote]
890 //64bitOS bigA=6755399441055744.0;
891 //32bitOS bigA=(((6.0)*0x2000000L)*0x2000000L)*0x800;
892 bigA=6755399441055744.0;
[/quote]
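
The constant 6755399441055744.0 equals 3 × 2^51 (that is, 1.5 × 2^52). Adding and then subtracting a constant of this form is a well-known way to round a double to the nearest integer, which is presumably what bigA is for; the thread does not show the surrounding code, so treat that as an assumption. A minimal Python sketch of the trick, with the hypothetical helper name round_via_magic:

```python
# Sketch of the add-and-subtract rounding trick. round_via_magic is a
# hypothetical name for illustration, not a function from MaclucasFFTW.cu.
BIG_A = 6755399441055744.0  # value from the post; equals 3 * 2**51


def round_via_magic(x: float) -> float:
    """Round x to the nearest integer (ties to even), valid for |x| < 2**51.

    x + BIG_A lands in [2**52, 2**53), where adjacent doubles are exactly
    1 apart, so the addition itself performs the rounding.
    """
    return (x + BIG_A) - BIG_A


if __name__ == "__main__":
    print(BIG_A == 3.0 * 2**51)
    for value in (2.4, 2.6, -1.4, 2.5):
        print(value, round_via_magic(value))
```

The commented-out 32-bit expression evaluates to 6 × 2^61 = 3 × 2^62, the same kind of constant sized for a 64-bit mantissa. That would fit x87 extended-precision arithmetic on 32-bit builds, but this is an inference from the numbers, not something stated in the thread.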

msft 2010-05-10 09:40

[quote=frmky;214512]The 2048K FFT runs at 5.47 ms/iteration,[/quote]
Very Good!!!:max:

msft 2010-05-14 23:32

1 Attachment(s)
Hi,
Version "Q" runs at .0106 sec/iter for the 2048K FFT, .0214 sec/iter for the 4096K FFT, .0432 sec/iter for the 8192K FFT and .0895 sec/iter for the 16384K FFT on a GTX260.
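
These timings roughly double with each doubling of the FFT length. The sketch below compares the measured ratios with what a textbook N log N cost model would predict. The model is an assumption for comparison only, not a measured property of CUFFT, and the function names are made up for illustration.

```python
import math

# Timings from the post above (GTX260, version "Q"), in seconds per iteration,
# keyed by FFT length in K (1K = 1024 points).
TIMINGS = {2048: 0.0106, 4096: 0.0214, 8192: 0.0432, 16384: 0.0895}


def doubling_ratios(timings):
    """Return (small_size, large_size, time_ratio) for each adjacent size pair."""
    sizes = sorted(timings)
    return [(a, b, timings[b] / timings[a]) for a, b in zip(sizes, sizes[1:])]


def nlogn_ratio(size_k):
    """Predicted cost ratio when going from n to 2n points under N*log2(N)."""
    n = size_k * 1024
    return 2.0 * math.log2(2 * n) / math.log2(n)


if __name__ == "__main__":
    for small, large, ratio in doubling_ratios(TIMINGS):
        print(f"{small}K -> {large}K: measured {ratio:.3f}, "
              f"N log N model {nlogn_ratio(small):.3f}")
```

The measured ratios sit slightly above 2 and close to the model, so on this card the per-iteration time scales about as the FFT cost alone would suggest.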

wavelet3000 2010-05-15 20:08

With version Q, 64-bit works fine, no problem with 44497, 110203 or other numbers.

Thanks very much.

frmky 2010-05-15 23:15

As expected,
M( 42643801 )P, n = 4194304, MacLucasFFTW v8.1 Ballester
in just over 5 days. Although unintended, this also tested the restart code. I had the program running in a terminal (and not using screen) on a Windows machine. Windows update decided to reboot the computer, closing the terminal and stopping the program. The restart worked just as it should.
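
For readers new to the output format: the result line reports the outcome of a Lucas-Lehmer test, where "P" means the final residue was zero (prime). Below is a minimal Python sketch of the test using plain big-integer arithmetic. It is not how the GPU program works (that multiplies via FFT), the names lucas_lehmer and report are hypothetical, and treating the hex value on "C" lines as the low 64 bits of the final residue is an assumption based on common practice.

```python
def lucas_lehmer(p: int):
    """Run the Lucas-Lehmer test on M(p) = 2**p - 1 for an odd prime p.

    Returns (is_prime, residue64), where residue64 is the low 64 bits
    of the final value s after p - 2 squarings.
    """
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0, s & 0xFFFFFFFFFFFFFFFF


def report(p: int) -> str:
    """Format a result in the style of the lines quoted in this thread."""
    is_prime, residue = lucas_lehmer(p)
    if is_prime:
        return f"M( {p} )P"
    return f"M( {p} )C, 0x{residue:016x}"


if __name__ == "__main__":
    for exponent in (7, 11, 13, 127):
        print(report(exponent))
```

Small exponents finish instantly; an exponent like 42643801 needs about 42.6 million squarings of a 42.6-million-bit number, which is why the FFT-based GPU code matters.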

msft 2010-05-16 01:12

[quote=frmky;215104]As expected,
M( 42643801 )P, n = 4194304, MacLucasFFTW v8.1 Ballester
in just over 5 days. [/quote]
Your GTX480 is the fastest computer in the Mersenne community today!:smile:

henryzz 2010-05-16 12:56

[quote=msft;215106]Your GTX480 is the fastest computer in the Mersenne community today!:smile:[/quote]
Yes.
It could be good for proving a new Mersenne prime quickly.

TheJudger 2010-05-16 15:14

[QUOTE=msft;215106]Your GTX480 is the fastest computer in the Mersenne community today!:smile:[/QUOTE]

At least close to. :razz:
Glucas 2.9.2-20080916 + dualsocket Xeon X5680 (3.33GHz hexacore):
2048k FFT: 4.7ms per iteration :smile:

With a Tesla-branded Fermi (all DP units enabled) you should beat this easily.

It looks like Glucas doesn't scale as well as your code with increasing FFT sizes (at least on this system), so perhaps you're already faster for bigger FFTs. On the other hand, Glucas supports many more FFT sizes.

Good job msft! :smile:

frmky 2010-05-18 10:38

[QUOTE=TheJudger;215154]At least close to. :razz:
[/QUOTE]
CUDA 3.1-beta is now out. Among the highlights is this little gem:
* Significant improvements in double-precision FFT performance on Fermi-architecture GPUs for 2^n transform sizes

Sure enough, the GTX 480 now runs at 4.66 ms/iter for the 2048K FFT and 9.37 ms/iter for the 4096K FFT. :razz:

wavelet3000 2010-06-01 12:33

[URL="http://www.mersenne.org/report_exponent/?exp_lo=68808029&exp_hi=68808029&B1=Get+status"]M68808029[/URL]
(4M fft)
It took 16 days on a GTX280 and yielded 194 GHz-days. I am not sure if I will have the patience to double-check it soon. In the meantime, I am off to testing an 88M-range exponent (with the 8M FFT) and shopping for a Fermi :flex:

frmky 2010-06-02 04:15

I've been busy testing the 2M FFT on Fermi:
[CODE]M( 30000037 )C, 0x307be1a2dc2bca38, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 31000003 )C, 0x9bed7651387bd02a, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 32000057 )C, 0x60bbddb7958f85e3, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 33000001 )C, 0xe54b0c721739183f, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 34000081 )C, 0x64415a7a626f0e34, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 35000443 )C, 0xbf2fb6ccbc3f8780, n = 2097152, MacLucasFFTW v8.1 Ballester
M( 36000143 )C, 0xb0be92372eeab565, n = 2097152, MacLucasFFTW v8.1 Ballester[/CODE]
37000133 is running now.
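
Each run above spreads the p bits of the number across n = 2097152 double-precision words, so the average load is p / n bits per word. Raising the exponent at a fixed FFT length increases that load, and with it the rounding error, which is presumably why increasing exponents are being tried at one length. The thread states no error limit, so this sketch only computes the load; bits_per_word is a hypothetical helper name.

```python
FFT_LENGTH = 2097152  # n from the result lines above (2M FFT)

# Exponents from the result lines, plus the one reported as still running.
EXPONENTS = [30000037, 31000003, 32000057, 33000001,
             34000081, 35000443, 36000143, 37000133]


def bits_per_word(p: int, n: int = FFT_LENGTH) -> float:
    """Average number of bits of M(p) stored in each of the n FFT words."""
    return p / n


if __name__ == "__main__":
    for p in EXPONENTS:
        print(f"M( {p} ): {bits_per_word(p):.3f} bits/word")
```

The load rises from about 14.3 bits per word at 30000037 to about 17.6 at 37000133.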

