mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

msft 2012-01-29 17:10

Hi,
CUFFT benchmark with CUDA 3.0:
[code]
CUFFT_Z2Z size=2048 k time=3.398650 msec
CUFFT_D2Z size=2048 k time=8.938929 msec
CUFFT_Z2Z size=2560 k time=14.965907 msec
CUFFT_D2Z size=2560 k time=42.769253 msec
CUFFT_Z2Z size=3072 k time=18.074261 msec
CUFFT_D2Z size=3072 k time=51.326923 msec
CUFFT_Z2Z size=3584 k time=23.994490 msec
CUFFT_D2Z size=3584 k time=65.803909 msec
CUFFT_Z2Z size=4096 k time=6.790013 msec
CUFFT_D2Z size=4096 k time=19.422380 msec
[/code]
Ver 1.48 is useless with CUDA 3.0.
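
One way to read the timings above: each non-power-of-two size (2560k, 3072k, 3584k) takes longer than the next power of two, 4096k, which is presumably why this version gains nothing under CUDA 3.0. A minimal Python sketch, using only the figures posted above (the helper names are illustrative and not part of CUDALucas), makes the comparison explicit:

```python
import re

# Timings copied verbatim from the benchmark post (size in k, time in msec).
BENCHMARK = """\
CUFFT_Z2Z size=2048 k time=3.398650 msec
CUFFT_D2Z size=2048 k time=8.938929 msec
CUFFT_Z2Z size=2560 k time=14.965907 msec
CUFFT_D2Z size=2560 k time=42.769253 msec
CUFFT_Z2Z size=3072 k time=18.074261 msec
CUFFT_D2Z size=3072 k time=51.326923 msec
CUFFT_Z2Z size=3584 k time=23.994490 msec
CUFFT_D2Z size=3584 k time=65.803909 msec
CUFFT_Z2Z size=4096 k time=6.790013 msec
CUFFT_D2Z size=4096 k time=19.422380 msec
"""


def parse_benchmark(text):
    """Map (transform, size_k) -> time in msec."""
    pattern = r"CUFFT_(\w+) size=(\d+) k time=([\d.]+) msec"
    return {(kind, int(size)): float(msec)
            for kind, size, msec in re.findall(pattern, text)}


def next_power_of_two(n):
    """Smallest power of two >= n (n must be positive)."""
    return 1 << (n - 1).bit_length()


def slowdown_vs_next_power_of_two(timings):
    """For each non-power-of-two size, time divided by the time at the next power of two."""
    ratios = {}
    for (kind, size), msec in timings.items():
        padded = next_power_of_two(size)
        if padded != size and (kind, padded) in timings:
            ratios[(kind, size)] = msec / timings[(kind, padded)]
    return ratios


timings = parse_benchmark(BENCHMARK)
ratios = slowdown_vs_next_power_of_two(timings)
for (kind, size), ratio in sorted(ratios.items()):
    print(f"{kind} {size}k: {ratio:.2f}x the time of {next_power_of_two(size)}k")
```

A ratio above 1 means that padding the transform up to 4096k would have been faster than running the smaller, non-power-of-two length.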

ET_ 2012-01-29 17:29

[QUOTE=msft;287656]Ver 1.48 is useless with CUDA 3.0.[/QUOTE]

Understood :sad:

Let's wait for a GTX 600 GPU. It's worth a post in the happy thread.

Thank you for your quick answer and for all your work! :bow:

Luigi

LaurV 2012-02-06 18:11

Versions 1.4x seem to have trouble on multi-GPU configs (it seems the -D switch does not work with SLI: all copies start on the same GPU, regardless of which -Dx I use). Versions 1.2b and 1.3alpha work well (and are still faster, in spite of the "powers of 2" FFT).

Edit: any chance of getting a 4.0/2.0 win64 build (like for a Tesla)? And any chance of getting real/ETA times, as in v1.3alpha_eoc?

Thanks a lot in advance!

flashjh 2012-02-06 18:48

[QUOTE=LaurV;288466]Versions 1.4x seem to have trouble on multi-GPU configs (it seems the -D switch does not work with SLI: all copies start on the same GPU, regardless of which -Dx I use). Versions 1.2b and 1.3alpha work well (and are still faster, in spite of the "powers of 2" FFT).

Edit: any chance of getting a 4.0/2.0 win64 build (like for a Tesla)? And any chance of getting real/ETA times, as in v1.3alpha_eoc?

Thanks a lot in advance![/QUOTE]

I discovered that in Crossfire or SLI (or even regular) I had to use -D00 or -D01 instead of -D0 or -D1.

Brain 2012-02-06 19:52

[QUOTE=flashjh;288468]I discovered that in Crossfire or SLI (or even regular) I had to use -D00 or -D01 instead of -D0 or -D1.[/QUOTE]
Noted for GPU guide.

[QUOTE=LaurV;288466]
Edit: any chance of getting a 4.0/2.0 win64 build (like for a Tesla)? And any chance of getting real/ETA times, as in v1.3alpha_eoc?
[/QUOTE]
Please specify what you mean by 4.0/2.0 (CUDA 4.0/CC 2.0?). I only do the compiling, and I have only CUDA 4.0 and 4.1 installed. Although I'm not happy with the 5% performance slowdown, I won't step back to an older CUDA version, for the sake of future readiness. I'm hoping to benefit from the Kepler generation.
Currently, only msft knows the code, so I cannot help with that. I hope I can motivate msft to keep working on:
1. performance
2. responsiveness
3. worktodo.txt feature (auto start next expo)
4. ETA feature
5. code readability
:hello:

If anybody is willing to help, we have to sync with msft, as the ETA feature was coded but never merged and has since been lost. I'm a software developer but no CUDA expert, and the code is a bit "hard".

LaurV 2012-02-07 00:44

[QUOTE=Brain;288470]
Please specify what you mean by 4.0/2.0 (CUDA 4.0/CC 2.0?)[/QUOTE]
I mean (with your notation) CUDALucas.cuda4.0.sm_20.WIN64.exe, or better (to have the version too, I already renamed them something like this) CUDALucas.1.48.cuda4.0.sm_20.WIN64.exe :smile:

Brain 2012-02-07 05:34

1 Attachment(s)
[QUOTE=LaurV;288494]I mean (with your notation) CUDALucas.cuda4.0.sm_20.WIN64.exe, or better (to have the version too, I already renamed them something like this) CUDALucas.1.48.cuda4.0.sm_20.WIN64.exe :smile:[/QUOTE]
Here it is.

msft 2012-02-09 06:01

1 Attachment(s)
Ver 1.49
Fix -D1 issue.
Add ETA.
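
For readers unfamiliar with how such an ETA can be derived: a Lucas-Lehmer test of 2^p - 1 needs p - 2 squaring iterations, so the remaining time is roughly the remaining iteration count multiplied by the measured time per iteration. The Python sketch below illustrates that arithmetic only. It is an assumption about the general approach, not CUDALucas's actual code, and the function names and sample figures are hypothetical.

```python
def ll_eta_seconds(exponent, iterations_done, msec_per_iteration):
    """Estimated seconds left in a Lucas-Lehmer test of 2**exponent - 1.

    The test performs exponent - 2 iterations in total; the estimate assumes
    the per-iteration time stays constant for the rest of the run.
    """
    total_iterations = exponent - 2
    remaining = max(total_iterations - iterations_done, 0)
    return remaining * msec_per_iteration / 1000.0


def format_eta(seconds):
    """Render a duration in seconds as 'Dd HH:MM:SS'."""
    whole = int(round(seconds))
    days, rest = divmod(whole, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical figures for illustration only, not measured values.
example_exponent = 26_000_003
example_done = 1_000_000
example_msec_per_iteration = 5.0
print(format_eta(ll_eta_seconds(example_exponent, example_done,
                                example_msec_per_iteration)))
```

In practice the per-iteration time would come from a running average over recent iterations, so the estimate adjusts if the GPU is shared or throttled.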

LaurV 2012-02-09 15:25

Thanks msft.
@Brain: could we have some builds? (I am interested in the same 4.0/2.0.) Then I will try a couple of DCs and LLs (I will have a GTX 580 GPU for evaluation for a few days, besides my regular cards).
Thanks in advance.

Brain 2012-02-09 19:28

1.49 Win64 SM 2.0 CUDA 4.0 compile, untested.
 
1 Attachment(s)
[QUOTE=LaurV;288766]Thanks msft.
@Brain: could we have some builds? (I am interested in the same 4.0/2.0.) Then I will try a couple of DCs and LLs (I will have a GTX 580 GPU for evaluation for a few days, besides my regular cards).
Thanks in advance.[/QUOTE]
Here we go.

Brain 2012-02-09 19:31

1.49 Win64 SM 2.1 CUDA 4.1 compile, untested.
 
1 Attachment(s)
Here we go again.


All times are UTC.
