Hi,
CUFFT benchmark with CUDA3.0
[code]
CUFFT_Z2Z size=2048 k time=3.398650 msec
CUFFT_D2Z size=2048 k time=8.938929 msec
CUFFT_Z2Z size=2560 k time=14.965907 msec
CUFFT_D2Z size=2560 k time=42.769253 msec
CUFFT_Z2Z size=3072 k time=18.074261 msec
CUFFT_D2Z size=3072 k time=51.326923 msec
CUFFT_Z2Z size=3584 k time=23.994490 msec
CUFFT_D2Z size=3584 k time=65.803909 msec
CUFFT_Z2Z size=4096 k time=6.790013 msec
CUFFT_D2Z size=4096 k time=19.422380 msec
[/code]
Useless Ver 1.48 with CUDA3.0. |
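To make the timings above easier to compare, here is a minimal Python sketch that parses the posted lines and prints the D2Z/Z2Z ratio per size, flagging which sizes are powers of two. The helper names (`parse`, `is_power_of_two`) are illustrative and not part of CUDALucas; only the numbers come from the post. In this run D2Z took roughly 2.6 to 2.9 times as long as Z2Z at every size, and the three non-power-of-two sizes were slower than 4096K.

```python
import re

# Timings copied verbatim from the post (size in K, time in msec).
RAW = """
CUFFT_Z2Z size=2048 k time=3.398650 msec
CUFFT_D2Z size=2048 k time=8.938929 msec
CUFFT_Z2Z size=2560 k time=14.965907 msec
CUFFT_D2Z size=2560 k time=42.769253 msec
CUFFT_Z2Z size=3072 k time=18.074261 msec
CUFFT_D2Z size=3072 k time=51.326923 msec
CUFFT_Z2Z size=3584 k time=23.994490 msec
CUFFT_D2Z size=3584 k time=65.803909 msec
CUFFT_Z2Z size=4096 k time=6.790013 msec
CUFFT_D2Z size=4096 k time=19.422380 msec
"""

LINE = re.compile(r"CUFFT_(Z2Z|D2Z) size=(\d+) k time=([\d.]+) msec")


def parse(raw):
    """Return {size_in_K: {"Z2Z": msec, "D2Z": msec}} (hypothetical helper)."""
    times = {}
    for kind, size, msec in LINE.findall(raw):
        times.setdefault(int(size), {})[kind] = float(msec)
    return times


def is_power_of_two(n):
    """True when n has exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


times = parse(RAW)
for size in sorted(times):
    z2z, d2z = times[size]["Z2Z"], times[size]["D2Z"]
    tag = "power of 2" if is_power_of_two(size) else "not a power of 2"
    print(f"{size:5d}K  Z2Z {z2z:9.3f} ms  D2Z {d2z:9.3f} ms  "
          f"D2Z/Z2Z {d2z / z2z:.2f}  ({tag})")
```

The script only restates the posted measurements; it does not run any FFT and says nothing about other GPUs or CUDA versions.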
[QUOTE=msft;287656]Hi,
CUFFT benchmark with CUDA3.0
[code]
CUFFT_Z2Z size=2048 k time=3.398650 msec
CUFFT_D2Z size=2048 k time=8.938929 msec
CUFFT_Z2Z size=2560 k time=14.965907 msec
CUFFT_D2Z size=2560 k time=42.769253 msec
CUFFT_Z2Z size=3072 k time=18.074261 msec
CUFFT_D2Z size=3072 k time=51.326923 msec
CUFFT_Z2Z size=3584 k time=23.994490 msec
CUFFT_D2Z size=3584 k time=65.803909 msec
CUFFT_Z2Z size=4096 k time=6.790013 msec
CUFFT_D2Z size=4096 k time=19.422380 msec
[/code]
Useless Ver 1.48 with CUDA3.0.[/QUOTE]
Understood :sad: Let's wait for a GTX 600 GPU. It's worth a post in the happy thread.
Thank you for your quick answer and for all your work! :bow:
Luigi |
Versions 1.4x seem to have trouble on multi-GPU configs (it seems the -D switch does not work with SLI: all copies start on the same GPU, regardless of what -Dx I use). Versions 1.2b and 1.3alpha work well (and are still faster, in spite of the "powers of 2" FFT).
edit: any chance of getting a 4.0/2.0 win64 build? (like for a Tesla?) And any chance of getting real/ETA times, like for v1.3alpha_eoc? Thanks a lot in advance! |
[QUOTE=LaurV;288466]Versions 1.4x seem to have trouble on multi-GPU configs (it seems the -D switch does not work with SLI: all copies start on the same GPU, regardless of what -Dx I use). Versions 1.2b and 1.3alpha work well (and are still faster, in spite of the "powers of 2" FFT).
edit: any chance of getting a 4.0/2.0 win64 build? (like for a Tesla?) And any chance of getting real/ETA times, like for v1.3alpha_eoc? Thanks a lot in advance![/QUOTE]
I discovered that in Crossfire or SLI (or even a regular setup) I had to use -D00 or -D01 instead of -D0 or -D1. |
[QUOTE=flashjh;288468]I discovered that in Crossfire or SLI (or even regular) I had to use -D00 or -D01 instead of -D0 or -D1.[/QUOTE]
Noted for GPU guide.
[QUOTE=LaurV;288466]edit: any chances to get a 4.0/2.0 win64 build? (like for a Tesla?) And any chances to get real/ETA times, like for v1.3alpha_eoc?[/QUOTE]
Please specify what you mean by 4.0/2.0 (CUDA 4.0/CC 2.0?). I'm just compiling and only have CUDA 4.0 and 4.1 installed. Although I'm not happy with the 5% performance slowdown, I won't step back, for the sake of future readiness. I'm hoping to benefit from the Kepler generation. Currently, only msft is into the code, so I cannot help with that. I hope that I can motivate msft to stay tuned for:
1. performance
2. responsiveness
3. worktodo.txt feature (auto start next expo)
4. ETA feature
5. code readability
:hello:
If anybody is willing to help, we have to sync with msft, as ETA was coded but not merged and has gone again. I'm a software developer but no CUDA expert, and the code is a bit "hard". |
[QUOTE=Brain;288470]
Please specify what you mean by 4.0/2.0 (CUDA 4.0/CC 2.0?)[/QUOTE] I mean (with your notation) CUDALucas.cuda4.0.sm_20.WIN64.exe, or better (to have the version too, I already renamed them something like this) CUDALucas.1.48.cuda4.0.sm_20.WIN64.exe :smile: |
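The naming scheme discussed above (version, CUDA toolkit, compute capability, platform) can be sketched as a small Python helper. This is only an illustration of the pattern in the post; `sm_tag` and `build_name` are hypothetical names, not part of CUDALucas or its build scripts.

```python
def sm_tag(compute_capability):
    """Turn a compute capability such as "2.0" into an "sm_20" style tag."""
    return "sm_" + compute_capability.replace(".", "")


def build_name(version, cuda, compute_capability, platform="WIN64"):
    """Compose a binary name following the pattern quoted in the post."""
    return (f"CUDALucas.{version}.cuda{cuda}."
            f"{sm_tag(compute_capability)}.{platform}.exe")


# The example name from the post: version 1.48, CUDA 4.0, CC 2.0, Win64.
print(build_name("1.48", "4.0", "2.0"))
```

Keeping the version in the name makes it possible to have several builds side by side without renaming them by hand.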
1 Attachment(s)
[QUOTE=LaurV;288494]I mean (with your notation) CUDALucas.cuda4.0.sm_20.WIN64.exe, or better (to have the version too, I already renamed them something like this) CUDALucas.1.48.cuda4.0.sm_20.WIN64.exe :smile:[/QUOTE]
Here it is. |
1 Attachment(s)
Ver 1.49
Fix -D1 issue. Add ETA. |
Thanks msft.
@Brain: could we have some builds? (I am interested in the same 4.0/2.0.) Then I will give it a try on a couple of DCs and LLs (I will have a GTX 580 GPU for evaluation for a few days, besides my regular ones). Thanks in advance. |
1.49 Win64 SM 2.0 CUDA 4.0 compile, untested.
1 Attachment(s)
[QUOTE=LaurV;288766]Thanks msft.
@Brain: could we have some builds? (I am interested in the same 4.0/2.0.) Then I will give it a try on a couple of DCs and LLs (I will have a GTX 580 GPU for evaluation for a few days, besides my regular ones). Thanks in advance.[/QUOTE]
Here we go. |
1.49 Win64 SM 2.1 CUDA 4.1 compile, untested.
1 Attachment(s)
Here we go again.
|