mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

Brain 2012-02-22 18:57

Back from job
 
1 Attachment(s)
CUDALucas 1.54 compiled for Win64 4.0/2.0 and 4.1/2.1 and new Makefile for Win64.
All untested.

msft 2012-02-22 20:18

Hi, LaurV
The residues are the same as MLucas's.
Thank you,

Brain 2012-02-23 05:27

Perfect
 
CUDALucas vs Residues: 12:0, another good DC with 1.49/1.50:
[CODE]Processing result: M( 50806499 )C, 0x1d11ecca3b052704, n = 3145728, CUDALucas v1.50 LL test successfully completes double-check of M50806499[/CODE]
Switching to DCs with 1.54.
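For readers new to these result lines: the 64-bit hex value is the residue, i.e. the low 64 bits of the final Lucas-Lehmer term, and a double-check passes when two independent runs report the same residue. Below is a minimal Python sketch of that idea. The names `ll_residue` and `residues_match` are made up for illustration, and it uses plain big-integer arithmetic instead of the FFT multiplication CUDALucas relies on, so it is only practical for tiny exponents.

```python
def ll_residue(p: int) -> int:
    """Low 64 bits of the final Lucas-Lehmer term for M(p) = 2**p - 1.

    Illustrative only: real LL programs square with FFTs, not Python ints.
    """
    if p < 3:
        raise ValueError("this sketch expects an odd prime exponent p >= 3")
    m = (1 << p) - 1          # the Mersenne number M(p)
    s = 4                     # standard LL starting value
    for _ in range(p - 2):    # p - 2 squarings in total
        s = (s * s - 2) % m
    return s & 0xFFFFFFFFFFFFFFFF


def residues_match(first_run: int, second_run: int) -> bool:
    """A double-check succeeds when both runs end on the same residue."""
    return first_run == second_run


def format_residue(residue: int) -> str:
    """Format a residue the way the result lines show it (16 hex digits)."""
    return f"0x{residue:016x}"
```

A residue of zero means M(p) is prime (for example p = 7 or p = 13); a composite such as M(11) ends on a nonzero residue, which is what gets compared between the first test and the double-check.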

Brain 2012-02-23 06:13

Gone?
 
Hi msft,

At first sight it looks like the checkpoint file no longer exists per exponent; there is only cudalucas.ckpt. I'd prefer the former way.
Additionally, could we also have a second backup file, maybe like it was before: c<exponent>, t<exponent>? I needed it once.
I like CL being more compact with 1 single file.
Thanks for your work.

By the way, to the others: the -c and -t switches are gone. I assume checkpoints are now hard-coded to every 10000 iterations.
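To make the request concrete, here is a rough Python sketch of per-exponent checkpoints with one backup generation. It is not CUDALucas code: the function names are hypothetical, the payload is treated as opaque bytes, and the choice of `c<exponent>` as the current file and `t<exponent>` as the backup is only an assumption based on the names mentioned above.

```python
import os
from pathlib import Path
from typing import Optional


def save_checkpoint(directory, exponent: int, payload: bytes) -> Path:
    """Write c<exponent>, keeping the previous checkpoint as t<exponent>."""
    directory = Path(directory)
    current = directory / f"c{exponent}"
    backup = directory / f"t{exponent}"
    tmp = directory / f"c{exponent}.tmp"

    # Write to a temporary file first so a crash never leaves a torn checkpoint.
    tmp.write_bytes(payload)
    if current.exists():
        os.replace(current, backup)   # old checkpoint becomes the backup
    os.replace(tmp, current)          # new checkpoint becomes current
    return current


def load_checkpoint(directory, exponent: int) -> Optional[bytes]:
    """Return the newest available checkpoint, or None if there is none.

    A real program would also validate the contents before trusting them.
    """
    for name in (f"c{exponent}", f"t{exponent}"):
        path = Path(directory) / name
        if path.exists():
            return path.read_bytes()
    return None
```

With this layout several exponents can share one working directory, and a damaged or missing current file still leaves the previous checkpoint to resume from.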

msft 2012-02-23 07:40

1 Attachment(s)
Hi,
Ver 1.55
Changed the checkpoint file name.
Changed the print format.
[code]
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.55 (2.6 msec/Iter ETA 8:55)
[/code]
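In case anyone wants to post-process this output, here is a small Python sketch that parses a line in the format shown above. The regular expression is written against this single sample line only, and the ETA formula (remaining iterations times msec/Iter, with p - 2 iterations in total) is an assumption: it happens to reproduce the 8:55 shown here, but the thread does not say how CUDALucas computes it. The function names are hypothetical.

```python
import re

# Pattern written against the one sample line in this post; other versions may differ.
PROGRESS_RE = re.compile(
    r"Iteration\s+(?P<iteration>\d+)\s+"
    r"M\(\s*(?P<exponent>\d+)\s*\)C,\s+"
    r"0x(?P<residue>[0-9a-fA-F]{16}),\s+"
    r"n\s*=\s*(?P<fft_length>\d+),\s+"
    r"CUDALucas v(?P<version>[\d.]+)\s+"
    r"\((?P<ms_per_iter>[\d.]+) msec/Iter ETA (?P<eta>[\d:]+)\)"
)


def parse_progress(line: str) -> dict:
    """Extract the fields of one progress line into a dict."""
    match = PROGRESS_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a progress line")
    fields = match.groupdict()
    for key in ("iteration", "exponent", "fft_length"):
        fields[key] = int(fields[key])
    fields["ms_per_iter"] = float(fields["ms_per_iter"])
    return fields


def estimate_eta(fields: dict) -> str:
    """Assumed formula: (p - 2 - iteration) * msec/Iter, shown as [h:]m:ss."""
    remaining = (fields["exponent"] - 2) - fields["iteration"]
    seconds = int(remaining * fields["ms_per_iter"] / 1000.0)
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"
```

Feeding it the sample line gives exponent 216091, iteration 10000, and an estimate of 8:55, matching the ETA printed by the program.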

Karl M Johnson 2012-02-23 11:06

I tried the latest binary Brain provided, and no matter what, I cannot assign work to the second GPU.
I have 2 GTX 480s, and I successfully run a lot of apps on the 2nd GPU while playing (both those that support multi-GPU and those that don't).
I've tried -d 1 (since in CUDA, the 1st GPU is always 0), -d 2 and -d 1.
No matter what, CUDALucas always uses the first GPU.
Is this a bug?

LaurV 2012-02-23 12:33

[QUOTE=Karl M Johnson;290539]I tried the latest binary Brain provided - no matter what, I cannot assign work to second GPU.
I have 2 GTX 480, and I successfully run a lot of apps on 2nd GPU while playing(both those, which support multi-gpu and dont).
I've tried -d 1(since in CUDA, 1st GPU is always 0), -d 2 and -d 1.
No matter what, CUDALucas always uses first GPU.
Is this a bug ?[/QUOTE]
Unlike mfaktc, CudaLucas should have no space between the D and the 0 (or 1, 2, etc.), and you must use uppercase (lowercase -d is supposed to do some trick with the output, as explained in the help, which never worked or was useful).

So, you have to try something like
[CODE]>CudaLucas.xx.bla.bla.bla -c30000 -t -D0 240007
[/CODE]
For me it worked.

msft 2012-02-23 12:47

Hi, Karl M Johnson
[QUOTE=Karl M Johnson;290539]I tried the latest binary Brain provided - no matter what, I cannot assign work to second GPU.
I have 2 GTX 480, and I successfully run a lot of apps on 2nd GPU while playing(both those, which support multi-gpu and dont).
I've tried -d 1(since in CUDA, 1st GPU is always 0), -d 2 and -d 1.
No matter what, CUDALucas always uses first GPU.
Is this a bug ?[/QUOTE]
Can you try?
[code]
$ ./CUDALucas -d 1 216091
[/code]

Brain 2012-02-23 18:29

1.55 Win64 CUDA 4.0/4.1 SM 2.0/2.1 compile, untested.
 
1 Attachment(s)
[QUOTE=msft;290524]Hi ,
Ver 1.55
change check point file name.
change print format.
[/QUOTE]
1.55 Win64 CUDA 4.0 / SM 2.0 & CUDA 4.1 SM 2.1 compiles, both untested.

Karl M Johnson 2012-02-23 22:39

[QUOTE=msft;290549]Hi, Karl M Johnson

Can you try?
[code]
$ ./CUDALucas -d 1 216091
[/code][/QUOTE]

Haha!:smile:
It works, thanks!
The most important detail I forgot to mention: the device flag doesn't work together with the "-r" flag, which is the residue test.
But for a regular exponent, everything's fine.

apsen 2012-02-23 23:23

1 Attachment(s)
[QUOTE=msft;290524]
Ver 1.55
[/QUOTE]

I've modified this version as follows:

1) Changed the time calculation logic to use more precise time functions. The time-per-iteration calculation is now precise to better than 0.1 msec.

2) The SDK samples dependencies are now in an included file, so there is no more need to pull the SDK samples to compile.

3) Merged in Etan's transpose optimization. It seems to give some marginal improvement, but the ms/iter precision of the previous version is not good enough to draw conclusions.
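As a rough illustration of point 1 (not the actual change, which lives in the CUDALucas source and is not shown here), this Python sketch times a batch of iterations with a high-resolution clock and divides by the iteration count. `time_per_iteration_ms` and the `step` callback are hypothetical stand-ins; for real GPU work the device would also have to be synchronized before each clock read, since kernel launches return before the kernel finishes.

```python
import time


def time_per_iteration_ms(step, iterations: int) -> float:
    """Average wall-clock time per call of step(), in milliseconds.

    Timing a whole batch with a high-resolution clock keeps the per-iteration
    figure meaningful even when a single iteration takes only a few msec.
    """
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed = time.perf_counter() - start   # seconds, fractional
    return elapsed * 1000.0 / iterations


def dummy_step() -> None:
    """Stand-in workload: one big-integer squaring."""
    _ = (3 ** 2000) ** 2
```

Calling `time_per_iteration_ms(dummy_step, 10000)` reports a fractional msec value, which is the kind of resolution needed to judge a marginal optimization like the one in point 3.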


BTW, the Ctrl-C handler lost in 1.52 needs to be brought back :-(

