mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

msft 2012-02-13 06:01

[QUOTE=kjaget;289144]They're missing on lines 1525 and 1587 of rw.cu in the version 1.49 I downloaded from post 723. Also, the one at line 1494 is newer than my old code, so it may or may not need one as well. At least that's the big difference between my version 1.2.whatever and the current builds. Not sure if you're updating the source before building which may explain why you're seeing it there.[/QUOTE]
I deleted that line from rw.cu as part of the fix for the #673 issue.
The line checked the FFT length against the old MacLucasFFTW.

ckdo 2012-02-13 12:59

[QUOTE=Brain;289141]I'm ashamed for this question but does someone feed CUDALucas with something else than the command line options? In other words, is it possible to let CL immediately start the next expo when finishing the latest?[/QUOTE]

As in ... ? :unsure:

[code]./CUDALucas [options] [expo1] [expo2] [expo3] [...][/code]

msft 2012-02-13 13:28

1 Attachment(s)
[QUOTE=kladner;289153]I don't currently run CUDALucas. However, there were discussions a while back of loading a stack of assignment command lines into a batch file to feed CL. [/QUOTE]
Fixed this issue.

LaurV 2012-02-14 00:42

C'mon, you release a new version just when the old one completed another two successful tests :smile:. Should I start testing the new one now, or what? :P

[CODE]Processing result: M( 26027971 )C, 0xff6f4a6a0121131f, n = 1572864, CUDALucas v1.49
LL test successfully completes double-check of M26027971
CPU credit is 25.8834 GHz-days.
Processing result: M( 26176511 )C, 0xa3af29f3466535cf, n = 1572864, CUDALucas v1.49
LL test successfully completes double-check of M26176511
CPU credit is 26.0311 GHz-days.[/CODE]
P.S. binaries?

flashjh 2012-02-14 04:00

[QUOTE=LaurV;289323]P.S. binaries?[/QUOTE]

Eagerly awaiting... :smile:

Brain 2012-02-14 11:28

1 Attachment(s)
1.50 Win64 CUDA 4.0 SM 2.0 compile, untested.

Brain 2012-02-14 11:32

1 Attachment(s)
1.50 Win64 CUDA 4.1 SM 2.1 compile, untested.

Brain 2012-02-14 18:20

ETA, nice...
 
1 Attachment(s)
We now have a nice, well-formatted ETA in 1.50.

flashjh 2012-02-14 19:31

[QUOTE=msft;289269]Fix this issue.[/QUOTE]


[QUOTE=Brain;289361]1.50 Win64 CUDA 4.1 SM 2.1 compile, untested.[/QUOTE]

[QUOTE=Brain;289379]We now have a nice, well-formatted ETA in 1.50.[/QUOTE]

Thank you!

LaurV 2012-02-15 02:13

Version 1.50 successfully proved that both 2^756839-1 and 2^859433-1 are prime, and it also tested 3 other DCs in the 26M range and confirmed the residues (partial tests, resumed about halfway through). I think I will switch to it for "production".

The improvements I see compared to 1.48 are the nice ETA in HMS format, the new "batch mode" feature, and, most important, correctly identifying and assigning work to the right GPU in a multi-GPU system (otherwise it wouldn't make much sense, would it? :D).

As for what is still unfixed: the -c switch still does not affect the screen output (it displays every 10k iterations regardless of the given parameter; I know this was fixed in some earlier versions, like 1.3), and of course the -? option still displays the same gibberish.

Speed-wise, this version is on average about as fast as the older versions. Most probably what is gained from the reduced FFT size is lost by going to the slower drivers. In fact, [B]about one third of the expos will run slower[/B] (those whose FFT size is about the same as in the power-of-two version, where the FFT was reduced only a little or not at all), [B]about another third will run at the same speed[/B], and [B]the last third[/B], where the FFT size was really reduced, [B]will run faster[/B]. For the range of expos we are currently dealing with for DCs (the 26M range), the speed is about [B]8% faster[/B] (2 x gtx580, slightly overclocked, 5.12 ms/iter instead of 5.55 with v1.3alpha_eoc).
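
To put the two timings above in perspective, here is a small sketch of the arithmetic. It assumes the quoted ms/iter figures apply to a single test and that an LL test of M(p) needs p - 2 iterations; the function names are made up for illustration and have nothing to do with the CUDALucas source.

```python
def speedup_percent(old_ms, new_ms):
    """Throughput gain in percent when the iteration time drops from old_ms to new_ms."""
    return (old_ms / new_ms - 1.0) * 100.0


def time_saved_percent(old_ms, new_ms):
    """Reduction in wall-clock time, in percent, for the same amount of work."""
    return (1.0 - new_ms / old_ms) * 100.0


def eta_hms(exponent, ms_per_iter):
    """Estimated full-test duration as H:MM:SS, assuming exponent - 2 LL iterations."""
    iterations = exponent - 2
    total_seconds = int(iterations * ms_per_iter / 1000.0)
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    old_ms, new_ms = 5.55, 5.12  # timings quoted in this post
    print(f"throughput gain: {speedup_percent(old_ms, new_ms):.1f}%")
    print(f"time saved:      {time_saved_percent(old_ms, new_ms):.1f}%")
    # M26027971 is one of the double-checks reported earlier in the thread.
    print("ETA at 5.55 ms/iter:", eta_hms(26027971, old_ms))
    print("ETA at 5.12 ms/iter:", eta_hms(26027971, new_ms))
```

Note that "8% faster" depends on which ratio you take: the throughput gain and the time saved are slightly different numbers for the same pair of timings.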

Thanks a lot Msft and Brain.

flashjh 2012-02-16 01:07

I just finished another exponent with 1.50 and it force closed again. Maybe it's my batch file?

[CODE]
e:
cd e:\cuda
e:\cuda\cuda150 -t -c10000 26206XXX
e:\cuda\cuda150 -t -c10000 26206XXX
[/CODE]

Can anyone confirm whether 1.50 is going on to the next exponent for them, and how you're setting up your batch file?

Thanks
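
For comparison, a sequential driver can also be sketched outside of a batch file. This is only an illustration, not how CUDALucas itself handles batch mode: the helper names are hypothetical, the executable path and the -t and -c10000 options are copied from the batch file above, and the exponents are placeholders taken from results posted earlier in the thread. The point is that each exponent gets its own process, so a crash at the end of one test does not stop the next one from starting.

```python
import subprocess


def build_commands(exe, exponents, options=("-t", "-c10000")):
    """Build one command line per exponent (options as used in the batch file above)."""
    return [[exe, *options, str(p)] for p in exponents]


def run_queue(commands, runner=subprocess.call):
    """Run the commands one after another and record each exit code.

    A nonzero exit code (or a failure to launch, recorded as None) is noted
    but does not stop the queue, so the next exponent still gets started.
    """
    results = []
    for cmd in commands:
        try:
            code = runner(cmd)
        except OSError:
            code = None  # executable missing or not launchable
        results.append((cmd[-1], code))
    return results


if __name__ == "__main__":
    # Placeholder exponents; replace with your own assignments.
    cmds = build_commands(r"e:\cuda\cuda150", [26027971, 26176511])
    # Dry run: print each command line instead of launching anything.
    for exponent, code in run_queue(cmds, runner=lambda c: print(" ".join(c)) or 0):
        print(exponent, "->", code)
```

Dropping the runner argument makes run_queue launch the real executable through subprocess.call.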

