mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   llrCUDA (https://www.mersenneforum.org/showthread.php?t=14608)

msft 2011-02-22 04:41

gwpnumi.cu:
//msft #define MAXBITSPERDOUBLE (double)35
#define MAXBITSPERDOUBLE (double)44

This number decides the FFT length boundary.

$ ./llrCUDA -d -q237019*2^6100018+1 -oDebug=1
k = 237019.0, b = 2, n = 6100018, c = 1, bit_length = 6100036
FFTLEN = 2097152, bpw = 3.908723, Bits per double = 40.857052, Maxbpd = 44.000000

llr compares "Bits per double" with MAXBITSPERDOUBLE.

You can change MAXBITSPERDOUBLE (at your own risk). :lol:
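
The comparison described above can be checked against the debug line itself. This is a minimal sketch, assuming the check is a plain less-than-or-equal test (the actual test in gwpnumi.cu may differ). `parse_debug_line` and `within_limit` are hypothetical helper names and are not part of llrCUDA.

```python
# Debug line copied from the -oDebug=1 output above.
DEBUG_LINE = ("FFTLEN = 2097152, bpw = 3.908723, "
              "Bits per double = 40.857052, Maxbpd = 44.000000")


def parse_debug_line(line):
    """Split 'name = value, name = value' pairs into a dict of floats."""
    fields = {}
    for part in line.split(","):
        name, _, value = part.partition("=")
        fields[name.strip()] = float(value)
    return fields


def within_limit(bits_per_double, max_bpd):
    """Assumed form of the check: the FFT length is accepted if the
    bits stored per double do not exceed the configured maximum."""
    return bits_per_double <= max_bpd


info = parse_debug_line(DEBUG_LINE)
for limit in (44.0, 35.0):  # the active value and the commented-out value
    ok = within_limit(info["Bits per double"], limit)
    print(f"limit {limit}: FFTLEN {int(info['FFTLEN'])} accepted = {ok}")
```

With the limit at 44, the reported 40.857052 passes at FFTLEN = 2097152. With the commented-out value of 35 it would not, so that setting would presumably push the test to a longer FFT.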

mdettweiler 2011-02-22 18:30

Eureka! We have a good result with 0.59:
[code]
gary@herford:~/Desktop/gpu-stuff/llrcuda$ ./llrCUDA -d -q237019*2^6100018+1
Resuming Proth prime test of 237019*2^6100018+1 at bit 1439745 [23.60%]


237019*2^6100018+1 is not prime. Proth RES64: 34687837ED148D74 Time : 63455.118 sec.
[/code]
It sure took its sweet time compared to earlier versions because of the larger FFT (17+ hours, assuming llrCUDA behaves like regular LLR and keeps the accumulated Time figure across a stop and resume), but it got the result correct. It still took somewhat less time than ltd's original CPU result:
[code]
[2010-12-24 19:44:47 WEST] Candidate: 237019*2^6100018+1 Program: llr.exe Residue: 34687837ED148D74 Time: 95109 seconds
[/code]
Now the big question: is there a way to recoup the speed of earlier versions without sacrificing result quality? :smile:
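
For reference, the two runs can be compared with a few lines of arithmetic. This is a sketch using only the figures quoted in this post. It assumes the llrCUDA Time value covers the whole test, including the part before the resume. The function names are illustrative.

```python
# Figures copied from the two log lines above.
GPU_SECONDS = 63455.118          # llrCUDA 0.59
CPU_SECONDS = 95109.0            # ltd's llr.exe run
GPU_RES64 = "34687837ED148D74"
CPU_RES64 = "34687837ED148D74"


def to_hours(seconds):
    """Convert a Time figure in seconds to hours."""
    return seconds / 3600.0


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new run is than the baseline."""
    return baseline_seconds / new_seconds


print(f"GPU: {to_hours(GPU_SECONDS):.1f} h, CPU: {to_hours(CPU_SECONDS):.1f} h")
print(f"speedup: {speedup(CPU_SECONDS, GPU_SECONDS):.2f}x")
print("residues match:", GPU_RES64 == CPU_RES64)
```

On these numbers the GPU run is roughly 1.5 times faster than the CPU run, and the two RES64 values agree.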

msft 2011-02-24 12:39

1 Attachment(s)
The attached version adds affinity support.

msft 2011-02-25 07:14

GTX460:
4847*2^3321063+1 is prime! Time : 8745.412 sec.
5359*2^5054502+1 is prime! Time : 26657.372 sec.
19249*2^13018586+1 is prime! Time : 139584.549 sec.:smile:

Ken_g6 2011-02-26 06:14

[QUOTE=msft;253657]GTX460:
4847*2^3321063+1 is prime! Time : 8745.412 sec.
5359*2^5054502+1 is prime! Time : 26657.372 sec.
19249*2^13018586+1 is prime! Time : 139584.549 sec.:smile:[/QUOTE]

Your GPU did a task at the Seventeen or Bust level in less than [B]two days?![/B] :w00t: I have got to find the time to port this to BOINC soon! :bow:

Ralf Recker 2011-02-26 06:26

[QUOTE=Ken_g6;253795]Your GPU did a task at the Seventeen or Bust level in less than [B]two days?![/B] :w00t: I have got to find the time to port this to BOINC soon! :bow:[/QUOTE]
Nice :smile:. Two "current" SoBs I ran in January required 5.7 days on my Q9550 @3.4 GHz (492,282.90s wall clock / 489,230.18s CPU time, 6 MB L2 cache for each task to speed things up a little bit, process affinity set via taskset).
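
The "less than two days" and "5.7 days" figures can be checked by converting the quoted times. This is a rough sketch only: the CPU tasks were different SoB candidates from the GPU test, so the ratio is indicative rather than a like-for-like benchmark.

```python
SECONDS_PER_DAY = 86400.0

# Times quoted in the posts above.
GPU_SECONDS = 139584.549   # GTX 460, 19249*2^13018586+1
CPU_SECONDS = 492282.90    # Q9550 @ 3.4 GHz, wall clock, one "current" SoB task


def to_days(seconds):
    """Convert seconds to days."""
    return seconds / SECONDS_PER_DAY


print(f"GPU: {to_days(GPU_SECONDS):.2f} days")
print(f"CPU: {to_days(CPU_SECONDS):.2f} days")
print(f"ratio: {CPU_SECONDS / GPU_SECONDS:.1f}x")
```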

msft 2011-02-26 07:11

27653*2^9167433+1 is prime! Time : 98054.220 sec.

llrCUDA is the best match for SoB.

Karl M Johnson 2011-02-26 07:54

[QUOTE=Ken_g6;253795]Your GPU did a task at the Seventeen or Bust level in less than [B]two days?![/B] :w00t: I have got to find the time to port this to BOINC soon! :bow:[/QUOTE]
That's not all.
There are beasts like the GTX 480 and GTX 580. :wink:

Ralf Recker 2011-02-26 08:57

Q9550 @3.4 GHz / GTX 460 @ 725 MHz / CUDA SDK 3.1 / Driver Version 256.53

Starting Proth prime test of 19249*2^13018586+1
19249*2^13018586+1, bit: 40000 / 13018600 [0.30%]. Time per bit: 9.593 ms..

Estimated runtime: 34.7 hours

No significant change with CUDA SDK 3.2 and driver version 270.26

Honza 2011-02-26 09:14

Once there is a Windows build, I'll try it on a GTX 580.
Does CPU usage change with those huge numbers?

frmky 2011-02-26 09:16

Stock GTX 480 using CUDA 3.2

Starting Proth prime test of 19249*2^13018586+1
19249*2^13018586+1, bit: 140000 / 13018600 [1.07%]. Time per bit: 5.147 ms.

Estimated runtime: 18.6 hours.
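
The two runtime estimates above follow from the reported time per bit multiplied by the total bit count. This is a minimal sketch, assuming the rate stays constant for the whole test and ignoring the bits already done. `estimate_hours` is an illustrative name.

```python
TOTAL_BITS = 13018600  # total bit count shown in the progress lines above


def estimate_hours(ms_per_bit, total_bits):
    """Estimated runtime in hours for a constant time per bit."""
    seconds = ms_per_bit * total_bits / 1000.0
    return seconds / 3600.0


for card, ms_per_bit in (("GTX 460", 9.593), ("GTX 480", 5.147)):
    print(f"{card}: {estimate_hours(ms_per_bit, TOTAL_BITS):.1f} hours")
```

This reproduces the 34.7 hours and 18.6 hours quoted in the two posts.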


All times are UTC.
