gwpnumi.cu:

[code]//msft
#define MAXBITSPERDOUBLE (double)35
#define MAXBITSPERDOUBLE (double)44[/code]

This number decides the FFT length boundary.

[code]$ ./llrCUDA -d -q237019*2^6100018+1 -oDebug=1
k = 237019.0, b = 2, n = 6100018, c = 1, bit_length = 6100036
FFTLEN = 2097152, bpw = 3.908723, Bits per double = 40.857052, Maxbpd = 44.000000[/code]

llr compares "Bits per double" with MAXBITSPERDOUBLE. You can change MAXBITSPERDOUBLE (at your own risk) :lol:
Eureka! We have a good result with 0.59:
[code]gary@herford:~/Desktop/gpu-stuff/llrcuda$ ./llrCUDA -d -q237019*2^6100018+1
Resuming Proth prime test of 237019*2^6100018+1 at bit 1439745 [23.60%]
237019*2^6100018+1 is not prime.  Proth RES64: 34687837ED148D74  Time : 63455.118 sec.[/code]

It sure took its sweet time compared to earlier versions (17+ hours, if I'm correctly assuming that llrCUDA behaves like regular LLR in that it saves the Time figure when stopping and resuming) due to the larger FFT, but it got the result correct. And it still took somewhat less time than ltd's original CPU result:

[code][2010-12-24 19:44:47 WEST] Candidate: 237019*2^6100018+1
Program: llr.exe
Residue: 34687837ED148D74
Time: 95109 seconds[/code]

Now the big question: is there a way to recoup the speed of earlier versions without sacrificing result quality? :smile:
Support affinity.
GTX460:
4847*2^3321063+1 is prime! Time : 8745.412 sec.
5359*2^5054502+1 is prime! Time : 26657.372 sec.
19249*2^13018586+1 is prime! Time : 139584.549 sec. :smile:
[QUOTE=msft;253657]GTX460:
4847*2^3321063+1 is prime! Time : 8745.412 sec.
5359*2^5054502+1 is prime! Time : 26657.372 sec.
19249*2^13018586+1 is prime! Time : 139584.549 sec. :smile:[/QUOTE]
Your GPU did a task at the Seventeen or Bust level in less than [B]two days?![/B] :w00t: I have got to find the time to port this to BOINC soon! :bow:
[QUOTE=Ken_g6;253795]Your GPU did a task at the Seventeen or Bust level in less than [B]two days?![/B] :w00t: I have got to find the time to port this to BOINC soon! :bow:[/QUOTE]
Nice :smile:. Two "current" SoBs I ran in January required 5.7 days on my Q9550 @ 3.4 GHz (492,282.90 s wall clock / 489,230.18 s CPU time; the full 6 MB of L2 cache reserved for each task via taskset process affinity to speed things up a little).
27653*2^9167433+1 is prime! Time : 98054.220 sec.
llrCUDA is a great match for SoB.
[QUOTE=Ken_g6;253795]Your GPU did a task at the Seventeen or Bust level in less than [B]two days?![/B] :w00t: I have got to find the time to port this to BOINC soon! :bow:[/QUOTE]
That's not all. There are beasts like the GTX 480 and GTX 580 :wink:
Q9550 @3.4 GHz / GTX 460 @ 725 MHz / CUDA SDK 3.1 / Driver Version 256.53
[code]Starting Proth prime test of 19249*2^13018586+1
19249*2^13018586+1, bit: 40000 / 13018600 [0.30%]. Time per bit: 9.593 ms. Estimated runtime: 34.7 hours[/code]

No significant change with CUDA SDK 3.2 and driver version 270.26.
Once there is a Win build, I'll try the GTX 580.
Any change in CPU usage with those huge numbers?
Stock GTX 480 using CUDA 3.2
[code]Starting Proth prime test of 19249*2^13018586+1
19249*2^13018586+1, bit: 140000 / 13018600 [1.07%]. Time per bit: 5.147 ms. Estimated runtime: 18.6 hours.[/code]