mersenneforum.org  

2011-02-02, 14:05   #89
msft
Jul 2009
Tokyo

262₁₆ Posts

3*2^414840-1 is prime! Time : 210.625 sec.
3*2^382449+1 is prime! Time : 190.457 sec.
Attached Files
File Type: bz2 llrcuda.0.38.tar.bz2 (105.3 KB, 209 views)

2011-02-02, 17:05   #90
msft
Jul 2009
Tokyo

2·5·61 Posts

llrcuda.0.39$ ./llrCUDA -d -q5*2^3569154-1
Starting Lucas Lehmer Riesel prime test of 5*2^3569154-1
Using real irrational base DWT, FFT length = 524288
V1 = 4 ; Computing U0...done.
5*2^3569154-1, iteration : 60000 / 3569154 [1.68%]. Time per iteration : 2.520 ms.
Attached Files
File Type: bz2 llrcuda.0.39.tar.bz2 (105.4 KB, 206 views)

2011-02-03, 02:37   #91
msft
Jul 2009
Tokyo

2×5×61 Posts

llrcuda.0.40$ ./llrCUDA -q"5*2^15111112-1" -d
Starting Lucas Lehmer Riesel prime test of 5*2^15111112-1
Using real irrational base DWT, FFT length = 2097152
V1 = 4 ; Computing U0...done.
5*2^15111112-1, iteration : 20000 / 15111112 [0.13%]. Time per iteration : 10.593 ms.


llrcuda.0.40$ ./llrCUDA -q"5*2^15111113+1" -d
Starting Proth prime test of 5*2^15111113+1
5*2^15111113+1, bit: 20000 / 15111115 [0.13%]. Time per bit: 10.661 ms.


Save files are not supported.
Attached Files
File Type: bz2 llrcuda.0.40.tar.bz2 (105.6 KB, 209 views)

2011-02-03, 04:30   #92
Ken_g6
Jan 2005
Caught in a sieve

18B₁₆ Posts

Quote:
Originally Posted by henryzz
Quote:
Originally Posted by frmky
And a larger number where the GPU shines...

Code:
./llrCUDA -d -q938237*2^3752950-1
Starting Lucas Lehmer Riesel prime test of 938237*2^3752950-1
Using real irrational base DWT, FFT length = 1048576
V1 = 4 ; Computing U0...done.
938237*2^3752950-1 is prime!  Time : 10740.747 sec.
How long would that take on a PC?

With the current Linux LLR client from PrimeGrid, on a Core 2 @ 3 GHz, I estimate about 27,000 seconds.

When checkpointing is working again, I plan to look into this for Seventeen or Bust!

2011-02-03, 16:50   #93
henryzz
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)

3·23·89 Posts

Quote:
Originally Posted by Ken_g6
With the current Linux LLR client from PrimeGrid, on a Core 2@3GHz, I estimate about 27,000 seconds.

When checkpointing is working again, I plan to look into this for Seventeen or Bust!

Nice speed.

2011-02-03, 19:32   #94
mdettweiler
A Sunny Moo
Aug 2007
USA

1100010011010₂ Posts

Finally--I got llrCUDA working on Gary's GPU! (Thanks nuggetprime for the pointers on getting the drivers installed!)

Firstly:
Code:
gary@herford:~/Desktop/gpu-stuff/llrcuda$ ./llrCUDA -d -q3*2^382449+1
Starting Proth prime test of 3*2^382449+1
3*2^382449+1 is prime!  Time : 183.300 sec.
And then, with frmky's modifications to allow it to use IBDWT:
Code:
gary@herford:~/Desktop/gpu-stuff/llrcuda/experimental$ ./llrCUDA -d -q1623*2^481564-1
Starting Lucas Lehmer Riesel prime test of 1623*2^481564-1
Using real irrational base DWT, FFT length = 131072
V1 = 3 ; Computing U0...done.
1623*2^481564-1 is prime!  Time : 377.385 sec.
gary@herford:~/Desktop/gpu-stuff/llrcuda/experimental$ ./llrCUDA -d -q1623*2^481564+1
Starting Proth prime test of 1623*2^481564+1
1623*2^481564+1 is not prime.  Proth RES64: 5B6D67B9B4648D0E  Time : 387.913 sec.
The last two both took about 200 seconds on a modern CPU, so there's definitely a slowdown due to the bigger k. But llrCUDA got the residue right!

2011-02-04, 08:55   #95
msft
Jul 2009
Tokyo

1001100010₂ Posts

Quote:
Originally Posted by mdettweiler
And then, with frmky's modifications to allow it to use IBDWT:

How do you do that?

Last fiddled with by msft on 2011-02-04 at 08:59

2011-02-04, 16:36   #96
mdettweiler
A Sunny Moo
Aug 2007
USA

2·47·67 Posts

Quote:
Originally Posted by msft
How do you do that?

See here:

Quote:
Originally Posted by frmky
Code:
Starting Lucas Lehmer Riesel prime test of 2303*2^251634-1
Using real irrational base DWT, FFT length = 65536
V1 = 4 ; Computing U0...done.
2303*2^251634-1 is not prime.  LLR Res64: 99B680F2A92ECC27  Time : 112.600 sec.
However, to get this I had to modify set_fftlen() in gwpnumi.cu to remove the checks fftdwt > fftzpad and bpw < 5.0. The values of these parameters are fftdwt = 65536, fftzpad = 32768, and bpw = 4.83979, so in version 0.34 both of these checks fail and the zero-padded rational base DWT, which runs about 20x slower, is used.

Edit: A second run:
Code:
./llrCUDA -d -q1000065*2^390927-1
Starting Lucas Lehmer Riesel prime test of 1000065*2^390927-1
Using real irrational base DWT, FFT length = 131072
V1 = 5 ; Computing U0...done.
1000065*2^390927-1, iteration : 40000 / 390927 [10.23%].  Time per iteration : 0.627 ms
1000065*2^390927-1 is prime!  Time : 246.393 sec.
Of course Jean Penné can now tell me what I have broken by doing this.

Edit 2: And a Proth test:
Code:
./llrCUDA -d -q9999*2^458051+1   
Starting Proth prime test of 9999*2^458051+1
9999*2^458051+1 is prime!  Time : 313.916 sec.

2011-02-04, 19:53   #97
Jean Penné
May 2004
FRANCE

624₁₀ Posts
Forcing IBDWT?

Hi,

At present, you have a strong interest in having set_fftlen() choose the IBDWT FFT!
So, you may remove the fftdwt > fftzpad test... For the second test, I think you still need to verify that bpw does not become too small, although the minimum value may be smaller than 5.0. This check is necessary to avoid entering an endless loop in which fftdwt grows larger and larger while bpw grows smaller and smaller...
In any case, bpw must be at least one!
Also, be aware that forcing IBDWT increases the risk of a round-off error, so I suggest you set the ErrorCheck=1 option.
But the real long-term solution is for modred() and generic_modred() to benefit from CUDA parallelism!

I hope all that may help you!

Jean

2011-02-04, 22:07   #98
ltd
Apr 2003

2²·193 Posts

I am trying to get it to compile with VS2008.
So far I have gotten it to compile, but after it runs for a while I get an error in gwpnumi.cu at line 1688:

cutilSafeCall(cudaMemcpy(x,g_x,sizeof(double)*(N),cudaMemcpyDeviceToHost));

I will try to find out tomorrow what is happening.

2011-02-04, 22:49   #99
Ken_g6
Jan 2005
Caught in a sieve

5·79 Posts

Let me guess, you're running 260.99 or higher drivers? And an "unknown error" is returned? That sounds very reminiscent of this known error on Linux. I think it's the same error that Einstein@Home is waiting to see fixed so that it can use less CPU.

I know of two ways to deal with it. One, don't use blocking sync. Two, when it errors out with that particular error, retry the memcpy again and again until it succeeds; that seems to work too.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
