mersenneforum.org  

2011-02-01, 03:25   #78
msft
Jul 2009
Tokyo
2·5·61 Posts

Hi, mdettweiler
Quote:
Originally Posted by mdettweiler View Post
Cool! That easily covers the k-ranges tested by many of the larger projects (NPLB, RPS, PrimeGrid, even TPS to some degree).
But too slow.
1000065*2^390927-1 is prime! Time : 11376.660 sec.
3*2^303093+1 is prime! Time : 514.715 sec.
3*2^164987-1 is prime! Time : 42.509 sec.
1000065*2^220897-1 is prime! Time : 3059.852 sec.
39*2^113549+1 is prime! Time : 84.064 sec.
3*2^414840-1 is prime! Time : 210.427 sec.
9999*2^458051+1 is prime! Time : 1842.025 sec.
1000065*2^390927-1 is prime! Time : 11376.660 sec.

2011-02-01, 03:31   #79
mdettweiler
A Sunny Moo
Aug 2007
USA
2·47·67 Posts

Quote:
Originally Posted by msft View Post
Hi, mdettweiler

But too slow.
1000065*2^390927-1 is prime! Time : 11376.660 sec.
3*2^303093+1 is prime! Time : 514.715 sec.
3*2^164987-1 is prime! Time : 42.509 sec.
1000065*2^220897-1 is prime! Time : 3059.852 sec.
39*2^113549+1 is prime! Time : 84.064 sec.
3*2^414840-1 is prime! Time : 210.427 sec.
9999*2^458051+1 is prime! Time : 1842.025 sec.
1000065*2^390927-1 is prime! Time : 11376.660 sec.
Hmm, indeed. Interesting that the test timings get dramatically worse as k increases...any idea why that is?

BTW, how does it do on composites? As an example you may want to try it with this known residue (found and doublechecked with CPU LLR 3.8.4):
Code:
2303*2^251634-1 is not prime.  Res64: 99B680F2A92ECC27  Time : 991.0 sec.
2303*2^251634-1 is not prime.  LLR Res64: 99B680F2A92ECC27  Time : 96.922 sec.
(Note that the second of the above two results is more indicative of how long it takes to do this test on a typical CPU. The first test was done on one of NPLB's LLRnet servers with an old, rather slow machine, from the looks of it.)

Last fiddled with by mdettweiler on 2011-02-01 at 03:31

2011-02-01, 07:22   #80
Jean Penné
May 2004
FRANCE
1160₈ Posts
Remarks about llrpi timings

Hi,

Here are some details about how llrpi works, depending on the size of k:

- For k up to 22 bits (set by the MAXKBITS constant), IBDWT is used, so the FFT length is optimal and the modular reduction is free.

- For larger k, up to 36 bits on Win32 and 45 bits on Linux, a zero-padded FFT is used, and the modular reduction is done by the "modred()" function.

- For still larger k, or if the base is not two, generic modular reduction is used, which requires three multiplications instead of one.
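
The three cases above can be summarized in a small sketch. This is only an illustration of the rules as described in this post, not code from llrpi or llrCUDA: the function name reduction_mode, the platform strings and the returned labels are made up for the example. Only the 22/36/45-bit limits and the base-two condition come from the post.

```python
# Illustrative only: mirrors the three cases described in the post.
# The bit limits (22, 36, 45) and the base-2 condition come from the post;
# the function name, platform strings and labels are hypothetical.

MAXKBITS = 22                            # IBDWT limit, as named in the post
ZPAD_KBITS = {"win32": 36, "linux": 45}  # zero-padded FFT limits per platform


def reduction_mode(k: int, base: int = 2, platform: str = "linux") -> str:
    """Return which scheme the post says applies for this k and base."""
    kbits = k.bit_length()
    if base != 2:
        return "generic modular reduction"
    if kbits <= MAXKBITS:
        return "IBDWT"
    if kbits <= ZPAD_KBITS[platform]:
        return "zero-padded FFT + modred()"
    return "generic modular reduction"


# k values that appear in the timings of this thread
for k in (3, 39, 2303, 9999, 938237, 1000065):
    print(k, k.bit_length(), reduction_mode(k))
```

Every k timed in this thread is at most 20 bits long, so by the k-size rule alone they would all fall in the first case.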

I suggest you set -oDebug=1 to see the details about the time consumed in multiplications, normalization and modular reduction.

Congratulations on your nice work, and best regards,

Jean

2011-02-01, 12:56   #81
msft
Jul 2009
Tokyo
2·5·61 Posts

Hi, mdettweiler
We need something.

Hi, Jean Penné
Thank you for the information.

3*2^414840-1 is prime! Time : 209.799 sec.
3*2^303093+1 is prime! Time : 107.206 sec.
3*2^164987-1 is prime! Time : 42.285 sec.
39*2^113549+1 is prime! Time : 30.923 sec.
3*2^414840-1 is prime! Time : 209.799 sec.
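
To put these timings next to the ones from post #78, the ratios can be computed directly. This is a minimal sketch: the numbers are copied from the two posts, and the variable names are only for illustration.

```python
# Timings in seconds, copied from post #78 (before) and post #81 (after).
before = {
    "3*2^414840-1": 210.427,
    "3*2^303093+1": 514.715,
    "3*2^164987-1": 42.509,
    "39*2^113549+1": 84.064,
}
after = {
    "3*2^414840-1": 209.799,
    "3*2^303093+1": 107.206,
    "3*2^164987-1": 42.285,
    "39*2^113549+1": 30.923,
}

# Ratio > 1 means the newer run was faster.
speedup = {n: before[n] / after[n] for n in before}
for n, s in speedup.items():
    print(f"{n:>15}: {s:.2f}x")
```

The two Proth (+1) tests improve by roughly 4.8x and 2.7x, while the two Riesel (-1) tests are essentially unchanged.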
Attached Files
File Type: bz2 llrcuda.0.34.tar.bz2 (108.3 KB, 236 views)

2011-02-02, 01:16   #82
frmky
Jul 2003
So Cal
2663₁₀ Posts

Code:
Starting Lucas Lehmer Riesel prime test of 2303*2^251634-1
Using real irrational base DWT, FFT length = 65536
V1 = 4 ; Computing U0...done.
2303*2^251634-1 is not prime.  LLR Res64: 99B680F2A92ECC27  Time : 112.600 sec.
However, to get this I had to modify gwpnumi.cu, set_fftlen(), to remove the checks fftdwt > fftzpad and bpw < 5.0. The values of these parameters are fftdwt = 65536, fftzpad = 32768, and bpw = 4.83979, so in version 0.34, both these checks fail and zero-padded rational base DWT, which runs about 20x slower, is used.
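
As a rough illustration of the situation described above (this is not the actual code of set_fftlen() in gwpnumi.cu, which is not shown in this thread): suppose the zero-padded transform is chosen whenever either comparison is true. The way the two comparisons are combined and the function name below are assumptions; only the two comparisons and the three values come from the post.

```python
# Hypothetical sketch; NOT the real set_fftlen() from gwpnumi.cu.
# Assumption: either comparison being true forces the zero-padded transform.

def falls_back_to_zero_padded(fftdwt: int, fftzpad: int, bpw: float) -> bool:
    dwt_longer = fftdwt > fftzpad   # first check named in the post
    few_bits = bpw < 5.0            # second check named in the post
    return dwt_longer or few_bits


# Values reported for 2303*2^251634-1
print(falls_back_to_zero_padded(65536, 32768, 4.83979))  # True
```

With the reported values both comparisons are true, so the outcome is the same whether they are combined with "or" or with "and".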

Edit: A second run:
Code:
./llrCUDA -d -q1000065*2^390927-1
Starting Lucas Lehmer Riesel prime test of 1000065*2^390927-1
Using real irrational base DWT, FFT length = 131072
V1 = 5 ; Computing U0...done.
1000065*2^390927-1, iteration : 40000 / 390927 [10.23%].  Time per iteration : 0.627 ms
1000065*2^390927-1 is prime!  Time : 246.393 sec.
Of course Jean Penné can now tell me what I have broken by doing this.
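
The reported time per iteration and the total time of this second run can be cross-checked with simple arithmetic. The sketch below takes the iteration count from the progress line (40000 / 390927) and ignores setup work such as computing U0. The comparison with post #78 is only indicative, since the two runs may not have been on the same hardware.

```python
# Cross-check of the second run: per-iteration time vs. total time.
iterations = 390927          # from the progress line "40000 / 390927"
ms_per_iteration = 0.627     # reported time per iteration
reported_total_s = 246.393   # reported total time

estimated_total_s = iterations * ms_per_iteration / 1000.0
print(f"estimated {estimated_total_s:.1f} s, reported {reported_total_s} s")

# Ratio to the 11376.660 s reported for the same number in post #78
ratio = 11376.660 / reported_total_s
print(f"ratio to post #78 timing: {ratio:.1f}x")
```

The estimate comes out at about 245 s, within about a second of the reported total, and the ratio to the earlier timing is about 46x.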

Edit 2: And a Proth test:
Code:
./llrCUDA -d -q9999*2^458051+1   
Starting Proth prime test of 9999*2^458051+1
9999*2^458051+1 is prime!  Time : 313.916 sec.

Last fiddled with by frmky on 2011-02-02 at 02:05

2011-02-02, 04:08   #83
Ken_g6
Jan 2005
Caught in a sieve
110001011₂ Posts

I also notice that v0.34 is only about half as fast as v0.16 when testing 5*2^1282755+1. Any chance of reintroducing that code for certain values of K or something?

2011-02-02, 04:26   #84
msft
Jul 2009
Tokyo
1142₈ Posts

Quote:
Originally Posted by Ken_g6 View Post
I also notice that v0.34 is only about half as fast as v0.16 when testing 5*2^1282755+1. Any chance of reintroducing that code for certain values of K or something?
Yes, but not yet.
Right now I am tuning the Riesel prime test.
As Deng Xiaoping said, "Let some people get rich first."

2011-02-02, 05:18   #85
frmky
Jul 2003
So Cal
2,663 Posts

And a larger number where the GPU shines...

Code:
./llrCUDA -d -q938237*2^3752950-1
Starting Lucas Lehmer Riesel prime test of 938237*2^3752950-1
Using real irrational base DWT, FFT length = 1048576
V1 = 4 ; Computing U0...done.
938237*2^3752950-1 is prime!  Time : 10740.747 sec.
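
For readers wondering what the log lines "V1 = 4" and "Computing U0" refer to, here is a minimal sketch of the Lucas-Lehmer-Riesel test using plain Python integers. It is not how llrCUDA works internally (that uses FFT-based multiplication on the GPU), and it covers only the simple case where k is odd, not divisible by 3, and k < 2^n, for which the starting value V1 = 4 can be used. The function name is made up, and the linear loop for U0 is only practical for small k.

```python
def llr_is_prime(k: int, n: int) -> bool:
    """Lucas-Lehmer-Riesel test for N = k*2^n - 1 (simple case only)."""
    if k % 2 == 0 or k % 3 == 0 or k >= 2**n or n < 3:
        raise ValueError("sketch needs odd k, 3 not dividing k, k < 2^n, n >= 3")
    N = k * 2**n - 1
    if N % 3 == 0:
        return False  # divisible by 3, so composite (N >= 7 here)

    # U0 = V_k(4) mod N, with V_0 = 2, V_1 = 4, V_{i+1} = 4*V_i - V_{i-1}
    v_prev, v = 2, 4
    for _ in range(k - 1):
        v_prev, v = v, (4 * v - v_prev) % N
    u = v % N

    # n - 2 steps of u <- u^2 - 2 (mod N); N is prime iff the result is 0
    for _ in range(n - 2):
        u = (u * u - 2) % N
    return u == 0


print(llr_is_prime(5, 4))  # 5*2^4 - 1 = 79
print(llr_is_prime(7, 3))  # 7*2^3 - 1 = 55
```

The first call reports 79 as prime and the second reports 55 as composite. Numbers of the size tested in this thread would be far too slow with this naive version.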

2011-02-02, 06:53   #86
msft
Jul 2009
Tokyo
1001100010₂ Posts

Hi, frmky
Quote:
Originally Posted by frmky View Post
And a larger number where the GPU shines...
modred() and generic_modred() need parallel code.

2011-02-02, 07:37   #87
henryzz
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
3×23×89 Posts

Quote:
Originally Posted by frmky View Post
And a larger number where the GPU shines...

Code:
./llrCUDA -d -q938237*2^3752950-1
Starting Lucas Lehmer Riesel prime test of 938237*2^3752950-1
Using real irrational base DWT, FFT length = 1048576
V1 = 4 ; Computing U0...done.
938237*2^3752950-1 is prime!  Time : 10740.747 sec.
How long would that take on a pc?

2011-02-02, 11:24   #88
msft
Jul 2009
Tokyo
2×5×61 Posts

Quote:
Originally Posted by Ken_g6 View Post
I also notice that v0.34 is only about half as fast as v0.16 when testing 5*2^1282755+1. Any chance of reintroducing that code for certain values of K or something?
I recognize this problem.
llrpi380devsrc.zip changed the FFT length.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.