mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing

Reply
 
Thread Tools
Old 2013-02-27, 15:18   #23
R.D. Silverman
 
R.D. Silverman's Avatar
 
"Bob Silverman"
Nov 2003
North of Boston

22×1,877 Posts
Default

Quote:
Originally Posted by ET_ View Post
I've been thinking about a CUDA program for P-1 factoring for quite a bit, and think that many other Mersennaries had.

First of all, note that I have only a limited knowledge about the math involved, but I'm willing to expand this limitation studying under the guide of more informed people, and eventually start coding something with the ir help.
Read:
P. Montgomery & R.D. Silverman
An FFT Extension to the P-1 Factoring Algorithm
Math. Comp.

It is available on the net. It should tell you everything you need to know.

Quote:

I'd like to gather ideas about how such a program should be designed. Some questions will be trivial, some other maybe deeper, but all of them will be enclosed in this thread.

Some naif subjects to talk about:

- Parallelization of tasks
- Limitations due to the memory factor of the GPU (how far may we go having 0.5, 1, 2,3 or 6GB of memory?
- Description of steps 1 and 2 (from MersenneWiki I got a grasp of it, but a talk would explain more).
It discusses all of this.

Quote:
- Is a parallel Montgomery multiplication algorithm out of question for such algorithm?
It would be OK for step 1. It is not relevant to step 2.
R.D. Silverman is offline   Reply With Quote
Old 2013-02-27, 15:32   #24
ET_
Banned
 
ET_'s Avatar
 
"Luigi"
Aug 2002
Team Italia

3×1,619 Posts
Default

Quote:
Originally Posted by R.D. Silverman View Post
Read:
P. Montgomery & R.D. Silverman
An FFT Extension to the P-1 Factoring Algorithm
Math. Comp.

It is available on the net. It should tell you everything you need to know.



It discusses all of this.



It would be OK for step 1. It is not relevant to step 2.
Thank you very much for the information!

I'm going to study that paper and eventually will come back for more questions.

Luigi
ET_ is offline   Reply With Quote
Old 2013-02-28, 13:18   #25
owftheevil
 
owftheevil's Avatar
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

1001110112 Posts
Default

Output from proto-CUDA-P-1:

Code:
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDA-P-1 1594559

Starting Stage 1 P-1, M1594559, B1 = 1, fft length = 96K
Stage 1 complete, estimated total time = 0:00
3189119 is a factor of M1594559
Of course with B1 = 1 its easy.
owftheevil is offline   Reply With Quote
Old 2013-02-28, 14:58   #26
ET_
Banned
 
ET_'s Avatar
 
"Luigi"
Aug 2002
Team Italia

3·1,619 Posts
Default

Quote:
Originally Posted by owftheevil View Post
Output from proto-CUDA-P-1:

Code:
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDA-P-1 1594559

Starting Stage 1 P-1, M1594559, B1 = 1, fft length = 96K
Stage 1 complete, estimated total time = 0:00
3189119 is a factor of M1594559
Of course with B1 = 1 its easy.
It found a factor with B1=1 in "no time"!
Obviously it had k=1, but what the hell, it works!!!

Luigi

Last fiddled with by ET_ on 2013-02-28 at 14:59 Reason: Added the "no time" joke
ET_ is offline   Reply With Quote
Old 2013-02-28, 15:29   #27
owftheevil
 
owftheevil's Avatar
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

1001110112 Posts
Default

Well, it only has to do 21 squares and a few multiply by 3s with a 96k fft.
The output time is calibrated in seconds, so less than 1 second won't show up.

Last fiddled with by owftheevil on 2013-02-28 at 15:35 Reason: Confusion and inability to spell correctly
owftheevil is offline   Reply With Quote
Old 2013-02-28, 15:31   #28
owftheevil
 
owftheevil's Avatar
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

32·5·7 Posts
Default

Quote:
Originally Posted by henryzz View Post
Run the following code through pfgw:
Code:
SCRIPT

DIM expo, 31
DIM base, 3
DIM max_n, 100
DIM min_n, 1
OPENFILEAPP r_file,results.txt



DIM n, min_n
DIM result, 0
DIM mers, 2^expo-1
DIMS rstr



LABEL next_n
POWMOD result,base,n,mers
WRITE r_file,result
SET n, n+1
IF (n<=max_n) THEN GOTO next_n

LABEL end

Thanks for the script.
owftheevil is offline   Reply With Quote
Old 2013-03-02, 04:20   #29
owftheevil
 
owftheevil's Avatar
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

31510 Posts
Default

Stage 1 seems to be working ok.

Code:
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDALucas 60981299 -f 3360k

Starting Stage 1 P-1, M60981299, B1 = 580000, fft length = 3360K
Stage 1 complete, estimated total time = 1:34:14
7124551590389340568394253966215081 is a factor of M60981299
To put the elapsed time in context, this is on a 570 that completes a 28xxxxxx double check in 21:00--21:30 hours.
owftheevil is offline   Reply With Quote
Old 2013-03-02, 13:00   #30
ET_
Banned
 
ET_'s Avatar
 
"Luigi"
Aug 2002
Team Italia

113718 Posts
Default

Quote:
Originally Posted by owftheevil View Post
Stage 1 seems to be working ok.

Code:
filbert@filbert:~/Build/cudapm1-0.00$ ./CUDALucas 60981299 -f 3360k

Starting Stage 1 P-1, M60981299, B1 = 580000, fft length = 3360K
Stage 1 complete, estimated total time = 1:34:14
7124551590389340568394253966215081 is a factor of M60981299
To put the elapsed time in context, this is on a 570 that completes a 28xxxxxx double check in 21:00--21:30 hours.
KUDOS!

Should you need a tester with a GTX580, I'm here to help.

I still can't understand the "estimated total time" (1:34:14) thingie. Do you mean thet the code completes stage 1 at B1=580,000 in 1 hour and 34 minutes the DC assignment in 21 hours with CuLu?

Which version of CuFFT are you using?


Luigi

Last fiddled with by ET_ on 2013-03-02 at 13:03 Reason: Thinking serially is a bad thing...
ET_ is offline   Reply With Quote
Old 2013-03-02, 13:11   #31
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

2B4D16 Posts
Default

Quote:
Originally Posted by ET_ View Post
KUDOS!
+1!!!

Quote:
Originally Posted by ET_ View Post
Should you need a tester with a GTX580, I'm here to help.
Ditto. I only have a 560, but it is the 2GB version, so when you need some Stage 2 testing....
chalsall is offline   Reply With Quote
Old 2013-03-02, 13:13   #32
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Liverpool (GMT/BST)

37×163 Posts
Default

Quote:
Originally Posted by ET_ View Post
KUDOS!

Should you need a tester with a GTX580, I'm here to help.

I still can't understand the "estimated total time" (1:34:14) thingie. Do you mean thet the code completes stage 1 at B1=580,000 in 1 hour and 34 minutes the DC assignment in 21 hours with CuLu?

Which version of CuFFT are you using?


Luigi
Stage 1 of a ~61M exponent in 1:34:14 and a DC of a ~28M exponent.


I am intrigued how fast this would run small numbers upto a much higher B1. How much does the exponent affect the runtime? Could you try a sub 1000-bit exponent? Maybe a range of different size exponents would be appropriate.
henryzz is offline   Reply With Quote
Old 2013-03-02, 13:35   #33
firejuggler
 
firejuggler's Avatar
 
"Vincent"
Apr 2010
Over the rainbow

22×7×103 Posts
Default

1H and 34 min? that's 5 to 7 time faster than a ordinary CPU. nice.

Last fiddled with by firejuggler on 2013-03-02 at 13:39
firejuggler is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
mfaktc: a CUDA program for Mersenne prefactoring TheJudger GPU Computing 3622 2023-01-25 16:41
World's second-dumbest CUDA program fivemack Programming 112 2015-02-12 22:51
World's dumbest CUDA program? xilman Programming 1 2009-11-16 10:26
Factoring program need help Citrix Lone Mersenne Hunters 8 2005-09-16 02:31
Factoring program ET_ Programming 3 2003-11-25 02:57

All times are UTC. The time now is 09:14.


Sat Feb 4 09:14:49 UTC 2023 up 170 days, 6:43, 1 user, load averages: 0.77, 1.02, 0.97

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔