mersenneforum.org  

Old 2013-03-03, 01:43   #56
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

23·313 Posts
Default

I may be wrong, but I believe stage 2 can be easily parallelized. If that's true, then the GPU firepower will really come in handy.

Last fiddled with by ixfd64 on 2013-03-03 at 01:45
Old 2013-03-03, 03:59   #57
owftheevil
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²×5×7 Posts
Default

I have now seen one case where an exponent, 59756923, passes the round-off test but shows no factor. Increasing the FFT length correctly finds the factor. I'm going to have to slow it down a little.
Old 2013-03-03, 04:55   #58
frmky
 
Jul 2003
So Cal

2,663 Posts
Default

For that case, the max error is significantly larger than the average error. Looks like an average error < 0.1 might be an appropriate check with some safety margin?

Code:
[childers@physicstitan cudapm1]$ ./CUDA-Pm1 59756923, -b1 1100

Starting Stage 1 P-1, M59756923, B1 = 1100, fft length = 3072K
Doing 1637 iterations
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a longer FFT.
Iteration = 27 < 1000 && err = 0.50000 >= 0.35, increasing n from 3072K
Starting Stage 1 P-1, M59756923, B1 = 1100, fft length = 3200K
Doing 1637 iterations
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a longer FFT.
Iteration  100, average error = 0.08286, max error = 0.27734
Iteration  200, average error = 0.08826, max error = 0.29688
Iteration  300, average error = 0.09689, max error = 0.28125
Iteration  400, average error = 0.10635, max error = 0.30859
Iteration  500, average error = 0.11458, max error = 0.27344
Iteration  600, average error = 0.11786, max error = 0.28125
Iteration  700, average error = 0.11941, max error = 0.28125
Iteration  800, average error = 0.12019, max error = 0.28125
Iteration  900, average error = 0.12194, max error = 0.31250
Iteration 1000, average error = 0.12054 < 0.25 (max error = 0.31250), continuing test.
M59756923, 0xd040e885dd81e22d, offset = 0, n = 3200K, CUDA-P-1 v0.00
Stage 1 complete, estimated total time = 0:53
M59756923 has a factor: 1

Last fiddled with by frmky on 2013-03-03 at 05:26
Old 2013-03-03, 05:07   #59
frmky
 
Jul 2003
So Cal

A67₁₆ Posts
Default

Quote:
Originally Posted by owftheevil View Post
./CUDA-pm1 60593041, -b1 1000, [-f 3360k]
How is this factor found with B1=1000?
P-1 = 2^3 * 3^5 * 2551 * 60593041 * P9 * P23

Last fiddled with by frmky on 2013-03-03 at 05:07
Old 2013-03-03, 07:32   #60
Batalov
 
"Serge"
Mar 2008
San Diego, Calif.

3²×7×163 Posts
Default

It's composite, so that's ok.
Code:
2105528336291622770155712978260232660484461209 = 
969488657 * 19874517449 * 2208979902697 * 49468643416729
each of which passes with B1=41 (!).
There are more factors known for this Mp
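
For anyone who wants to check this by hand, here is a minimal Python sketch (not part of CUDA-Pm1; the helper names are made up for illustration). It assumes q = 60593041 is the exponent in question, divides it out of p-1 for each of the four primes above, and factors what is left by trial division.

```python
def prime_factors(n):
    """Factor n by trial division; returns {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


Q = 60593041  # Mersenne exponent discussed in the thread
FACTORS = [969488657, 19874517449, 2208979902697, 49468643416729]


def cofactor_factorization(p, q=Q):
    """Factor (p - 1) / q; q must divide p - 1 for a factor p of M_q."""
    assert (p - 1) % q == 0, "q does not divide p - 1"
    return prime_factors((p - 1) // q)


for p in FACTORS:
    f = cofactor_factorization(p)
    print(p, f, "largest prime:", max(f))
```

The largest primes come out as 2, 41, 31 and 29, so none of the cofactors has a prime factor above 41. Note that the third cofactor contains 7^2; whether a given B1 covers a prime power like that depends on how the stage 1 exponent is built, which this sketch does not model.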
Old 2013-03-03, 07:34   #61
frmky
 
Jul 2003
So Cal

2,663 Posts
Default

Ah, makes sense. Thanks!
Old 2013-03-03, 10:51   #62
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

Quote:
Originally Posted by owftheevil View Post
I'm not sure, but I think Dubslow is responsible for the roundoff test part. It's hard to tell who did what on CuLu.

Edit: I didn't see the PS. I have so far been too lazy to make a different message for no factor found. I was thinking of just adding "but you already knew that, didn't you."
I added a bit, but msft had the gist of it.

Quote:
Originally Posted by henryzz View Post
I can see that cpus are going to become obsolete for P-1 stage 1 soon. This should help kill the P-1 deficit.
I wouldn't quite go that far yet, especially w.r.t. stage 2.

Quote:
Originally Posted by frmky View Post
For that case, the max error is significantly larger than the average error. Looks like an average error < 0.1 might be an appropriate check with some safety margin?
Wowzers... I've not seen anything like that before. Perhaps a better solution would be maxerr/avgerr < 1.5 (or maybe 2)?
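
To compare the checks suggested so far on frmky's numbers, here is a small Python sketch. The function names are hypothetical and the thresholds are only the ones proposed in this thread, not what CUDALucas actually ships.

```python
def passes_average_check(avg_err, limit):
    """True if the average round-off error is below the given limit."""
    return avg_err < limit


def passes_ratio_check(avg_err, max_err, max_ratio):
    """True if max error / average error is below the given ratio."""
    return max_err / avg_err < max_ratio


# Iteration 1000 of frmky's 3200K run on M59756923 (the run that reported "factor: 1").
avg_err, max_err = 0.12054, 0.31250

print("avg < 0.25:", passes_average_check(avg_err, 0.25))          # check used in that run
print("avg < 0.10:", passes_average_check(avg_err, 0.10))          # frmky's suggestion
print("max/avg < 2.0:", passes_ratio_check(avg_err, max_err, 2.0))  # ratio suggestion
print("max/avg < 1.5:", passes_ratio_check(avg_err, max_err, 1.5))
```

On these numbers only the 0.25 average check passes; the tighter average limit and both ratio limits would have rejected the 3200K FFT length for that run. One data point is not enough to pick a threshold, of course.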
Old 2013-03-03, 14:58   #63
owftheevil
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²×5×7 Posts
Default

Ok, this should be better. 3%-4% slower, but gives correct results, even when the max error is allowed to go as high as 0.42. Also fixes the error reporting problem.
Attached Files
File Type: zip CUDALucas.cu.zip (15.5 KB, 227 views)

Last fiddled with by owftheevil on 2013-03-03 at 14:59 Reason: Forgot to upload the attachment
Old 2013-03-03, 18:59   #64
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

5·7·139 Posts
Default

Quote:
Originally Posted by owftheevil View Post
Ok, this should be better. 3%-4% slower, but gives correct results, even when the max error is allowed to go as high as 0.42. Also fixes the error reporting problem.
With this update I got a couple of strange behaviors:

1:
Code:
luigi@luigi-ubuntu:~/luigi/CUDA/cudapm1-0.00$ ./CUDA-Pm1 4170308402961950452420687314125107372845632692860124825390003761727514150572517983869509135472975278394865154210790597209778982578895669768763371749038447454396115727404741278971617695528084038894140322072199744865271524521758726031117787322230290427036555791315034863880063825719334586180093, -b1 1000

Can't open workfile worktodo.txt
I suppose the P-1 program only works with Mersenne exponents, which is not a bad thing, but searching the source code for the string "Can't open workfile worktodo.txt" returned nothing.


2:
Code:
luigi@luigi-ubuntu:~/luigi/CUDA/cudapm1-0.00$ ./CUDA-Pm1 60593041, -b1 1000

Starting Stage 1 P-1, M60593041, B1 = 1000, fft length = 3200K
Doing 1475 iterations
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a longer FFT.
Iteration  100, average error = 0.22696, max error = 0.34664
^C	SIGINT caught, writing checkpoint.Iteration  200, average error = 0.26131, max error = 0.34241
Iteration  300, average error = 0.27237, max error = 0.33556
^C^C	SIGINT caught, writing checkpoint.	SIGINT caught, writing checkpoint.Iteration  400, average error = 0.27678, max error = 0.36553
Iteration  500, average error = 0.28131, max error = 0.33734
Iteration  600, average error = 0.28373, max error = 0.33752
Iteration  700, average error = 0.28460, max error = 0.34566
Iteration  800, average error = 0.28564, max error = 0.35941
Iteration  900, average error = 0.28658, max error = 0.35676
Iteration 1000, average error = 0.28638 < 0.25 (max error = 0.36553), continuing test.
 Estimated time spent so far: 0:39
That is, Ctrl-C is trapped during the error-checking routine but is not passed on to the program.

Luigi

Last fiddled with by ET_ on 2013-03-03 at 19:06 Reason: Added code tags
Old 2013-03-04, 00:22   #65
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
Default

The second is deliberate (though I forget why). It should quit immediately after the roundoff test is finished (as it seems it did).
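
Roughly, the pattern looks like the Python sketch below. This is only an illustration of deferring Ctrl-C until a fixed-length phase is done, not the actual CUDALucas code; the class and method names are invented.

```python
import signal


class RoundoffPhase:
    """Defers Ctrl-C until a fixed-length test phase has finished."""

    def __init__(self):
        self.quit_requested = False

    def on_sigint(self, signum, frame):
        # Only record the request; the loop decides when to act on it.
        self.quit_requested = True
        print("SIGINT caught, writing checkpoint.")

    def run(self, iterations, step):
        """Run step(i) for every iteration, then report whether to quit."""
        for i in range(1, iterations + 1):
            step(i)  # an interrupt arriving here does not stop the loop
        return self.quit_requested


if __name__ == "__main__":
    phase = RoundoffPhase()
    signal.signal(signal.SIGINT, phase.on_sigint)
    if phase.run(1000, lambda i: None):
        print("Round-off test finished; exiting as requested.")
```

The handler only sets a flag, so the message is printed when Ctrl-C arrives, but the loop keeps going to the last iteration and the caller quits afterwards.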

As for the first, it's probably a printf substitution -- search for "Can't open workfile %s" or, more safely, search for "Can't open" or "Can't open workfile".
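
A quick Python illustration of why the full message does not match. The source line here is hypothetical (I have not copied it from the actual file); the point is only that the file name is filled in at run time.

```python
# Hypothetical source line; the real text in the source may differ.
source = 'fprintf (stderr, "Can\'t open workfile %s\\n", name);'

# What the user sees on screen after the %s has been substituted.
printed = "Can't open workfile worktodo.txt"

print(printed in source)                # False
print("Can't open workfile" in source)  # True
print("Can't open" in source)           # True
```

Searching for the fixed part of the message finds the line; searching for the fully expanded message cannot.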
Old 2013-03-06, 00:09   #66
owftheevil
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²×5×7 Posts
Default

Quote:
Originally Posted by ixfd64 View Post
I may be wrong, but I believe stage 2 can be easily parallelized. If that's true, then the GPU firepower will really come in handy.
You are right. The way I see it now, stage 2 naturally splits into 3 tasks that can be separated into different streams. I'm not sure yet how much this will speed things up and how much the different streams will step on each other's toes.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
