#56 |
Bemusing Prompter
"Danny"
Dec 2002
California
23·313 Posts |
I may be wrong, but I believe stage 2 can be easily parallelized. If that's true, then the GPU firepower will really come in handy.
Last fiddled with by ixfd64 on 2013-03-03 at 01:45 |
#57 |
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²×5×7 Posts |
I have now seen one case where an exponent, 59756923, passes the round-off test but shows no factor. Increasing the FFT length correctly finds the factor. Going to have to slow it down a little.
#58 |
Jul 2003
So Cal
2,663 Posts |
For that case, the max error is significantly larger than the average error. Looks like an average error <
Code:
[childers@physicstitan cudapm1]$ ./CUDA-Pm1 59756923, -b1 1100
Starting Stage 1 P-1, M59756923, B1 = 1100, fft length = 3072K
Doing 1637 iterations
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a longer FFT.
Iteration = 27 < 1000 && err = 0.50000 >= 0.35, increasing n from 3072K
Starting Stage 1 P-1, M59756923, B1 = 1100, fft length = 3200K
Doing 1637 iterations
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a longer FFT.
Iteration 100, average error = 0.08286, max error = 0.27734
Iteration 200, average error = 0.08826, max error = 0.29688
Iteration 300, average error = 0.09689, max error = 0.28125
Iteration 400, average error = 0.10635, max error = 0.30859
Iteration 500, average error = 0.11458, max error = 0.27344
Iteration 600, average error = 0.11786, max error = 0.28125
Iteration 700, average error = 0.11941, max error = 0.28125
Iteration 800, average error = 0.12019, max error = 0.28125
Iteration 900, average error = 0.12194, max error = 0.31250
Iteration 1000, average error = 0.12054 < 0.25 (max error = 0.31250), continuing test.
M59756923, 0xd040e885dd81e22d, offset = 0, n = 3200K, CUDA-P-1 v0.00
Stage 1 complete, estimated total time = 0:53
M59756923 has a factor: 1
Last fiddled with by frmky on 2013-03-03 at 05:26 |
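The two checks visible in the log above (a per-iteration limit of 0.35 and an average limit of 0.25) can be sketched roughly as follows. This is only an illustration in Python, not the CUDA-Pm1 source: the thresholds are read off the log messages, while the function name `careful_roundoff_test`, the `Verdict` type and the exact control flow are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable

# Thresholds as they appear in the log output. How CUDA-Pm1 applies them
# internally is an assumption of this sketch.
AVG_LIMIT = 0.25    # "If average error >= 0.25, the test will restart"
SPIKE_LIMIT = 0.35  # "err = 0.50000 >= 0.35, increasing n"


@dataclass
class Verdict:
    passed: bool
    iterations: int
    avg_err: float
    max_err: float
    reason: str


def careful_roundoff_test(errors: Iterable[float],
                          avg_limit: float = AVG_LIMIT,
                          spike_limit: float = SPIKE_LIMIT) -> Verdict:
    """Scan per-iteration round-off errors and decide whether to keep the FFT length."""
    total = 0.0
    max_err = 0.0
    count = 0
    for err in errors:
        count += 1
        total += err
        max_err = max(max_err, err)
        if err >= spike_limit:
            # A single bad iteration is enough to stop early.
            return Verdict(False, count, total / count, max_err,
                           "single-iteration error at or above spike limit")
    if count == 0:
        raise ValueError("no iterations were measured")
    avg = total / count
    if avg >= avg_limit:
        return Verdict(False, count, avg, max_err,
                       "average error at or above limit")
    return Verdict(True, count, avg, max_err, "continuing test")


if __name__ == "__main__":
    # Hypothetical error trace: 26 quiet iterations, then one spike of 0.5.
    print(careful_roundoff_test([0.10] * 26 + [0.50]))
```

Note that a test like this only rejects what it measures: a run can stay under both limits and still have a max error well above its average, which is the situation described in this post.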
#59 |
Jul 2003
So Cal
A67₁₆ Posts |
How is this factor found with B1=1000?
P-1 = 2^3 * 3^5 * 2551 * 60593041 * P9 * P23
Last fiddled with by frmky on 2013-03-03 at 05:07 |
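To make the question concrete: under the usual textbook condition, stage 1 is only expected to find a factor P when every prime power dividing P-1, apart from the Mersenne exponent itself, is at most B1. A minimal sketch of that check on the factorization above follows. It assumes that 60593041 is the exponent, and that P9 and P23 denote primes of 9 and 23 decimal digits whose values are not given in the thread. The helper name `stage1_blockers` is made up for the example.

```python
B1 = 1000

# Known prime-power part of P-1 as (prime, exponent) pairs, copied from the post.
KNOWN = [(2, 3), (3, 5), (2551, 1), (60593041, 1)]
# P9 and P23 are read here as primes with 9 and 23 decimal digits (an assumption
# about the notation). Their actual values do not appear in the thread.
UNKNOWN_PRIME_DIGITS = [9, 23]
# Assumed to be the Mersenne exponent, which P-1 on M(p) includes for free.
EXPONENT = 60593041


def stage1_blockers(known, unknown_digits, b1, exponent):
    """Return labels for the prime powers of P-1 that exceed B1."""
    blockers = []
    for prime, power in known:
        if prime == exponent:
            continue  # the exponent is always part of the stage 1 product
        if prime ** power > b1:
            blockers.append(f"{prime}^{power}")
    for digits in unknown_digits:
        # A prime with d decimal digits is at least 10**(d-1).
        if 10 ** (digits - 1) > b1:
            blockers.append(f"P{digits}")
    return blockers


if __name__ == "__main__":
    print(stage1_blockers(KNOWN, UNKNOWN_PRIME_DIGITS, B1, EXPONENT))
```

With B1 = 1000 the check lists 2551, P9 and P23 as too large, which is why a stage 1 hit on this factor looks surprising.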
#61 |
Jul 2003
So Cal
2,663 Posts |
Ah, makes sense. Thanks!
#62 |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Wowzers... I've not seen anything like that before. Perhaps a better solution would be maxerr/avgerr < 1.5 (or maybe 2)? |
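The ratio idea is easy to try against the numbers posted in this thread. The sketch below is only an illustration of the suggestion, not anything taken from CUDA-Pm1. The function names and the default limit of 1.5 are assumptions, and the sample values are the final average and max errors from the two logged runs.

```python
def error_ratio(avg_err: float, max_err: float) -> float:
    """Ratio of the worst round-off error to the average round-off error."""
    if avg_err <= 0:
        raise ValueError("average error must be positive")
    return max_err / avg_err


def passes_ratio_check(avg_err: float, max_err: float, limit: float = 1.5) -> bool:
    """True when maxerr/avgerr stays below the chosen limit."""
    return error_ratio(avg_err, max_err) < limit


if __name__ == "__main__":
    # M59756923 at 3200K: average 0.12054, max 0.31250
    print(passes_ratio_check(0.12054, 0.31250))
    # M60593041 at 3200K: average 0.28638, max 0.36553
    print(passes_ratio_check(0.28638, 0.36553))
```

On these numbers the M59756923 run has a ratio of about 2.59 and fails at either limit, while the M60593041 run has a ratio of about 1.28 and passes even though its average is above 0.25. That suggests a ratio test would work alongside the average check rather than replace it.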
#63 |
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²×5×7 Posts |
Ok, this should be better. 3%-4% slower, but gives correct results, even when the max error is allowed to go as high as 0.42. Also fixes the error reporting problem.
Last fiddled with by owftheevil on 2013-03-03 at 14:59 Reason: Forgot to upload the attachment |
#64 |
Banned
"Luigi"
Aug 2002
Team Italia
5·7·139 Posts |
1:
Code:
luigi@luigi-ubuntu:~/luigi/CUDA/cudapm1-0.00$ ./CUDA-Pm1 4170308402961950452420687314125107372845632692860124825390003761727514150572517983869509135472975278394865154210790597209778982578895669768763371749038447454396115727404741278971617695528084038894140322072199744865271524521758726031117787322230290427036555791315034863880063825719334586180093, -b1 1000
Can't open workfile worktodo.txt
2:
Code:
luigi@luigi-ubuntu:~/luigi/CUDA/cudapm1-0.00$ ./CUDA-Pm1 60593041, -b1 1000
Starting Stage 1 P-1, M60593041, B1 = 1000, fft length = 3200K
Doing 1475 iterations
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a longer FFT.
Iteration 100, average error = 0.22696, max error = 0.34664
^C SIGINT caught, writing checkpoint.Iteration 200, average error = 0.26131, max error = 0.34241
Iteration 300, average error = 0.27237, max error = 0.33556
^C^C SIGINT caught, writing checkpoint. SIGINT caught, writing checkpoint.Iteration 400, average error = 0.27678, max error = 0.36553
Iteration 500, average error = 0.28131, max error = 0.33734
Iteration 600, average error = 0.28373, max error = 0.33752
Iteration 700, average error = 0.28460, max error = 0.34566
Iteration 800, average error = 0.28564, max error = 0.35941
Iteration 900, average error = 0.28658, max error = 0.35676
Iteration 1000, average error = 0.28638 < 0.25 (max error = 0.36553), continuing test.
Estimated time spent so far: 0:39
Luigi
Last fiddled with by ET_ on 2013-03-03 at 19:06 Reason: Added code tags |
#65 |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
The second is deliberate (though I forget why). It should quit immediately after the roundoff test is finished (as it seems it did).
As for the first, it's probably a printf substitution -- search for "Can't open workfile %s" or, more safely, search for "Can't open" or "Can't open workfile". |
#66 |
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²×5×7 Posts |
You are right. The way I am seeing it now, stage two naturally splits into 3 tasks that can be separated into different streams. Not sure yet how much this will speed things up and how much the different streams will step on each other's toes.