mersenneforum.org  

Old 2013-05-07, 04:32   #221
Karl M Johnson
 
 
Mar 2010

3·137 Posts

It doesn't, so if you get an error, you have to start from the very beginning.

As for the error, I remember getting the same error (unknown error 30, which Oliver explained in an adjacent thread) with CUDALucas from time to time.
What's curious is that after that error the core clock refused to go higher than 525 MHz, while the memory clock remained the same.
It could be solved by a reboot, as the WDDM timeout is disabled.
So, to prevent that pesky error from happening, I had to downclock the GPU's memory.

Last fiddled with by Karl M Johnson on 2013-05-07 at 04:38
Old 2013-05-07, 11:06   #222
Aramis Wyler
 
 
"Bill Staffen"
Jan 2013
Pittsburgh, PA, USA

2³·53 Posts

Quote:
Originally Posted by Aramis Wyler
It didn't end well.
Well, good news. I ran the program again on default settings and it failed during the third ptime section again. But I looked at the error and thought that, since it came from the sync function, maybe the problem was related to CPU usage. I turned off Prime95 and ran the thing again, and sure enough it completed:
Code:
Iteration 871000 M61262347, 0x268993cb3b899d21, n = 3360K, CUDAPm1 v0.10 err = 0.19531 (0:06 real, 5.8528 ms/iter, ETA 0:12)
Iteration 872000 M61262347, 0xdf828a4cb19fc49d, n = 3360K, CUDAPm1 v0.10 err = 0.20313 (0:06 real, 5.8514 ms/iter, ETA 0:06)
Iteration 873000 M61262347, 0x92b46441f57f0dc1, n = 3360K, CUDAPm1 v0.10 err = 0.19531 (0:06 real, 5.8502 ms/iter, ETA 0:00)
M61262347, 0xfd7ab9d857ea4a36, offset = 0, n = 3360K, CUDAPm1 v0.10
Stage 1 complete, estimated total time = 1:25:43
Starting stage 1 gcd.
M61262347 Stage 1 found no factor (P-1, B1=605000, B2=16637500, e=6, n=3360K CUDAPm1 v0.10)
Starting stage 2.
Zeros: 748288, Ones: 847712, Pairs: 172477
itime: 34.363595, transforms: 1, average: 34363.595000
ptime: 945.240834, transforms: 322686, average: 2.929290
ETA: 1:21:38
itime: 42.020420, transforms: 1, average: 42020.420000
ptime: 948.034002, transforms: 322970, average: 2.935363
ETA: 1:05:39
itime: 45.361230, transforms: 1, average: 45361.230000
ptime: 942.894161, transforms: 321866, average: 2.929462
ETA: 49:17
itime: 46.910720, transforms: 1, average: 46910.720000
ptime: 942.954856, transforms: 322050, average: 2.927977
ETA: 32:53
itime: 48.828722, transforms: 1, average: 48828.722000
ptime: 942.975542, transforms: 322794, average: 2.921292
ETA: 16:27
itime: 49.518076, transforms: 1, average: 49518.076000
ptime: 941.506243, transforms: 322458, average: 2.919780
ETA: 0:00
Stage 2 complete, estimated total time = 1:38:50
Accumulated Product: M61262347, 0xb7550a14cb5172b6, n = 3360K, CUDAPm1 v0.10
Starting stage 2 gcd.
M61262347 Stage 2 found no factor (P-1, B1=605000, B2=16637500, e=6, n=3360K CUDAPm1 v0.10)
Though I don't know if it was supposed to find a factor. :)

Last fiddled with by Aramis Wyler on 2013-05-07 at 11:17
Old 2013-05-07, 11:23   #223
Aramis Wyler
 
 
"Bill Staffen"
Jan 2013
Pittsburgh, PA, USA

2³·53 Posts

I think there might be a problem somewhere with the calculations, because looking closer I see that after CUDAPm1 finished the default assignment, it went on and did a number that I had put in there from one of my old P-1 assignments:

Code:
Selected B1=530000, B2=12985000, 3.11% chance of finding a factor
CUDA reports 2777M of 3072M GPU memory free.
Using e=6, d=2310, nrp=80
Using approximately 2529M GPU memory.
Starting stage 1 P-1, M61394569, B1 = 530000, B2 = 12985000, e = 6, fft length = 3360K
Doing 764962 iterations
Iteration 1000 M61394569, 0x8888b22cb0159fe4, n = 3360K, CUDAPm1 v0.10 err = 0.20703 (0:09 real, 9.1390 ms/iter, ETA 1:56:21)
Iteration 2000 M61394569, 0x22ce4679c47bde53, n = 3360K, CUDAPm1 v0.10 err = 0.19531 (0:06 real, 5.8427 ms/iter, ETA 1:14:17)
Iteration 3000 M61394569, 0x4199d13a32c43ec1, n = 3360K, CUDAPm1 v0.10 err = 0.19531 (0:06 real, 5.8484 ms/iter, ETA 1:14:16)
...
Iteration 762000 M61394569, 0x72f2c43f0662fa7d, n = 3360K, CUDAPm1 v0.10 err = 0.22949 (0:06 real, 5.8454 ms/iter, ETA 0:17)
Iteration 763000 M61394569, 0x5d768a7b9cc19fc1, n = 3360K, CUDAPm1 v0.10 err = 0.19727 (0:06 real, 5.8017 ms/iter, ETA 0:11)
Iteration 764000 M61394569, 0xa9c8c0938a1354e6, n = 3360K, CUDAPm1 v0.10 err = 0.20313 (0:05 real, 5.8006 ms/iter, ETA 0:05)
M61394569, 0xe6ed39c645d90fd3, offset = 0, n = 3360K, CUDAPm1 v0.10
Stage 1 complete, estimated total time = 1:14:26
Starting stage 1 gcd.
M61394569 Stage 1 found no factor (P-1, B1=530000, B2=12985000, e=6, n=3360K CUDAPm1 v0.10)
Starting stage 2.
Zeros: 576475, Ones: 669125, Pairs: 135475
itime: 34.168611, transforms: 1, average: 34168.611000
ptime: 742.552935, transforms: 254220, average: 2.920907
ETA: 1:04:43
itime: 41.946698, transforms: 1, average: 41946.698000
ptime: 743.830499, transforms: 254674, average: 2.920716
ETA: 52:04
itime: 45.455219, transforms: 1, average: 45455.219000
ptime: 740.867106, transforms: 253650, average: 2.920824
ETA: 39:08
itime: 46.824025, transforms: 1, average: 46824.025000
ptime: 741.681265, transforms: 253924, average: 2.920879
ETA: 26:08
itime: 48.740888, transforms: 1, average: 48740.888000
ptime: 743.663183, transforms: 254586, average: 2.921069
ETA: 13:05
itime: 49.611376, transforms: 1, average: 49611.376000
ptime: 742.008431, transforms: 254036, average: 2.920879
ETA: 0:00
Stage 2 complete, estimated total time = 1:18:41
Accumulated Product: M61394569, 0xc7cca920aa444fbe, n = 3360K, CUDAPm1 v0.10
Starting stage 2 gcd.
M61394569 Stage 2 found no factor (P-1, B1=530000, B2=12985000, e=6, n=3360K CUDAPm1 v0.10)
The problem there is that when I ran that exponent with Prime95, it did find a factor:

[Tue Apr 30 13:09:28 2013]
P-1 found a factor in stage #2, B1=580000, B2=12035000, E=12.
UID: staffen/Romeo, M61394569 has a factor: 189843460261039170580823, AID: cc392de5c69eef9aeaf12ea5c839f9e7

Now, I see that in the Prime95 run e was 12 (vs 6 for CUDAPm1), but B2 was actually smaller than with CUDAPm1 (12035000 vs 12985000), although B1 was larger (580000 vs 530000).
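
Whether the CUDAPm1 bounds could have been expected to find this factor can be checked from the reported numbers. The sketch below is only an illustration, not anything from CUDAPm1 or Prime95: it confirms that q divides 2^p - 1, writes q - 1 = 2·k·p, factors k by trial division, and applies the textbook two-stage P-1 criterion (every prime power of k at most B1, except that one copy of the largest prime may lie between B1 and B2). The function names are made up for this example, and the criterion deliberately ignores the Brent-Suyama extension (the "e" value in the logs), which can occasionally find factors outside the nominal bounds.

```python
from collections import Counter


def divides_mersenne(p, q):
    """Return True if q divides the Mersenne number 2**p - 1."""
    return pow(2, p, q) == 1


def trial_factor(n, limit):
    """Trial-divide n by candidates up to limit.

    Returns (prime_factors, cofactor); cofactor is 1 when n was fully factored.
    """
    factors = []
    d = 2
    while d <= limit and d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2
    if 1 < n <= limit:
        # The loop ended because d*d > n, so what is left is a prime <= limit.
        factors.append(n)
        n = 1
    return factors, n


def pm1_bounds_suffice(k_factors, b1, b2):
    """Textbook check (assumption: no Brent-Suyama extension).

    Stage 1 covers prime powers <= b1; stage 2 covers one prime in (b1, b2].
    """
    counts = Counter(k_factors)
    largest = max(counts, default=1)
    if b1 < largest <= b2:
        counts[largest] -= 1  # stage 2 supplies one copy of the largest prime
    return all(r ** e <= b1 for r, e in counts.items() if e > 0)


def report(p, q, runs):
    """Print whether each (name, B1, B2) run could be expected to find q."""
    if not divides_mersenne(p, q):
        print("q does not divide 2**p - 1")
        return
    k, rem = divmod(q - 1, 2 * p)
    if rem != 0:
        print("q - 1 is not a multiple of 2p; cannot apply the check")
        return
    factors, cofactor = trial_factor(k, max(b2 for _, _, b2 in runs))
    print("k =", k, "prime factors:", factors, "unfactored part:", cofactor)
    for name, b1, b2 in runs:
        ok = cofactor == 1 and pm1_bounds_suffice(factors, b1, b2)
        print(f"{name}: B1={b1}, B2={b2}, bounds suffice: {ok}")


if __name__ == "__main__":
    # Exponent, factor, and bounds exactly as quoted in the posts above.
    report(61394569, 189843460261039170580823,
           [("CUDAPm1", 530000, 12985000), ("Prime95", 580000, 12035000)])
```

If the check says the CUDAPm1 bounds suffice, a "no factor" result points at a calculation error; if it says they do not, the miss may simply be explained by the smaller B1.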

Last fiddled with by Aramis Wyler on 2013-05-07 at 11:23
Old 2013-05-07, 12:14   #224
owftheevil
 
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²·5·7 Posts

The first one should have found a factor. I'm testing the second exponent to check if we get the same residues. If so, there's definitely something wrong in the calculations.

Edit: On the first three iterations, the residues match but the errors are different.

Last fiddled with by owftheevil on 2013-05-07 at 12:17
Old 2013-05-07, 13:51   #225
Stef42
 
Feb 2012
the Netherlands

58₁₀ Posts

Code:
Processing 457 - 480 of 480 relative primes
itime: 18.458630, transforms: 6906, average: 2.672840
ptime: 148.896680, transforms: 52262, average: 2.849043
ETA: 0:00
Stage 2 complete, estimated total time = 55:19
Accumulated Product: M61394569, 0xe849edfe1bbc661b, n = 3360K, CUDAPm1 v0.10
Starting stage 2 gcd.
M61394569 Stage 2 found no factor (P-1, B1=530000, B2=6890000, e=6, n=3360K CUDAPm1 v0.10)
Ran it myself, no factor found either.
Old 2013-05-07, 13:59   #226
owftheevil
 
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²×5×7 Posts

Stef42, if you still have the full output of that run, could you PM it to me?
Old 2013-05-07, 14:01   #227
Stef42
 
Feb 2012
the Netherlands

58₁₀ Posts

I would love to, but I have already closed the command prompt.
Even so, it only showed the last part of stage 1. Could you suggest a logging function or tool?
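
One generic way to keep the full output is to launch the program through a small wrapper that copies every line to a file while still showing it on screen. The following is a minimal sketch using only the Python standard library; it is not a feature of CUDAPm1, and the executable name and log file name in the usage note are assumptions to be replaced with the real ones.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, echo its output to the console, and append it to log_path.

    cmd is a list such as ["program", "arg"]; stderr is merged into stdout so
    error messages end up in the log as well. Returns the exit code.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,  # line-buffered, so lines appear as they are produced
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
            log.flush()  # keep the file current even if the run is killed
        return proc.wait()
```

Usage would look like `run_and_log(["CUDAPm1.exe"], "cudapm1_log.txt")`, where both names are placeholders. Because the log is flushed after every line, the residues survive even if the window is closed or the run crashes.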

Last fiddled with by Stef42 on 2013-05-07 at 14:01
Old 2013-05-07, 14:11   #228
owftheevil
 
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²·5·7 Posts

Never mind then. What I really wanted to do was compare your residues with mine. Any part towards the end of stage 1 would have been sufficient. Aramis Wyler's and mine disagree at iteration 45000.
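
A comparison like this can be automated when both runs are saved as text. The sketch below assumes the progress lines have the format shown in the logs earlier in this thread ("Iteration N M<exponent>, 0x<residue>, ..."); the function names are invented for illustration, and it does not check that both logs belong to the same exponent.

```python
import re

# Matches e.g. "Iteration 1000 M61394569, 0x8888b22cb0159fe4, n = 3360K, ..."
LINE_RE = re.compile(r"Iteration (\d+) M(\d+), (0x[0-9a-fA-F]+),")


def parse_residues(text):
    """Map iteration number -> residue string for each progress line in text."""
    return {int(m.group(1)): m.group(3).lower() for m in LINE_RE.finditer(text)}


def first_mismatch(log_a, log_b):
    """Return the lowest iteration present in both logs whose residues differ.

    Returns None when every shared iteration agrees (or none is shared).
    """
    a = parse_residues(log_a)
    b = parse_residues(log_b)
    for iteration in sorted(a.keys() & b.keys()):
        if a[iteration] != b[iteration]:
            return iteration
    return None
```

Calling `first_mismatch(open("run_a.txt").read(), open("run_b.txt").read())` with two saved logs (file names are placeholders) gives the first checkpoint at which the runs diverge. Only iterations printed in both logs are compared, so the true divergence may lie anywhere after the last matching checkpoint.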
Old 2013-05-07, 14:24   #229
Stef42
 
Feb 2012
the Netherlands

2·29 Posts

Quote:
Originally Posted by owftheevil
Never mind then. What I really wanted to do was compare your residues with mine. Any part towards the end of stage 1 would have been sufficient. Aramis Wyler's and mine disagree at iteration 45000.
I've run this exponent up to iteration 50,000 for you.
https://dl.dropboxusercontent.com/u/...m1%2050000.txt
Old 2013-05-07, 14:26   #230
firejuggler
 
 
Apr 2010
Over the rainbow

2×1,303 Posts

Hmm, don't bother. (I was going to report the first few iterations, which seem useless now.)

Last fiddled with by firejuggler on 2013-05-07 at 14:27
Old 2013-05-07, 14:29   #231
owftheevil
 
 
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²×5×7 Posts

As to the cudaDeviceSynchronize errors people are seeing, I'm almost convinced it is an Nvidia driver bug. On Linux, I'm getting something similar, only it's a timeout error (error 6) instead of an unidentified error. Here's what I know about it.

1. It occurs only with Nvidia drivers with version number >= 300.

2. It occurs only if the card CPm1 is running on is also driving the display.

3. cufftbench sees similar errors which I have traced back to a cufftPlan1d call being unable to allocate resources.

It's as if the driver, going about its business managing the display, is interfering with cufft or some other kernel in CPm1. I need more testing on my card that is not driving the display, and I am also going to make a test version that does away with all the error checking and host synchronizing to see which call is actually failing.

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.