mersenneforum.org  

Old 2018-11-19, 22:05   #650
Stef42
 
Feb 2012
the Netherlands

2×29 Posts

I have some issues getting Stage 2 going with the 0.22 version. It starts filling the GPU memory all the way to 9200 MB, then just quits (the CMD window closes). I'm using a GTX 1080 Ti with 11 GB of memory (Windows 10 Home x64, driver 411.70).

CMD output:
Quote:
No GeForceGTX1080Ti_fft.txt file found. Using default fft lengths.
For optimal fft selection, please run
./CUDAPm1 -cufftbench 1 8192 r
for some small r, 0 < r < 6 e.g.
CUDA reports 9312M of 11264M GPU memory free.
Using threads: norm1 512, mult 256, norm2 512.
No stage 2 checkpoint.
Using up to 9200M GPU memory.
Selected B1=905000, B2=19683750, 3.49% chance of finding a factor
Using B1 = 905000 from savefile.
Continuing stage 2 from a partial result of M89326001 fft length = 5120K
Starting stage 2.
Using b1 = 905000, b2 = 19683750, d = 840, e = 12, nrp = 192
Old 2018-11-20, 05:25   #651
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

11·311 Posts

Quote:
Originally Posted by Stef42 View Post
...then just quits (CMD window closes)
If you're running it by double-clicking the exe, then any message it gives when it terminates is unfortunately lost. If you open a command prompt first and then run the program, any final error message would remain visible.
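A minimal sketch of one way to keep whatever the program prints, together with its exit code, assuming Python is installed on the machine. The helper name `run_and_log` and the log file name are hypothetical, and the invocation shown in the comment is only illustrative. A program that halts without printing anything will still leave an empty log, but the exit code may be a useful clue.

```python
import subprocess


def run_and_log(cmd, log_path):
    """Run cmd, save its combined stdout/stderr to log_path, return (exit_code, text)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error output into the same stream
        text=True,
    )
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(result.stdout)
        log.write(f"\n[exit code {result.returncode}]\n")
    return result.returncode, result.stdout


# Hypothetical usage (executable name and location are assumptions):
#   code, text = run_and_log(["./CUDAPm1"], "cudapm1_run.log")
```

The output is only written once the program has ended, so this is for post-mortem inspection rather than live monitoring.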
Old 2018-11-20, 07:33   #652
Stef42
 
Feb 2012
the Netherlands

2·29 Posts

Quote:
Originally Posted by James Heinrich View Post
If you're running it by double-clicking the exe, then any message it gives when it terminates is unfortunately lost. If you open a command prompt first and then run the program, any final error message would remain visible.
Tried that, no message whatsoever. It just terminates.
Old 2018-11-20, 08:16   #653
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by Stef42 View Post
Tried that, no message whatsoever. It just terminates.
That's not unusual for CUDAPm1 v0.20, even with console redirection to a file. As I recall, the original author, owftheevil, posted about certain error cases terminating with no message. From my notes: post 373 (2013-09-23), which has the win64 CUDA 5.5 version attached and discusses the fftbench parameters and threadbench:
"excessive stage 2 round-off errors simply halt the program without error messages."
"there could be some inefficient fft lengths that I haven't looked at yet, which will cause a test to terminate with an excessive round-off error."
https://www.mersenneforum.org/showpo...&postcount=373
The memory filling to 9.2 GB on a mere 90M exponent is news.
On a Quadro 2000 with V0.20, I had issues completing exponents at 85M on one unit but not on another, and also at 171M.

Last fiddled with by kriesel on 2018-11-20 at 08:26
Old 2018-11-20, 17:38   #654
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by Stef42 View Post
I have some issues getting Stage 2 going with the 0.22 version. It starts filling the GPU memory all the way to 9200 MB, then just quits (the CMD window closes). I'm using a GTX 1080 Ti with 11 GB of memory (Windows 10 Home x64, driver 411.70).

CMD output:
Interesting, and a possible new issue.

That exponent 89326001 has no P-1 assignment listed and is not available for assignment. https://www.mersenne.org/report_expo...exp_hi=&full=1
I could try it here for confirmation and maybe isolation of what environment(s) it occurs in. What was the worktodo entry for it? I suspect it was something like
PFactor=1,2,89326001,-1,76,2
Old 2018-11-20, 19:31   #655
Stef42
 
Feb 2012
the Netherlands

2·29 Posts

Quote:
Originally Posted by kriesel View Post
Interesting, and a possible new issue.

That exponent 89326001 has no P-1 assignment listed and is not available for assignment. https://www.mersenne.org/report_expo...exp_hi=&full=1
I could try it here for confirmation and maybe isolation of what environment(s) it occurs in. What was the worktodo entry for it? I suspect it was something like
PFactor=1,2,89326001,-1,76,2
I have reserved the exponent through GPU72.com.
Worktodo does indeed look like this:

Quote:
Pfactor=N/A,1,2,89326001,-1,76,2
A few assignments from GPU72.com were completed before this one. The funny thing was that similar exponents in the 89M range only used roughly 4300 MB of memory.
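For readers comparing the two worktodo forms in this thread (with and without the leading N/A), here is a small parsing sketch. The field names are an assumption based on the usual Prime95-style layout (assignment ID, then k, b, n, c for k*b^n+c, then bits already trial-factored, then tests saved); `PfactorEntry` and `parse_pfactor` are hypothetical names, not part of CUDAPm1.

```python
from dataclasses import dataclass


@dataclass
class PfactorEntry:
    assignment_id: str   # "N/A" or empty when there is no assignment key
    k: int
    b: int
    n: int               # the exponent
    c: int
    bits_factored: int   # assumed meaning: trial factoring depth in bits
    tests_saved: int     # assumed meaning: primality tests saved by a factor


def parse_pfactor(line):
    """Parse a Pfactor worktodo line, with or without the assignment ID field."""
    keyword, _, rest = line.strip().partition("=")
    if keyword.lower() != "pfactor":
        raise ValueError(f"not a Pfactor line: {line!r}")
    fields = [f.strip() for f in rest.split(",")]
    if len(fields) == 6:          # form without the leading assignment ID
        fields.insert(0, "")
    if len(fields) != 7:
        raise ValueError(f"unexpected field count in: {line!r}")
    aid, k, b, n, c, bits, saved = fields
    return PfactorEntry(aid, int(k), int(b), int(n), int(c), int(bits), int(saved))
```

Both `Pfactor=N/A,1,2,89326001,-1,76,2` and `PFactor=1,2,89326001,-1,76,2` parse to the same exponent and bounds under these assumptions.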

Last fiddled with by Stef42 on 2018-11-20 at 19:32
Old 2018-11-20, 23:35   #656
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by Stef42 View Post
I have some issues getting Stage 2 going with the 0.22 version. It starts filling the GPU memory all the way to 9200 MB, then just quits (the CMD window closes). I'm using a GTX 1080 Ti with 11 GB of memory (Windows 10 Home x64, driver 411.70).
FYI, it completed OK here with the Win7 x64 CUDA 5.5 build of V0.20 and driver 378.78, in about 2 hours on a GTX 1080 Ti. I'll try something closer to your case later.

Code:
CUDA reports 10988M of 11264M GPU memory free.
Index 55
Using threads: norm1 32, mult 32, norm2 32.
Using up to 4374M GPU memory.
Selected B1=770000, B2=18672500, 3.37% chance of finding a factor
Starting stage 1 P-1, M89326001, B1 = 770000, B2 = 18672500, fft length = 5184K
...
M89326001 Stage 2 found no factor (P-1, B1=770000, B2=18672500, e=4, n=5184K CUDAPm1 v0.20)
Old 2018-11-21, 03:04   #657
aaronhaviland
 
Jan 2011
Dudley, MA, USA

73 Posts

Quote:
Originally Posted by Stef42 View Post
I have some issues getting Stage 2 going with the 0.22 version. It starts filling the GPU memory all the way to 9200 MB, then just quits (the CMD window closes). I'm using a GTX 1080 Ti with 11 GB of memory (Windows 10 Home x64, driver 411.70).
In prior Windows releases, this program would not make use of more than 4GiB of video RAM. I lifted that restriction for this build, because I found no issues with it on my 8GiB RTX 2070. The only other cards I had available were 3GiB and 2GiB, so I didn't bother trying them.

Noticing that you have a device with 11GiB, I'm very curious to find out if there was another reason for this limitation that I hadn't been able to determine. Especially since you mention it "starts filling the GPU memory", which it's trying to malloc, and failing.

Could you please do me a favour and fiddle with the UnusedMem value in the .ini file to see if you can find a value that doesn't crash? I would start with something like 7168, as that would simulate the old 4GiB limitation (11GiB - 4GiB = 7GiB, and 7 * 1024 = 7168 MiB).
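The arithmetic above can be sketched as a tiny helper for trying other caps. It assumes, as the suggestion implies, that the memory the program may use is roughly the card's total minus UnusedMem, with both in MiB; the function name is hypothetical and the 11264 figure is the total that CUDA reports in the logs in this thread.

```python
def unused_mem_for_cap(total_mib, cap_mib):
    """Return an UnusedMem value (MiB) that leaves at most cap_mib usable.

    Assumption: usable memory is about total_mib - UnusedMem.
    """
    if cap_mib <= 0 or cap_mib > total_mib:
        raise ValueError("cap must be positive and no larger than total memory")
    return total_mib - cap_mib


# Simulating the old 4 GiB limit on an 11 GiB card:
print(unused_mem_for_cap(11 * 1024, 4 * 1024))  # 7168
```

Stepping the cap up from 4096 in increments (for example 5120, 6144, and so on) would give a series of UnusedMem values to test for the point where the crash appears.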

Last fiddled with by aaronhaviland on 2018-11-21 at 03:25
Old 2018-11-21, 04:29   #658
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts
First V0.22 try

Interesting benchmarking, followed by a silent halt.

This was an attempt to continue a run that had halted silently in v0.20. V0.22 halted silently too.
Code:
CUDAPm1 v0.22
Warning: Couldn't find or parse ini file option UnusedMem; using default 100MiB.
------- DEVICE 0 -------
name                GeForce GTX 1080 Ti
Compatibility       6.1
clockRate (MHz)     1620
memClockRate (MHz)  5505
totalGlobalMem      11811160064
totalConstMem       65536
l2CacheSize         2883584
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 28
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    512
deviceOverlap       1

No GeForceGTX1080Ti_fft.txt file found. Using default fft lengths.
For optimal fft selection, please run
./CUDAPm1 -cufftbench 1 8192 r
for some small r, 0 < r < 6 e.g.
CUDA reports 10988M of 11264M GPU memory free.
No GeForceGTX1080Ti_threads.txt file found. Running benchmark.
CUDA bench, testing various thread sizes for fft 23040K, doing 15 passes.
fft size = 23040K, square time = 0.0000 msec, threads 32
fft size = 23040K, square time = 0.0000 msec, threads 64
fft size = 23040K, square time = 1.4538 msec, threads 128
fft size = 23040K, square time = 1.4513 msec, threads 256
fft size = 23040K, square time = 1.4494 msec, threads 512
fft size = 23040K, square time = 1.4492 msec, threads 1024

Best square time for fft = 23040K, time: 0.0000, t = 64

fft size = 23040K, ave time = 0.1932 msec, Norm1 threads 32, Norm2 threads 32
fft size = 23040K, ave time = 0.2154 msec, Norm1 threads 32, Norm2 threads 64
fft size = 23040K, ave time = 0.2240 msec, Norm1 threads 32, Norm2 threads 128
fft size = 23040K, ave time = 0.2248 msec, Norm1 threads 32, Norm2 threads 256
fft size = 23040K, ave time = 0.2358 msec, Norm1 threads 32, Norm2 threads 512
fft size = 23040K, ave time = 0.2438 msec, Norm1 threads 32, Norm2 threads 1024
fft size = 23040K, ave time = 0.1219 msec, Norm1 threads 64, Norm2 threads 32
fft size = 23040K, ave time = 0.1329 msec, Norm1 threads 64, Norm2 threads 64
fft size = 23040K, ave time = 0.1421 msec, Norm1 threads 64, Norm2 threads 128
fft size = 23040K, ave time = 0.1421 msec, Norm1 threads 64, Norm2 threads 256
fft size = 23040K, ave time = 0.1437 msec, Norm1 threads 64, Norm2 threads 512
fft size = 23040K, ave time = 0.1453 msec, Norm1 threads 64, Norm2 threads 1024
fft size = 23040K, ave time = 0.0589 msec, Norm1 threads 128, Norm2 threads 32
fft size = 23040K, ave time = 0.0648 msec, Norm1 threads 128, Norm2 threads 64
fft size = 23040K, ave time = 0.0693 msec, Norm1 threads 128, Norm2 threads 128
fft size = 23040K, ave time = 0.0687 msec, Norm1 threads 128, Norm2 threads 256
fft size = 23040K, ave time = 0.0689 msec, Norm1 threads 128, Norm2 threads 512
fft size = 23040K, ave time = 0.0684 msec, Norm1 threads 128, Norm2 threads 1024
fft size = 23040K, ave time = 1.7076 msec, Norm1 threads 256, Norm2 threads 32
fft size = 23040K, ave time = 1.7102 msec, Norm1 threads 256, Norm2 threads 64
fft size = 23040K, ave time = 1.7152 msec, Norm1 threads 256, Norm2 threads 128
fft size = 23040K, ave time = 1.7102 msec, Norm1 threads 256, Norm2 threads 256
fft size = 23040K, ave time = 1.7119 msec, Norm1 threads 256, Norm2 threads 512
fft size = 23040K, ave time = 1.7096 msec, Norm1 threads 256, Norm2 threads 1024
fft size = 23040K, ave time = 1.6909 msec, Norm1 threads 512, Norm2 threads 32
fft size = 23040K, ave time = 1.6939 msec, Norm1 threads 512, Norm2 threads 64
fft size = 23040K, ave time = 1.6924 msec, Norm1 threads 512, Norm2 threads 128
fft size = 23040K, ave time = 1.6930 msec, Norm1 threads 512, Norm2 threads 256
fft size = 23040K, ave time = 1.6909 msec, Norm1 threads 512, Norm2 threads 512
fft size = 23040K, ave time = 1.6869 msec, Norm1 threads 512, Norm2 threads 1024

Average time for fft= 23040K, all threads variations 0.7659 msec, threshold value for valid timings set to 0.7500 of this, 0.5744 msec
Warning, time for fft = 23040K, time: 0.1932 msec, t1 = 32, t2 = 64, t3 = 32 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.2154 msec, t1 = 32, t2 = 64, t3 = 64 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.2240 msec, t1 = 32, t2 = 64, t3 = 128 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.2248 msec, t1 = 32, t2 = 64, t3 = 256 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.2358 msec, t1 = 32, t2 = 64, t3 = 512 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.2438 msec, t1 = 32, t2 = 64, t3 = 1024 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.1219 msec, t1 = 64, t2 = 64, t3 = 32 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.1329 msec, t1 = 64, t2 = 64, t3 = 64 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.1421 msec, t1 = 64, t2 = 64, t3 = 128 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.1421 msec, t1 = 64, t2 = 64, t3 = 256 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.1437 msec, t1 = 64, t2 = 64, t3 = 512 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.1453 msec, t1 = 64, t2 = 64, t3 = 1024 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.0589 msec, t1 = 128, t2 = 64, t3 = 32 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.0648 msec, t1 = 128, t2 = 64, t3 = 64 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.0693 msec, t1 = 128, t2 = 64, t3 = 128 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.0687 msec, t1 = 128, t2 = 64, t3 = 256 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.0689 msec, t1 = 128, t2 = 64, t3 = 512 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Warning, time for fft = 23040K, time: 0.0684 msec, t1 = 128, t2 = 64, t3 = 1024 is below threshold 0.5744 msec (0.7500 of average 0.7659)
Timings below threshold were detected for 18 norm1 / mult / norm2 combinations for fft length 23040K and omitted from consideration for best.

Best time for fft = 23040K, time: 1.6869, t1 = 512, t2 = 64, t3 = 1024
Using threads: norm1 512, mult 128, norm2 128.
No stage 2 checkpoint.
Using up to 10800M GPU memory.
Selected B1=3965000, B2=100116250, 4.25% chance of finding a factor
Using B1 = 3310000 from savefile.
Continuing stage 2 from a partial result of M400001387 fft length = 23040K
Starting stage 2.
batch wrapper reports exit at Tue 11/20/2018 22:03:21.82
Corresponding benchmark numbers in v0.20 are
23040 411074273 16.5434
23040 32 32 32 17.7388
Why are they so different in v0.22?
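For reference, the filtering that the v0.22 log messages describe (average all timings, set a threshold at 0.75 of that average, drop anything below it, then take the fastest of what remains) can be sketched as follows. This is an illustration written from the log text, not the program's actual code; the function name is hypothetical, and whether the real comparison is strict or not is an assumption.

```python
def select_best_timing(timings, fraction=0.75):
    """Pick the fastest timing that is not suspiciously fast.

    timings maps (norm1_threads, norm2_threads) -> average time in msec.
    Returns (threshold, best_key, rejected_keys).
    """
    if not timings:
        raise ValueError("no timings given")
    average = sum(timings.values()) / len(timings)
    threshold = fraction * average
    # Timings far below the average are treated as invalid measurements.
    valid = {key: t for key, t in timings.items() if t >= threshold}
    rejected = sorted(key for key in timings if key not in valid)
    best = min(valid, key=valid.get)
    return threshold, best, rejected
```

Applied to the 30 timings in a log like the one above, this kind of filter throws out the implausibly fast entries, which is presumably why the reported best time is one of the slower-looking rows.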

Last fiddled with by kriesel on 2018-11-21 at 05:27
Old 2018-11-21, 07:00   #659
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by Stef42 View Post
I have some issues getting Stage 2 going with the 0.22 version. It starts filling the GPU memory all the way to 9200 MB, then just quits (the CMD window closes). I'm using a GTX 1080 Ti with 11 GB of memory (Windows 10 Home x64, driver 411.70).
With the Win64 CUDAPm1 V0.22 CUDA 8.0 build on Windows 7 Pro and driver 378.78, on a GTX 1080 Ti, the program picked lower B1 and B2, occupied 10.7 GB, and ran to completion.
Code:
batch wrapper reports (re)launch at Tue 11/20/2018 22:43:27.36 reset count 0 of max 3 
CUDAPm1 v0.22
------- DEVICE 0 -------
name                GeForce GTX 1080 Ti
Compatibility       6.1
clockRate (MHz)     1620
memClockRate (MHz)  5505
totalGlobalMem      11811160064
totalConstMem       65536
l2CacheSize         2883584
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 28
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    512
deviceOverlap       1

CUDA reports 10988M of 11264M GPU memory free.
No entry for fft = 5184k found. Running benchmark.
CUDA bench, testing various thread sizes for fft 5184K, doing 15 passes.
fft size = 5184K, square time = 0.3257 msec, threads 32
fft size = 5184K, square time = 0.3291 msec, threads 64
fft size = 5184K, square time = 0.3289 msec, threads 128
fft size = 5184K, square time = 0.3288 msec, threads 256
fft size = 5184K, square time = 0.3293 msec, threads 512
fft size = 5184K, square time = 0.3300 msec, threads 1024

Best square time for fft = 5184K, time: 0.3257, t = 32

fft size = 5184K, ave time = 0.0443 msec, Norm1 threads 32, Norm2 threads 32
fft size = 5184K, ave time = 0.0534 msec, Norm1 threads 32, Norm2 threads 64
fft size = 5184K, ave time = 0.0524 msec, Norm1 threads 32, Norm2 threads 128
fft size = 5184K, ave time = 0.0525 msec, Norm1 threads 32, Norm2 threads 256
fft size = 5184K, ave time = 0.0522 msec, Norm1 threads 32, Norm2 threads 512
fft size = 5184K, ave time = 0.0526 msec, Norm1 threads 32, Norm2 threads 1024
fft size = 5184K, ave time = 0.4067 msec, Norm1 threads 64, Norm2 threads 32
fft size = 5184K, ave time = 0.4113 msec, Norm1 threads 64, Norm2 threads 64
fft size = 5184K, ave time = 0.4102 msec, Norm1 threads 64, Norm2 threads 128
fft size = 5184K, ave time = 0.4093 msec, Norm1 threads 64, Norm2 threads 256
fft size = 5184K, ave time = 0.4090 msec, Norm1 threads 64, Norm2 threads 512
fft size = 5184K, ave time = 0.4074 msec, Norm1 threads 64, Norm2 threads 1024
fft size = 5184K, ave time = 0.3929 msec, Norm1 threads 128, Norm2 threads 32
fft size = 5184K, ave time = 0.3937 msec, Norm1 threads 128, Norm2 threads 64
fft size = 5184K, ave time = 0.3940 msec, Norm1 threads 128, Norm2 threads 128
fft size = 5184K, ave time = 0.3950 msec, Norm1 threads 128, Norm2 threads 256
fft size = 5184K, ave time = 0.3950 msec, Norm1 threads 128, Norm2 threads 512
fft size = 5184K, ave time = 0.3946 msec, Norm1 threads 128, Norm2 threads 1024
fft size = 5184K, ave time = 0.3882 msec, Norm1 threads 256, Norm2 threads 32
fft size = 5184K, ave time = 0.3883 msec, Norm1 threads 256, Norm2 threads 64
fft size = 5184K, ave time = 0.3884 msec, Norm1 threads 256, Norm2 threads 128
fft size = 5184K, ave time = 0.3877 msec, Norm1 threads 256, Norm2 threads 256
fft size = 5184K, ave time = 0.3869 msec, Norm1 threads 256, Norm2 threads 512
fft size = 5184K, ave time = 0.3877 msec, Norm1 threads 256, Norm2 threads 1024
fft size = 5184K, ave time = 0.3860 msec, Norm1 threads 512, Norm2 threads 32
fft size = 5184K, ave time = 0.3860 msec, Norm1 threads 512, Norm2 threads 64
fft size = 5184K, ave time = 0.3861 msec, Norm1 threads 512, Norm2 threads 128
fft size = 5184K, ave time = 0.3856 msec, Norm1 threads 512, Norm2 threads 256
fft size = 5184K, ave time = 0.3845 msec, Norm1 threads 512, Norm2 threads 512
fft size = 5184K, ave time = 0.3866 msec, Norm1 threads 512, Norm2 threads 1024

Average time for fft= 5184K, all threads variations 0.3256 msec, threshold value for valid timings set to 0.7500 of this, 0.2442 msec
Warning, time for fft = 5184K, time: 0.0443 msec, t1 = 32, t2 = 32, t3 = 32 is below threshold 0.2442 msec (0.7500 of average 0.3256)
Warning, time for fft = 5184K, time: 0.0534 msec, t1 = 32, t2 = 32, t3 = 64 is below threshold 0.2442 msec (0.7500 of average 0.3256)
Warning, time for fft = 5184K, time: 0.0524 msec, t1 = 32, t2 = 32, t3 = 128 is below threshold 0.2442 msec (0.7500 of average 0.3256)
Warning, time for fft = 5184K, time: 0.0525 msec, t1 = 32, t2 = 32, t3 = 256 is below threshold 0.2442 msec (0.7500 of average 0.3256)
Warning, time for fft = 5184K, time: 0.0522 msec, t1 = 32, t2 = 32, t3 = 512 is below threshold 0.2442 msec (0.7500 of average 0.3256)
Warning, time for fft = 5184K, time: 0.0526 msec, t1 = 32, t2 = 32, t3 = 1024 is below threshold 0.2442 msec (0.7500 of average 0.3256)
Timings below threshold were detected for 6 norm1 / mult / norm2 combinations for fft length 5184K and omitted from consideration for best.

Best time for fft = 5184K, time: 0.3845, t1 = 512, t2 = 32, t3 = 512
Using threads: norm1 512, mult 128, norm2 128.
Using up to 10854M GPU memory.
Selected B1=630000, B2=10710000, 1.7% chance of finding a factor
Starting stage 1 P-1, M89326001, B1 = 630000, B2 = 10710000, fft length = 5184K
Doing 908960 iterations
Iteration 100000 M89326001, 0xe14f06f8949c9abe, n = 5184K, CUDAPm1 v0.22 err = 0.05005 (5:50 real, 3.5019 ms/iter, ETA 47:12)
Iteration 200000 M89326001, 0x2270467c553262ac, n = 5184K, CUDAPm1 v0.22 err = 0.04785 (5:52 real, 3.5179 ms/iter, ETA 41:34)
Iteration 300000 M89326001, 0x5a9e1dbc55f055ff, n = 5184K, CUDAPm1 v0.22 err = 0.04785 (5:56 real, 3.5598 ms/iter, ETA 36:07)
Iteration 400000 M89326001, 0x08db3e9c13c343d2, n = 5184K, CUDAPm1 v0.22 err = 0.05078 (5:57 real, 3.5742 ms/iter, ETA 30:19)
Iteration 500000 M89326001, 0x523ce55fab10ec94, n = 5184K, CUDAPm1 v0.22 err = 0.05078 (5:58 real, 3.5762 ms/iter, ETA 24:22)
Iteration 600000 M89326001, 0x54ded79cc40cfee8, n = 5184K, CUDAPm1 v0.22 err = 0.05273 (5:58 real, 3.5774 ms/iter, ETA 18:25)
Iteration 700000 M89326001, 0xc99c3d9fc3a34ec0, n = 5184K, CUDAPm1 v0.22 err = 0.04883 (5:57 real, 3.5727 ms/iter, ETA 12:26)
Iteration 800000 M89326001, 0x9d20b89d1a9a4877, n = 5184K, CUDAPm1 v0.22 err = 0.05273 (5:56 real, 3.5611 ms/iter, ETA 6:28)
Iteration 900000 M89326001, 0xefda9b1094553b12, n = 5184K, CUDAPm1 v0.22 err = 0.04883 (5:56 real, 3.5583 ms/iter, ETA 0:31)
M89326001, 0x05d2c8d87dcf4f23, n = 5184K, CUDAPm1 v0.22
Stage 1 complete, estimated total time = 53:52
Starting stage 1 gcd.
M89326001 Stage 1 found no factor (P-1, B1=630000, B2=10710000, e=0, n=5184K CUDAPm1 v0.22)
Starting stage 2.
Using b1 = 630000, b2 = 10710000, d = 2310, e = 12, nrp = 240
Zeros: 475228, Ones: 552452, Pairs: 105088
Processing 1 - 240 of 480 relative primes.
Initializing pass... done. transforms: 17421, err = 0.04785, (31.27 real, 1.7951 ms/tran,  ETA NA)
Transforms: 205710 M89326001, 0x90102bd269087607, n = 5184K, CUDAPm1 v0.22 err = 0.04883 (6:20 real, 1.8476 ms/tran, ETA 31:27)
Transforms: 196446 M89326001, 0x266c3a943dd54799, n = 5184K, CUDAPm1 v0.22 err = 0.05273 (6:08 real, 1.8721 ms/tran, ETA 25:33)
Transforms: 201980 M89326001, 0x621dda916a4e4cbb, n = 5184K, CUDAPm1 v0.22 err = 0.04883 (6:18 real, 1.8750 ms/tran, ETA 19:21)

Processing 241 - 480 of 480 relative primes.
Initializing pass... done. transforms: 20111, err = 0.04785, (37.16 real, 1.8476 ms/tran,  ETA 18:45)
Transforms: 205504 M89326001, 0x750bff764daa4a29, n = 5184K, CUDAPm1 v0.22 err = 0.05078 (6:25 real, 1.8733 ms/tran, ETA 12:23)
Transforms: 196422 M89326001, 0x5945c6a5e2e76c0e, n = 5184K, CUDAPm1 v0.22 err = 0.04883 (6:05 real, 1.8588 ms/tran, ETA 6:16)
Transforms: 201562 M89326001, 0x0e9d8ad7c2845c56, n = 5184K, CUDAPm1 v0.22 err = 0.04883 (6:14 real, 1.8586 ms/tran, ETA 0:00)

Stage 2 complete, 1245156 transforms, estimated total time = 38:39
Starting stage 2 gcd.
M89326001 Stage 2 found no factor (P-1, B1=630000, B2=10710000, e=12, n=5184K CUDAPm1 v0.22)

batch wrapper reports exit at Wed 11/21/2018  0:26:48.00
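A note on the stage 2 lines in these logs: the "480 relative primes" reported for d = 2310 matches the count of residues below d that are coprime to d, and the 192 reported as nrp for d = 840 in the first post matches the same count for 840. The sketch below checks that and the resulting number of passes. It is an interpretation of the log output, not a description of the program's internals, and the function names are hypothetical.

```python
from math import gcd


def relative_primes(d):
    """Residues r with 1 <= r < d that share no factor with d."""
    return [r for r in range(1, d) if gcd(r, d) == 1]


def stage2_passes(d, nrp):
    """Return (total relative primes for d, passes needed at nrp per pass)."""
    total = len(relative_primes(d))
    passes = -(-total // nrp)  # ceiling division
    return total, passes


print(stage2_passes(2310, 240))  # (480, 2)
print(stage2_passes(840, 192))   # (192, 1)
```

Under this reading, a larger nrp means fewer passes but more buffers held in GPU memory at once, which is consistent with the memory use discussed earlier in the thread.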
Old 2018-11-21, 12:57   #660
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

152B₁₆ Posts
V0.22 manual report worked

Just like V0.20.

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.