mersenneforum.org  

mersenneforum.org > Great Internet Mersenne Prime Search > Software
2019-07-29, 21:37   #1
dcheuk (Jan 2019, Tallahassee, FL)

cudaPm1 Stage 2 does not start

This thread is reposted to avoid hijacking dominicanpapi82's thread.

Exponent: 333898333

My computer ran CUDAPm1, but it refuses to commence stage 2 after stage 1 is complete. The program executes but exits almost immediately (1-3 seconds) after printing the following output, without any changes to the directory.

Code:
Using up to 6560M GPU memory.
Selected B1=3130000, B2=71207500, 3.82% chance of finding a factor 
Using B1 = 3130000 from savefile. 
Continuing stage 2 from a partial result of M333898333 fft length = 20480K 
Starting stage 2. 
Using b1 = 3130000, b2 = 71207500, d = 2310, e = 12, nrp = 21
(program exits automatically after this line ...)
Any help would be appreciated.

Update: I have started a PRP test and am skipping the P-1. If anyone is interested in running the P-1 stage 2, you can find my stage 1 save file and worktodo.txt for CUDAPm1 below (Google Drive). I will remove the files upon completion of the PRP or P-1.

https://drive.google.com/open?id=1Ph...Trkb-uhTc5iLhx
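(For context, here is a minimal sketch of what P-1 stage 1 computes, in Python. This is the textbook algorithm, not CUDAPm1's FFT-based implementation; the demo exponent and bound below are illustrative, chosen small enough to run instantly.)

```python
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def pm1_stage1(p, B1):
    """P-1 stage 1 on N = 2^p - 1: compute x = 3^E mod N, where E is p times
    the product of every prime power q^k <= B1, then take gcd(x - 1, N).
    Any factor f of N with f - 1 B1-smooth divides the result."""
    N = (1 << p) - 1
    x = pow(3, p, N)  # GIMPS convention: fold the exponent p itself into E
    for q in primes_up_to(B1):
        qk = q
        while qk * q <= B1:  # largest power of q not exceeding B1
            qk *= q
        x = pow(x, qk, N)
    return gcd(x - 1, N)

# M67 = 193707721 * 761838257287 (Cole, 1903), and
# 193707720 = 2^3 * 3^3 * 5 * 67 * 2677 is 2700-smooth,
# so 193707721 divides the gcd returned for B1 = 2700.
g = pm1_stage1(67, 2700)
print(g, g % 193707721 == 0)
```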
2019-07-30, 17:07   #2
LaurV, Romulan Interpreter (Jun 2011, Thailand)

Smells like a memory allocation failure. Stage 2 needs a lot of RAM. How much RAM does the card have?
OTOH, you should report the stage 1 result: use B2=B1 and do a manual report.
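(For reference, stage 2's appetite for RAM comes from its batching: it extends stage 1 to catch factors f where f - 1 is B1-smooth times a single prime s with B1 < s <= B2, and real implementations keep many precomputed residues resident to process primes cheaply. Below is a minimal prime-by-prime Python sketch of the textbook continuation, one modular exponentiation per prime, with no batching; the demo exponent and bounds are illustrative, not CUDAPm1's method.)

```python
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def pm1(p, B1, B2):
    """Textbook P-1 on N = 2^p - 1: stage 1 to B1, then the simple stage 2
    continuation over primes in (B1, B2]."""
    N = (1 << p) - 1
    primes = primes_up_to(B2)
    x = pow(3, p, N)            # stage 1: x = 3^(p * prod of prime powers <= B1)
    for q in (q for q in primes if q <= B1):
        qk = q
        while qk * q <= B1:
            qk *= q
        x = pow(x, qk, N)
    g = gcd(x - 1, N)
    if g > 1:
        return g                # factor already found in stage 1
    acc = 1                     # stage 2: accumulate prod of (x^s - 1) mod N
    for s in (s for s in primes if s > B1):
        acc = acc * (pow(x, s, N) - 1) % N
    return gcd(acc, N)

# 193707721 divides M67, and 193707720 = 2^3 * 3^3 * 5 * 67 * 2677:
# the prime 2677 is beyond B1 = 100 but within B2 = 3000, so the
# factor 193707721 divides the result of the combined run.
g = pm1(67, 100, 3000)
print(g, g % 193707721 == 0)
```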

2019-07-30, 21:31   #3
kriesel, "TF79LL86GIMPS96gpu17" (Mar 2017, US midwest)

Sometimes a retry or three will work. Sometimes bumping up the fft length on the command line and retrying will work. Sometimes it helps to reduce the interval between save files. Which GPU was this run on?

From post 373 in http://www.mersenneforum.org/showthread.php?t=17835
"excessive stage 2 round-off errors simply halt the program without error messages."

2019-07-30, 23:15   #4
dcheuk (Jan 2019, Tallahassee, FL)

Quote:
Originally Posted by LaurV
Smells like a memory allocation failure. Stage 2 needs a lot of RAM. How much RAM does the card have?
OTOH, you should report the stage 1 result: use B2=B1 and do a manual report.
Quote:
Originally Posted by kriesel
Sometimes a retry or three will work. Sometimes bumping up the fft length on the command line and retrying will work. Sometimes it helps to reduce the interval between save files. Which GPU was this run on?

From post 373 in http://www.mersenneforum.org/showthread.php?t=17835
"excessive stage 2 round-off errors simply halt the program without error messages."
It seems someone has completed the P-1; that was quick. Thanks, One Man!

The card is an RTX 2080. I believe it has 8 GB of GDDR6, but for some reason it says only about 6 GB was available.
2019-07-31, 00:21   #5
kriesel, "TF79LL86GIMPS96gpu17" (Mar 2017, US midwest)

Quote:
Originally Posted by dcheuk
It seems someone has completed the P-1; that was quick. Thanks, One Man!

The card is an RTX 2080. I believe it has 8 GB of GDDR6, but for some reason it says only about 6 GB was available.
Perhaps some GPU RAM was occupied by the display, but that's a lot of difference.

The bounds reported by One Man, B1=97,122, B2=1,165,464, are tiny compared to what's appropriate; see https://www.mersenne.ca/exponent/333898333
gpu72 bounds are B1=2,600,000, B2=59,800,000: 123 GHz-days, 3.5% probability of a factor; half a week on a Tesla C2075, 1.5 days on a GTX 1080 Ti, or around 6 days for an i7-8750H 6-core worker. (All figures are for both stages; about half as long for stage 1 only.)

CUDAPm1 runs in progress can be promoted to GPUs with equal or more RAM, but typically a stage 2 run will not demote to a card with less RAM and still work, so my Tesla is out (6GB nominal, 5.25GB net of ECC). I don't have a way of moving a P-1 in progress between CUDAPm1 and prime95, so the CPUs are out.
You might want to repost that interim file; the earlier link gives a 404 error now. If you do, I might give it a shot after some other work clears out of the queue.

2019-07-31, 01:39   #6
dcheuk (Jan 2019, Tallahassee, FL)

Quote:
Originally Posted by kriesel
Perhaps some GPU RAM was occupied by the display, but that's a lot of difference.

The bounds reported by One Man, B1=97,122, B2=1,165,464, are tiny compared to what's appropriate; see https://www.mersenne.ca/exponent/333898333
gpu72 bounds are B1=2,600,000, B2=59,800,000: 123 GHz-days, 3.5% probability of a factor; half a week on a Tesla C2075, 1.5 days on a GTX 1080 Ti, or around 6 days for an i7-8750H 6-core worker. (All figures are for both stages; about half as long for stage 1 only.)

CUDAPm1 runs in progress can be promoted to GPUs with equal or more RAM, but typically a stage 2 run will not demote to a card with less RAM and still work, so my Tesla is out (6GB nominal, 5.25GB net of ECC). I don't have a way of moving a P-1 in progress between CUDAPm1 and prime95, so the CPUs are out.
You might want to repost that interim file; the earlier link gives a 404 error now. If you do, I might give it a shot after some other work clears out of the queue.
Oh okay, I fixed the link; the folder is up again.

I ran it on an identical graphics card without any display attached, and it produced the following message:

Code:
No GeForceRTX2080_fft.txt file found. Using default fft lengths.
For optimal fft selection, please run
./CUDAPm1 -cufftbench 1 8192 r
for some small r, 0 < r < 6 e.g.
CUDA reports 6705M of 8192M GPU memory free.
Using threads: norm1 512, mult 256, norm2 256.
No stage 2 checkpoint.
Using up to 6560M GPU memory.
Selected B1=3130000, B2=71207500, 3.82% chance of finding a factor
Using B1 = 3130000 from savefile.
Continuing stage 2 from a partial result of M333898333 fft length = 20480K
Starting stage 2.
Using b1 = 3130000, b2 = 71207500, d = 2310, e = 12, nrp = 21
... then it crashed/quit again. No warnings, no errors, no files modified.
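(For anyone curious about the `d = 2310, e = 12, nrp = 21` line: in the usual stage 2 layout, candidate primes are scanned in blocks of width d = 2310 = 2·3·5·7·11, only residues mod d that are coprime to d can contain primes, and each residue class kept resident costs a full FFT-length buffer of GPU memory, which is roughly what nrp counts; e is the Brent-Suyama extension degree. That reading of the parameters is my assumption from the standard P-1 continuation, not from CUDAPm1's source. A tiny Python check of the residue-class count:)

```python
# Count the residue classes mod d = 2310 that can contain primes > 11.
# Each class that stage 2 keeps resident costs one FFT-sized GPU buffer,
# so memory use grows with the number of classes processed per pass
# (assumed to be what the "nrp" figure reflects).
from math import gcd

d = 2310  # 2 * 3 * 5 * 7 * 11
classes = [r for r in range(1, d) if gcd(r, d) == 1]
print(len(classes))  # phi(2310) = 480
```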
2019-07-31, 01:45   #7
dcheuk (Jan 2019, Tallahassee, FL)

Quote:
Originally Posted by dcheuk
Oh okay, I fixed the link; the folder is up again.

I ran it on an identical graphics card without any display attached, and it produced the following message:

Code:
No GeForceRTX2080_fft.txt file found. Using default fft lengths.
For optimal fft selection, please run
./CUDAPm1 -cufftbench 1 8192 r
for some small r, 0 < r < 6 e.g.
CUDA reports 6705M of 8192M GPU memory free.
Using threads: norm1 512, mult 256, norm2 256.
No stage 2 checkpoint.
Using up to 6560M GPU memory.
Selected B1=3130000, B2=71207500, 3.82% chance of finding a factor
Using B1 = 3130000 from savefile.
Continuing stage 2 from a partial result of M333898333 fft length = 20480K
Starting stage 2.
Using b1 = 3130000, b2 = 71207500, d = 2310, e = 12, nrp = 21
... then it crashed/quit again. No warnings, no errors, no files modified.
I found it weird that only 6705M of 8192M is free. I restarted the PC, got the same message as above, and then, as expected, it crashed.

Note that this was run on the secondary (but identical) graphics card in a computer with no external display plugged in. Hmm, it's weird.
2019-07-31, 05:58   #8
kriesel, "TF79LL86GIMPS96gpu17" (Mar 2017, US midwest)

Quote:
Originally Posted by dcheuk
I found it weird that only 6705M of 8192M is free. I restarted the PC, got the same message as above, and then, as expected, it crashed.

Note that this was run on the secondary (but identical) graphics card in a computer with no external display plugged in. Hmm, it's weird.
Yes, odd.
Which version of CUDAPm1 are you running? I use v0.20, not v0.22, for production.
I notice you have no fft file for your GPU, indicating you haven't done the fft or thread tuning yet.
2019-07-31, 19:53   #9
dcheuk (Jan 2019, Tallahassee, FL)

Quote:
Originally Posted by kriesel
Yes, odd.
Which version of CUDAPm1 are you running? I use v0.20, not v0.22, for production.
I notice you have no fft file for your GPU, indicating you haven't done the fft or thread tuning yet.
Oh, oops, my bad. I tried it on a different card but forgot about the benchmarking.

CUDAPm1 0.22.

Did it again just now and got exactly the same message. Oh well.

2019-07-31, 21:18   #10
kriesel, "TF79LL86GIMPS96gpu17" (Mar 2017, US midwest)

Quote:
Originally Posted by dcheuk
Oh, oops, my bad. I tried it on a different card but forgot about the benchmarking.

CUDAPm1 0.22.

Did it again just now and got exactly the same message. Oh well.
The quick death is reproducible here: CUDAPm1 v0.20, GTX 1080 Ti.
Code:
batch wrapper reports (re)launch at Wed 07/31/2019 15:58:59.30 reset count 0 of max 3 
CUDAPm1 v0.20
------- DEVICE 0 -------
name                GeForce GTX 1080 Ti
Compatibility       6.1
clockRate (MHz)     1620
memClockRate (MHz)  5505
totalGlobalMem      zu
totalConstMem       zu
l2CacheSize         2883584
sharedMemPerBlock   zu
regsPerBlock        65536
warpSize            32
memPitch            zu
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 28
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    zu
deviceOverlap       1

CUDA reports 10988M of 11264M GPU memory free.
Using threads: norm1 32, mult 32, norm2 64.
No stage 2 checkpoint.
Using up to 5120M GPU memory.
Selected B1=2740000, B2=67130000, 3.71% chance of finding a factor
Using B1 = 3130000 from savefile.
Continuing stage 2 from a partial result of M333898333 fft length = 20480K
batch wrapper reports exit at Wed 07/31/2019 16:00:23.95
I'll try a couple of other things. I renamed the c file out of the way and am redoing the stage 1 GCD from the t file now. If it works, it will give a bigger nrp, maybe a larger e, on the 11GB card.
2019-07-31, 22:23   #11
kriesel, "TF79LL86GIMPS96gpu17" (Mar 2017, US midwest)

Quote:
Originally Posted by kriesel
I renamed the c file out of the way and am redoing the stage 1 GCD from the t file now. If it works, it will give a bigger nrp, maybe a larger e, on the 11GB card.
Nope.
Code:
batch wrapper reports (re)launch at Wed 07/31/2019 16:00:24.21 reset count 1 of max 3 
CUDAPm1 v0.20
------- DEVICE 0 -------
name                GeForce GTX 1080 Ti
Compatibility       6.1
clockRate (MHz)     1620
memClockRate (MHz)  5505
totalGlobalMem      zu
totalConstMem       zu
l2CacheSize         2883584
sharedMemPerBlock   zu
regsPerBlock        65536
warpSize            32
memPitch            zu
maxThreadsPerBlock  1024
maxThreadsPerMP     2048
multiProcessorCount 28
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      2147483647,65535,65535
textureAlignment    zu
deviceOverlap       1

CUDA reports 10988M of 11264M GPU memory free.
Using threads: norm1 32, mult 32, norm2 64.
Using up to 5120M GPU memory.
Selected B1=2740000, B2=67130000, 3.71% chance of finding a factor
Using B1 = 3130000 from savefile.
Continuing stage 1 from a partial result of M333898333 fft length = 20480K, iteration = 4515001
M333898333, 0xea7d398e8effff52, n = 20480K, CUDAPm1 v0.20
Stage 1 complete, estimated total time = 27:58:46
batch wrapper reports exit at Wed 07/31/2019 16:37:40.58
Code:
Problem signature:
  Problem Event Name:    APPCRASH
  Application Name:    CUDAPm1_win64_20130923_CUDA_55.exe
  Application Version:    0.0.0.0
  Application Timestamp:    523f9925
  Fault Module Name:    CUDAPm1_win64_20130923_CUDA_55.exe
  Fault Module Version:    0.0.0.0
  Fault Module Timestamp:    523f9925
  Exception Code:    c0000005
  Exception Offset:    000000000000d884
  OS Version:    6.1.7601.2.1.0.256.48
  Locale ID:    1033
  Additional Information 1:    44b2
  Additional Information 2:    44b2372ff3e894f68e6c85eaaa6183b2
  Additional Information 3:    cc46
  Additional Information 4:    cc46d3b48197ff03ca4e5004a6dbb86f

Read our privacy statement online:
  http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409

If the online privacy statement is not available, please read our privacy statement offline:
  C:\Windows\system32\en-US\erofflps.txt
Exception c0000005 is an access violation. I get similar results from the resulting c file, with a variety of fft lengths in v0.20 and v0.22. (These are much faster to try, since it crashes quickly, but my CPU is slow, so the GCDs take a long time.)
Now it's starting to tick me off.
