|
|
#738 | |
|
Aug 2010
Kansas
54710 Posts |
Quote:
I did, however, find out that I can't run either mfaktc or CUDAPm1 at the same time as 4x P-1 in P95, even though I've got 32GB RAM and a Threadripper. Froze my computer within about 5 minutes of starting with both programs. Yikes lol
Edit: Is it possible to do Stage 1 only with CUDAPm1 and then perform Stage 2 in P95? Just thinking since I have quite a bit of RAM available, this could allow me to focus on some deeper runs of P-1.
Last fiddled with by c10ck3r on 2019-09-01 at 02:52 |
|
|
|
|
|
|
#739 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4·3·163 Posts |
Quote:
On prime95 I mix PRP and P-1. It seems to like one P-1 per cpu package better, although that could be system dependent. P-1 likes lots of memory.
Re starting with stage 1 on CUDAPm1 and finishing with stage 2 on prime95 for the same exponent, I think that would require some software development, if it's possible at all. See https://www.mersenneforum.org/showpo...3&postcount=24 |
|
|
|
|
|
|
#740 |
|
Random Account
Aug 2009
Not U. + S.A.
2^2·5·11·13 Posts |
I will put this here hoping someone will see it...
On occasion, CUDAPm1 will stop running and exit after completing stage 1. This behavior could suggest that it does not detect enough RAM onboard my GPU for stage 2. It is a GTX 1080 with 8 GB of RAM. Below is how I ran it, by command line:
Code:
cudapm1 98181383 -b1 710000 -b2 13490000
I am running it again with Prime95. Is it possible this B2 is too large to run on my GPU? |
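For readers unsure what the -b1 bound in the command above controls, here is a toy sketch of P-1 stage 1 on a Mersenne number. It is not CUDAPm1's code (which works with FFT multiplication on the GPU), and the function names are made up for illustration. Stage 1 raises a base to the product of all prime powers up to B1 (times 2p, since every factor of 2^p - 1 has the form 2kp + 1) and takes a GCD. The example uses 2^67 - 1, whose known factor 193707721 satisfies 193707721 - 1 = 2^3·3^3·5·67·2677, so a B1 of 3000 is enough for stage 1 alone to catch it.

```python
from math import gcd


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def pm1_stage1(p, b1, base=3):
    """Toy P-1 stage 1 on 2^p - 1 with bound b1 (illustrative only)."""
    n = (1 << p) - 1
    e = 2 * p  # factors of 2^p - 1 are of the form 2kp + 1
    for q in primes_up_to(b1):
        qk = q
        while qk * q <= b1:  # largest power of q not exceeding b1
            qk *= q
        e *= qk
    x = pow(base, e, n)
    return gcd(x - 1, n)  # > 1 when some factor f has f - 1 dividing e


# The result is a divisor of 2^67 - 1 that is a multiple of 193707721.
print(pm1_stage1(67, 3000))
```

Stage 2 extends this to factors where f - 1 has one additional prime between B1 and B2, which is the part that needs many residue-sized buffers in memory.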
|
|
|
|
|
#741 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4·3·163 Posts |
Quote:
Normally I would expect that exponent to run to completion on that model gpu easily. (Or even ~triple that exponent.) But odd things happen sometimes in CUDAPm1. Sometimes a resumption after a stop will do the job. I've hit spots (on the exponent number line) that one gpu can't finish, that if I move it to another gpu of the same or larger gpu ram size, the second can finish. Even in one case, the same model but a different gpu BIOS rev. Quadro 2000's hit issues around 85.5 and 171M exponent. (That model is not recommended for P-1 because the feasible or program-selected bounds are too low.)
There's another thing that happens sometimes that the author described. An excessive roundoff error in stage 2 will produce a very quiet exit, no error message to say why.
For more info, see
https://www.mersenneforum.org/showpo...65&postcount=7 Limits
https://www.mersenneforum.org/showpo...73&postcount=9 Limit and run time detail by gpu model
https://www.mersenneforum.org/showpo...37&postcount=3 Errors in general
https://www.mersenneforum.org/showpo...82&postcount=7 P-1 progress
http://www.mersenneforum.org/showpos...34&postcount=3 CUDAPm1 bug and wish list
The most recent versions of Gpuowl can also run on NVIDIA and do P-1 including save files. There may be bugs in it that prevent it from completing P-1 on some exponents, but I haven't found them yet, in admittedly much less sampling of exponents on gpuowl than on CUDAPm1. On large-gpu-ram models like the GTX1080, gpuowl appears capable of going to higher exponents. See https://www.mersenneforum.org/showpo...5&postcount=17
Run times would be about 30% longer on a GTX1080 than the GTX1080Ti.
Zip file for Windows gpuowl v6.11-9 at https://www.mersenneforum.org/showpo...postcount=1403
In any event, have fun!
Last fiddled with by kriesel on 2019-11-15 at 18:22 |
|
|
|
|
|
|
#742 | |
|
Random Account
Aug 2009
Not U. + S.A.
2^2·5·11·13 Posts |
Quote:
The roundoff error never exceeded 0.035. I believe this is what is displayed on every line as "err." Also, it completed the GCD before it dropped out. There was not a results file.
Windows 10, which I am running, has a very detailed Task Manager. It displays information about a GPU, if one is present. At the time, I didn't think the GPU's RAM usage was excessive. It is possible I was looking at stage 1 though.
gpuOwl. I have seen posts about it going back a while. I was under the impression it was a Linux only project. I will give it a try. It uses OpenCL. GPU-Z says I have this capability but I have never run anything which uses it.
Microsoft has continually updated Windows 10. What I have is v1903 plus some maintenance updates. After a point, the older versions of CUDAPm1 and CUDALucas simply would not start. I replaced them with newer ones I found in James' archive on mersenne.ca. The newer ones did not seem to have any problems, until today.
It will take me a while to digest everything in your links. I appreciate the effort.
Edit: I tried gpuOwl. It gives me all the below and then exits. It would help to see a config file and worktodo example.
Code:
2019-11-15 18:01:36 gpuowl v6.11-9-g9ae3189
2019-11-15 18:01:36 Note: no config.txt file found
2019-11-15 18:01:36 config: -pm1 98181383
2019-11-15 18:01:36 98181383 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.02 bits/word
2019-11-15 18:01:37 OpenCL args "-DEXP=98181383u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xf.bbe27b81b7e38p-3 -DIWEIGHT_STEP=0x8.22a2337ec7b7p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-11-15 18:01:37 OpenCL compilation error -11 (args -DEXP=98181383u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xf.bbe27b81b7e38p-3 -DIWEIGHT_STEP=0x8.22a2337ec7b7p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -I. -cl-fast-relaxed-math -cl-std=CL2.0)
2019-11-15 18:01:37 <kernel>:197:3: error: invalid output constraint '=v' in asm
X2(u[0], u[2]);
^
<kernel>:174:37: note: expanded from macro 'X2'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.x), "v" (b.x)); \
^
<kernel>:197:3: error: invalid output constraint '=v' in asm
<kernel>:175:37: note: expanded from macro 'X2'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.y) : "v" (t.y), "v" (b.y)); \
^
<kernel>:198:3: error: invalid output constraint '=v' in asm
X2_mul_t4(u[1], u[3]);
^
<kernel>:180:37: note: expanded from macro 'X2_mul_t4'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (t.x) : "v" (b.x), "v" (t.x)); \
^
<kernel>:198:3: error: invalid output constraint '=v' in asm
<kernel>:181:37: note: expanded from macro 'X2_mul_t4'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.y), "v" (b.y)); \
^
<kernel>:199:3: error: invalid output constraint '=v' in asm
X2(u[0], u[1]);
^
<kernel>:174:37: note: expanded from macro 'X2'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.x), "v" (b.x)); \
^
<kernel>:199:3: error: invalid output constraint '=v' in asm
<kernel>:175:37: note: expanded from macro 'X2'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.y) : "v" (t.y), "v" (b.y)); \
^
<kernel>:200:3: error: invalid output constraint '=v' in asm
X2(u[2], u[3]);
^
<kernel>:174:37: note: expanded from macro 'X2'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.x) : "v" (t.x), "v" (b.x)); \
^
<kernel>:200:3: error: invalid output constraint '=v' in asm
<kernel>:175:37: note: expanded from macro 'X2'
__asm( "v_add_f64 %0, %1, -%2\n" : "=v" (b.y) : "v" (t.y), "v" (b.y)); \
^
<kernel>:266:3: error: invalid output constraint '=v' in a2019-11-15 18:01:37 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:229 build
2019-11-15 18:01:37 Bye
Last fiddled with by storm5510 on 2019-11-15 at 23:09 Reason: Additional |
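As a side note on reading the log above, the "5632K" and "17.02 bits/word" figures can be reproduced from the values gpuowl printed. This is a small arithmetic sketch, not gpuowl code. The assumption (labeled in the code) is that the FFT length in words is 2 × WIDTH × SMALL_HEIGHT × MIDDLE, which matches the numbers in this log but is not taken from gpuowl's source.

```python
def fft_words(width, small_height, middle):
    # Assumption: word count = 2 * WIDTH * SMALL_HEIGHT * MIDDLE.
    # This reproduces the 5632K in the log; it is not from gpuowl's source.
    return 2 * width * small_height * middle


def bits_per_word(exponent, words):
    # Average number of bits of the exponent-sized residue stored per FFT word.
    return exponent / words


# Values from the -DWIDTH, -DSMALL_HEIGHT and -DMIDDLE arguments in the log.
words = fft_words(1024, 256, 11)
print(words // 1024)                             # 5632, i.e. "FFT 5632K"
print(round(bits_per_word(98181383, words), 2))  # 17.02
```

So the FFT selection itself looks normal here; the failure is in the OpenCL kernel compile step that follows it.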
|
|
|
|
|
|
#743 | |
|
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts |
Quote:
|
|
|
|
|
|
|
#744 | |
|
Random Account
Aug 2009
Not U. + S.A.
2^2×5×11×13 Posts |
Quote:
Edit: Please disregard. With some experimentation, I have it running.
Last fiddled with by storm5510 on 2019-11-16 at 00:50 |
|
|
|
|
|
|
#745 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4·3·163 Posts |
Quote:
For the gpuowl issue you had, try using -use ORIG_X2 or -use FMA_X2 in the gpuowl command line. It's defaulting to INLINE_X2 and not handling it well. ORIG or FMA may be faster; try and see.
GpuOwl has always been developed on linux. However, going back to the very earliest versions, it has also been ported to Windows. Mostly kracker and I have posted Windows compiled versions. (Hope I haven't slighted anyone there.) The bigger change was when it also was made to work on NVIDIA even though Preda generally only owns AMD gpus for development testing. I even tested it on Intel IGPs a couple times. |
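For anyone wanting to try both suggested variants in turn, a minimal sketch that only builds the two command lines. It combines the -pm1 option shown in the earlier log with the -use values named above. The executable name "gpuowl" and the helper name are assumptions; adjust them to whatever your download is actually called.

```python
import shlex


def gpuowl_variants(exponent, exe="gpuowl"):
    """Build one command line per X2 variant to try (does not run anything).

    exe is an assumed executable name; -pm1 and -use come from this thread.
    """
    return [
        [exe, "-pm1", str(exponent), "-use", variant]
        for variant in ("ORIG_X2", "FMA_X2")
    ]


for argv in gpuowl_variants(98181383):
    print(shlex.join(argv))
```

Each list can be passed to subprocess.run as-is once the executable path is known; timing a short run of each is the "try and see" step.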
|
|
|
|
|
|
#746 | |
|
Random Account
Aug 2009
Not U. + S.A.
2^2×5×11×13 Posts |
Quote:
As for gpuOwl, I did not have the entire package, just an update. I found the rest in James Heinrich's archive on mersenne.ca. I recall writing this in another topic. No matter. Despite having to do some experimentation, it runs quite well now. I successfully guessed a config.txt layout. It still refuses my worktodo.txt files though. Incorrect syntax, I believe it says. |
|
|
|
|
|
|
#747 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2^4·3·163 Posts |
Quote:
cudapm1 -f 4608k
When in doubt, use -h to see the options. And note, despite what that says, -r does nothing.
For worktodo syntax for any of the common GIMPS applications, see https://www.mersenneforum.org/showpo...8&postcount=22 |
|
|
|
|
|
|
#748 | |
|
Random Account
Aug 2009
Not U. + S.A.
2^2·5·11·13 Posts |
Quote:
I was not sure about the command line format as I have never run it this way before. The "-h" command displayed a lot of other things, but not much about the basics. However, I have it running.
For what I was running, it decided on 5760 for an FFT size. I believe this was what it was using before when it quietly stopped. The command line insisted on multiples of 1024, so I set it to 6144. Now, it is wait and see.
I hope you do not mind: I copied all the text from your link above and saved it here. I have been looking for something like this for years!
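The jump from 5760 to 6144 is just rounding up to the next multiple of 1024, which a couple of lines can do for any size. This is a sketch with made-up helper names. The "k" suffix on the -f value follows the form of the -f example quoted earlier in the thread; whether every CUDAPm1 build accepts the result is not something this checks.

```python
def round_up_fft_k(fft_k, step=1024):
    """Round an FFT size (in K) up to the next multiple of step."""
    return -(-fft_k // step) * step  # ceiling division, then scale back


def fft_flag(fft_k):
    """Format the size as a -f argument pair, e.g. ['-f', '6144k']."""
    return ["-f", f"{fft_k}k"]


print(round_up_fft_k(5760))            # 6144
print(fft_flag(round_up_fft_k(5760)))  # ['-f', '6144k']
```

A size that is already a multiple of 1024 comes back unchanged, so the helper is safe to apply to whatever size the program picked on its own.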
|
|
|
|
|