mersenneforum.org  

2019-10-21, 19:06   #1420
ewmayer
2ω=0
 
Sep 2002
República de California

19×613 Posts

Quote:
Originally Posted by kriesel View Post
Nonzero pseudorandomly selected shift for gpuowl PRP would be useful. It would make life easier for uncwilly et al in the double, triple, quad checking effort, and gpuowl results could be checked with gpuowl.
Question for George: Is the Primenet server set up to allow non-Prime95/mprime clients which support shift to do both initial-test and DC, or does it only allow such same-client-for-both-tests for Prime95/mprime?
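For readers unfamiliar with the shift idea: a shifted test stores the residue multiplied by a power of two modulo 2^p - 1, so two runs with different shifts exercise different bit patterns but should agree once the shift is removed. The toy Python sketch below illustrates this for a PRP-style repeated squaring of 3. The function name `prp_residue` and the choice of exactly p squarings are assumptions made for illustration; this is not the code of gpuowl, Prime95, or any other client.

```python
def prp_residue(p, shift=0):
    """Square 3 p times modulo M = 2**p - 1 while carrying a bit shift.

    Illustrative only: the stored value is (true value * 2**s) mod M.
    """
    m = (1 << p) - 1
    s = shift % p              # 2**p == 1 (mod M), so shifts wrap modulo p
    x = (3 << s) % m           # starting value 3, stored shifted by s bits
    for _ in range(p):
        x = (x * x) % m        # squaring the stored value ...
        s = (2 * s) % p        # ... doubles the shift it carries
    # Remove the shift: multiplying by 2**(p - s) gives 2**p == 1 (mod M).
    return (x * pow(2, (p - s) % p, m)) % m


if __name__ == "__main__":
    # Different shifts, same unshifted residue.
    print(prp_residue(127), prp_residue(127, shift=57))
    print(prp_residue(11), prp_residue(11, shift=5))
```

Under these assumptions, `prp_residue(127)` and `prp_residue(127, shift=57)` both return 9, because 2^127 - 1 is prime and 3^(2^p) is then congruent to 9. For the composite 2^11 - 1 the residue is not 9, but it is still the same for every shift, which is what makes a shifted double check comparable with the first test.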
2019-10-21, 20:08   #1421
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

1110101110111₂ Posts

Quote:
Originally Posted by ewmayer View Post
Question for George: Is the Primenet server set up to allow non-Prime95/mprime clients which support shift to do both initial-test and DC, or does it only allow such same-client-for-both-tests for Prime95/mprime?
No way to know without looking at the PHP code. If the server does not consider that a valid double-check, then I'll need to fix the server's PHP code.
2019-10-22, 03:27   #1422
ewmayer
2ω=0
 
Sep 2002
República de California

2D7F₁₆ Posts

Quote:
Originally Posted by Prime95 View Post
No way to know without looking at the PHP code. If the server does not consider that a valid double-check, then I'll need to fix the server's PHP code.
Actually, it might make sense to keep things that way: only officially "trusted" clients (which I believe is just yours ATM) are allowed to run both tests on a given exponent, as a way to prevent a malicious user from submitting matching pairs of nonzero residues in order to accumulate project credit. But whichever way you & Aaron decide, it is probably a good idea to check the server code to see what it is doing with regard to this.
2019-10-22, 06:13   #1423
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

10010110111111₂ Posts

Quote:
Originally Posted by ewmayer View Post
Question for George: Is the Primenet server set up to allow non-Prime95/mprime clients which support shift to do both initial-test and DC, or does it only allow such same-client-for-both-tests for Prime95/mprime?
First case. We DC our own old work (done with P95) and also new 100M-digit work (done with cudaLucas) using cudaLucas. All the programs mentioned use shifts, and the server has never gotten angry with us; it happily accepts our results as DC. We would be grateful if this behavior is kept. Edit: at least for "presumed honest" users.

Last fiddled with by LaurV on 2019-10-22 at 06:16
2019-11-05, 14:27   #1424
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,437 Posts
save the opencl compile?

Is it possible to save the OpenCL compile result at gpuowl launch for reuse, on an NVIDIA K80 on Google Colaboratory, which runs Ubuntu 18.04.1?
2019-11-05, 17:25   #1425
preda
 
"Mihai Preda"
Apr 2015

3·457 Posts

Quote:
Originally Posted by kriesel View Post
Is it possible to save the OpenCL compile result at gpuowl launch for reuse, on an NVIDIA K80 on Google Colaboratory, which runs Ubuntu 18.04.1?
Are you asking because the OpenCL compilation is slow and the launch frequent? (How slow is it, and how large would the benefit be?)

It might be possible: OpenCL does offer some binary kernel support, which could be used to save the compilation initially and reload it on subsequent runs of the same exponent on the same driver and hardware.
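The binary kernel support mentioned here refers to the standard OpenCL calls `clGetProgramInfo` with `CL_PROGRAM_BINARIES` (to fetch a built binary) and `clCreateProgramWithBinary` (to reload it). What a cache around those calls would need is a key that changes whenever the exponent, the build arguments, the device, or the driver changes. The Python sketch below shows only that bookkeeping. The names `cache_key` and `load_or_build`, the `.bin` file layout, and the `build` callback that stands in for the real OpenCL compile are all hypothetical; this is not how gpuowl does or would do it.

```python
import hashlib
from pathlib import Path


def cache_key(source, build_args, device, driver):
    """Hash everything the compiled binary depends on into one hex key."""
    h = hashlib.sha256()
    for part in (source, build_args, device, driver):
        h.update(part.encode("utf-8"))
        h.update(b"\0")        # separator so field boundaries stay unambiguous
    return h.hexdigest()


def load_or_build(cache_dir, key, build):
    """Return (binary, was_cached); call build() only on a cache miss.

    build is a stand-in for the real compile step and must return bytes.
    """
    path = Path(cache_dir) / f"{key}.bin"
    if path.exists():
        return path.read_bytes(), True
    binary = build()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(binary)
    return binary, False
```

Because the exponent is passed to the compiler as a `-DEXP=...` define, it is part of the build arguments, so each exponent naturally gets its own cache entry, which matches the "same exponent on the same driver and hardware" condition above.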
2019-11-05, 18:48   #1426
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,437 Posts

Quote:
Originally Posted by preda View Post
Are you asking because the OpenCL compilation is slow and the launch frequent? (How slow is it, and how large would the benefit be?)

It might be possible: OpenCL does offer some binary kernel support, which could be used to save the compilation initially and reload it on subsequent runs of the same exponent on the same driver and hardware.
The launch is frequent: every twelve hours of GPU run time or less on Colab. It's a small optimization. I asked because I recalled it as a capability present in the past on Linux (perhaps only with ROCm, and I don't know what driver is present on Colab or how to query it). Since I'm offering Colab scripts for reuse in my reference threads, it could have a larger impact than just my own use. Here's a recent resume timing. Time stamps are in UTC as usual.

Code:
2019-11-05 18:39:52 gpuowl 
2019-11-05 18:39:52 Note: no config.txt file found
2019-11-05 18:39:52 config: -use ORIG_X2 -block 200 -log 120000 -maxAlloc 10240 -user kriesel -cpu colab/K80 
2019-11-05 18:39:52 355000033 FFT 20480K: Width 256x4, Height 256x4, Middle 10; 16.93 bits/word
2019-11-05 18:39:52 OpenCL args "-DEXP=355000033u -DWIDTH=1024u -DSMALL_HEIGHT=1024u -DMIDDLE=10u -DWEIGHT_STEP=0x1.0d27019dccb6fp+0 -DIWEIGHT_STEP=0x1.e6fb0d049fbefp-1 -DWEIGHT_BIGSTEP=0x1.ae89f995ad3adp+0 -DIWEIGHT_BIGSTEP=0x1.306fe0a31b715p-1 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-11-05 18:39:54 

2019-11-05 18:39:54 OpenCL compilation in 1686 ms
2019-11-05 18:40:00 355000033 P1 B1=2760000, B2=66240000; 3981446 bits; starting at 3981445
At the full 12-hour session duration, the compile time is only about 40 ppm of the run. Some users are experiencing much earlier termination; 1 hour is not unusual, which makes it ~480 ppm in that case.
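The ppm figures follow directly from the "OpenCL compilation in 1686 ms" line in the log. As a quick check, here is a small Python sketch of the arithmetic; the helper name `overhead_ppm` is made up for this example.

```python
def overhead_ppm(compile_ms, session_hours):
    """Compile time as parts per million of the total session length."""
    compile_s = compile_ms / 1000
    session_s = session_hours * 3600
    return compile_s / session_s * 1e6


if __name__ == "__main__":
    print(round(overhead_ppm(1686, 12)))  # 12-hour session -> 39
    print(round(overhead_ppm(1686, 1)))   # 1-hour session  -> 468
```

So 1686 ms works out to about 39 ppm of a 12-hour session and about 468 ppm of a 1-hour session, consistent with the rounded "about 40" and "~480" figures.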

Last fiddled with by kriesel on 2019-11-05 at 18:53
2019-11-07, 03:42   #1427
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,437 Posts

Quote:
Originally Posted by kriesel View Post
At the full 12-hour session duration, the compile time is only about 40 ppm of the run. Some users are experiencing much earlier termination; 1 hour is not unusual, which makes it ~480 ppm in that case.
That's at the fast end of compile time, 1.6 seconds. I have seen up to 3.4 seconds. The compile happens every Colab session or every FFT length change, whichever comes first. I finally got a P100, I think, during a 411M P-1 stage 1, and it's way faster than a K80: ~6 ms/iter instead of ~23.
2019-11-10, 22:27   #1428
philbo0042
 
Oct 2019

2×7 Posts

I apologize if this has been asked several times. I am a complete noob to almost all of what GPU computing requires.

I have searched and perused this thread looking for information on how to set up gpuOwL with my Nvidia GTX 860M. I have attached gpuowl.log, which shows error codes I do not understand. I have installed MSYS2 and MinGW but may have selected the wrong settings during installation. I have also installed Visual Studio.

Can someone help, please?

Any information would be greatly appreciated. If additional information is needed from me, please let me know and I will be more than glad to provide it.

Thank you in advance!
Attached Files
File Type: log gpuowl.log (3.0 KB, 54 views)
2019-11-10, 23:55   #1429
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,437 Posts

Quote:
Originally Posted by philbo0042 View Post
Can someone help, please?
Try adding -use ORIG_X2
as in
Code:
gpuowl-win -device 0 -use ORIG_X2 -user kriesel -cpu condorella/rx480
in a command line or batch file, or put the option in config.txt
It's been mentioned before, but there are a LOT of posts to search through.

Last fiddled with by kriesel on 2019-11-11 at 00:47
2019-11-11, 00:13   #1430
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1010100111101₂ Posts
Low bounds P-1 specification etc.

In the gpuowl v6.11-9-g9ae3189 built-in help, it says in part
Code:
-B1                : P-1 B1 bound, default 500000
-B2                : P-1 B2 bound, default B1 * 30
What it doesn't say is that the minimum B1 is 15015.
Code:
2019-11-10 17:04:34 B1=10000 too small, adjusted to 15015
2019-11-10 17:04:34 102001127 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.69 bits/word
2019-11-10 17:04:37 OpenCL args "-DEXP=102001127u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0x9.f10dfae44e32p-3 -DIWEIGHT_STEP=0xc.e00add36b0688p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-11-10 17:04:41 OpenCL compilation in 3321 ms
2019-11-10 17:04:42 102001127 P1 B1=15015, B2=300000; 21677 bits; starting at 0
There seems to be no way to run only stage 1. Minimum B2 is > B1.
Code:
2019-11-10 17:19:11 B2=15015 too small, adjusted to 30030
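The "too small, adjusted to" lines in this post suggest a simple clamping rule. The Python sketch below reproduces the observed numbers under three assumptions that are inferred from the logs and the help text, not taken from the gpuowl source: the B1 floor is 15015, the B2 floor is 2 * B1, and a default B2 is computed as 30 times the B1 the user asked for before B1 is raised. The function name `adjust_bounds` is hypothetical.

```python
MIN_B1 = 15015  # from "B1=10000 too small, adjusted to 15015"


def adjust_bounds(b1=500000, b2=None):
    """Return (B1, B2) after the clamping the log messages appear to imply."""
    if b2 is None:
        b2 = b1 * 30           # help text: "default B1 * 30" (order assumed)
    if b1 < MIN_B1:
        b1 = MIN_B1            # raise B1 to the observed floor
    if b2 < 2 * b1:
        b2 = 2 * b1            # assumed floor: B2 must be at least 2 * B1
    return b1, b2


if __name__ == "__main__":
    print(adjust_bounds(10000))         # (15015, 300000)
    print(adjust_bounds(15015, 15015))  # (15015, 30030)
```

With these assumptions the first call matches the "P1 B1=15015, B2=300000" log line and the second matches "B2=15015 too small, adjusted to 30030". If the real rule differs, for example a different multiplier at larger bounds, the sketch would need adjusting.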
Also, a later run with larger bounds fails as follows, apparently by design. (And yes it has a " on a line all by itself.)
Code:
2019-11-10 17:48:33 B2=850000 too small, adjusted to 1700000
2019-11-10 17:48:33 102001127 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.69 bits/word
2019-11-10 17:48:33 OpenCL args "-DEXP=102001127u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0x9.f10dfae44e32p-3 -DIWEIGHT_STEP=0xc.e00add36b0688p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-11-10 17:48:35 OpenCL compilation in 2736 ms
2019-11-10 17:48:36 102001127 P1 wants B1=850000 but savefile has B1=100000. Fix B1 or move savefile
2019-11-10 17:48:36 'C:\msys64\home\ken\gpuowl-compile\v6.11-9-g9ae3189\102001127\102001127.p1.owl' invalid
2019-11-10 17:48:36 102001127 P1 wants B1=850000 but savefile has B1=100000. Fix B1 or move savefile
2019-11-10 17:48:36 'C:\msys64\home\ken\gpuowl-compile\v6.11-9-g9ae3189\102001127\102001127-old.p1.owl' invalid
2019-11-10 17:48:36 Exiting because "invalid savefiles found, investigate why
"
There is apparently no ability to extend a stage 1 run or presumably a stage 2 run on the same exponent.

Last fiddled with by kriesel on 2019-11-11 at 00:26

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.