mersenneforum.org  

Old 2019-06-23, 16:01   #1255
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts

Quote:
Originally Posted by Prime95
I thought Mihai had agreed to make type-1 residues gpuowl's default. However, it is still producing type-4.
Gpuowl implemented LL through v0.6, then PRP with residue type 4 initially (v0.7 to at least v1.1). It switched to type 1 by v1.5 and kept it through at least v3.9, then went back to type 4 when PRP-1 was added in v4. When P-1 was separated out in v6.0, PRP-3 remained type 4, through at least v6.5.

I've started a reference table available at https://www.mersenneforum.org/showpo...3&postcount=15 including a couple other variables too (like when nonzero offset was available in gpuowl, or Jacobi check available in the LL flavors). It's incomplete and a work in progress. I haven't tested, built, downloaded, or even identified the commits for all the 0.1 increment versions yet.

Some useful versions in my opinion are:
v0.5 LL with pseudorandom offset, no Jacobi check; most efficient near the upper limit of the 4M fft (~70-77M exponents); useful for helping DC past LL first tests

v0.6 LL with Jacobi check; useful for helping DC past LL first tests that were done with nonzero offset; most efficient near the upper limit of the 4M fft (~70-77M exponents); zero offset only, I think

v1.9 PRP DC; 4M is fast; limited to zero offset; type 1 residues. (2M, 4M and 8M ffts; the fastest times for each that I've seen in testing on RX480, although the comparison is affected by the driver updates needed for v2.0 support, which caused a 5% slowdown.)

v3.8 PRP; 8M for ~150M exponents is fast; type 1 residues; zero offset limitation

V6.2-6.5 PRP; type 4 residues; many fft lengths; the speeds I've checked are competitive with the best of the previous versions; latest and greatest; limited to zero offset; separate P-1 (which runs for some, but I've had crashes with the P-1 in every attempt)

Iteration timing benchmarks vs. a variety of gpuowl versions and fft lengths run on the same system and RX480 gpu are available at https://www.mersenneforum.org/showpo...35&postcount=2

Switching between versions and supporting multiple versions is easy. I have dozens on one system with 2 AMD gpus. I use a separate directory for each, shortcuts to get there, and simple batch files containing the executable name and the usual command line options (this is on Windows 7 or 10 typically). For example, g65.bat for V6.5 is
Code:
gpuowl-win -device 0 -carry short -fft +0 -use ORIG_X2

:dev 0 rx480, 1 rx550
:  -carry long -fft +0  -carry short -use FMA_X2  -use ORIG_X2
I find it handy to keep a reminder in comments there of which gpu model is which device number on each system, especially with 3 or more gpus per system, and to keep alternative options there in comments for quick copy/paste into the command on line one.

Last fiddled with by kriesel on 2019-06-23 at 16:41
Old 2019-06-23, 19:21   #1256
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts
PRP offset branch

I don't know how to modify SELROC's make directions in post 1076 to build from a git branch such as prp-offset, https://github.com/preda/gpuowl/tree/prp-offset (attempts made, results not pretty). So I tried building it for Windows after downloading and unzipping a zip file, and editing the makefile a bit to match how I had previously built V3.8, since their commit dates are only days apart:
Code:
$ make openowl-notf
g++ -O2 -DREV=\"ae3be65\" -std=c++14 OpenGpu.cpp NoTF.cpp clwrap.cpp common.cpp gpuowl.cpp -o openowl-notf -lOpenCL -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32 -static
strip openowl-notf.exe
It compiles. It runs, apparently correctly. But there's no indication of a nonzero offset, in console output, gpuowl.log, help output, or results.
Code:
{"exponent":1398269, "worktype":"PRP-3", "status":"P", "program":{"name":"gpuowl", "version":"3.8-ae3be65-OpenCL"}, "timestamp":"2019-06-23 18:58:54 UTC", "computer":"Ellesmere-36x1266-@28:0.0", "aid":"0", "residue-type":1, "fft-length":"512K", "res64":"0000000000000001", "errors":{"gerbicz":0}}
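The res64 of 0000000000000001 with residue-type 1 above is what one would expect for a Mersenne prime exponent such as 1398269. A minimal Python sketch of why, assuming the Prime95 residue-type convention as I understand it (type 1 is 3^(N-1) mod N, type 4 is 3^((N+1)/2) mod N, with N = 2^p - 1); the convention is not spelled out in this thread, and the function names are hypothetical. It uses a small exponent so it runs instantly and has nothing to do with how gpuowl computes the residue.

```python
# Sketch only. Assumed residue-type definitions (not stated in this thread):
#   type 1: 3^(N-1)     mod N
#   type 4: 3^((N+1)/2) mod N
# where N = 2^p - 1.

def prp3_residue(p: int, residue_type: int) -> int:
    """Return the base-3 PRP residue of 2^p - 1 for the assumed type."""
    n = (1 << p) - 1
    if residue_type == 1:
        exponent = n - 1
    elif residue_type == 4:
        exponent = (n + 1) // 2
    else:
        raise ValueError("only types 1 and 4 are sketched here")
    return pow(3, exponent, n)


def res64(residue: int) -> str:
    """Low 64 bits of the residue as 16 hex digits, like the res64 field."""
    return format(residue & 0xFFFFFFFFFFFFFFFF, "016x")


if __name__ == "__main__":
    p = 127  # 2^127 - 1 is a known Mersenne prime
    print("type 1:", res64(prp3_residue(p, 1)))  # 0000000000000001
    print("type 4:", res64(prp3_residue(p, 4)))
```

Under these assumptions the two types give different res64 values for the same exponent, which is why results of different residue types cannot be compared directly.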
Old 2019-06-23, 20:59   #1257
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2·3³·109 Posts

I believe the following should work (untested). It can probably be simplified.

Code:
git clone https://github.com/preda/gpuowl
cd gpuowl
git fetch --all
git checkout prp-offset
Old 2019-07-01, 05:46   #1258
SELROC
6470₈ Posts

I don't know yet what Preda thinks about this. I have found an array index warning in gpuOwl. FFT 8K.



https://github.com/preda/gpuowl/issues/56
Old 2019-07-03, 20:45   #1259
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
5·11·137 Posts

The current version does not output any Gerbicz information in the JSON text. See below. Primenet thus fails to mark the PRP test as "highly reliable". BTW, the test below had several failed Gerbicz checks. The count of such failures would be useful in the JSON output.
Code:
{"exponent":"87944903", "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, "timestamp":"2019-07-03 16:55:14 UTC", "user":"gw2", "computer":"radeon2.2", "aid":"8FAA3EF0B7F73F7029BC6154D749FF2D", "fft-length":5242880, "res64":"d730c1a17c8fcd1e", "residue-type":4}
For comparison, a prime95 JSON output:

Code:
{"status":"C", "exponent":85527073, "worktype":"PRP-3", "res64":"59AC64DACB6891E4", "residue-type":1, "res2048":"E77683E0E56D070B43DAD890B2957616AE4A6EA891AC9672365B8D3725A17ADC9E82404B0DDB73D9827F2DA3442BE9D111A230DAB332BF7F120A16127AF22768AC2B7A34EA260A772618F53D7D8645CEE444F63F30D95CB453289B3761C05CC67C736A31B99FB65980B48A36A7BAEAEEA354984B2FD8ABE6D664B7B0ADD2005652E8B207FF2E8673804AB8E1DC27A679C760AC9256070F4BAD18A250E52E4FD17A592534D80EEA858B8E69D000CB32A6455E111D3F11576DD30FECE328DD397EF63121DFA6447EA7BF5091636B289192E7FD858035033133ACA6C0A08DAB00DAAAE8A8162254CCCD0B7B69888D19CE66F1E48C6C9013865F59AC64DACB6891E4", "fft-length":4718592, "shift-count":9773447, "error-code":"00000000", "security-code":"728605AD", "program":{"name":"Prime95", "version":"29.7", "build":1, "port":8}, "timestamp":"2019-06-12 23:46:20", "errors":{"gerbicz":0}, "user":"gw_2", "computer":"h110itx1", "aid":"FAFE04EE26AE5DB345E585E8913E1C75"}
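
The difference between the two results can be checked mechanically. A minimal Python sketch, assuming only that the field of interest is "errors":{"gerbicz":N} as in the prime95 output; the function name is hypothetical, the two records are trimmed copies of the results above, and how PrimeNet actually decides on "highly reliable" is not shown here.

```python
import json


def gerbicz_error_count(result_line: str):
    """Return the reported Gerbicz error count, or None if the field is absent."""
    # Some logged lines carry a timestamp/host prefix before the JSON,
    # so start parsing at the first opening brace.
    record = json.loads(result_line[result_line.index("{"):])
    return record.get("errors", {}).get("gerbicz")


# Trimmed copies of the two results (fields shortened for the sketch).
gpuowl_v65 = (
    '{"exponent":"87944903", "worktype":"PRP-3", "status":"C", '
    '"program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, '
    '"fft-length":5242880, "res64":"d730c1a17c8fcd1e", "residue-type":4}'
)
prime95_v297 = (
    '{"status":"C", "exponent":85527073, "worktype":"PRP-3", '
    '"res64":"59AC64DACB6891E4", "residue-type":1, "fft-length":4718592, '
    '"program":{"name":"Prime95", "version":"29.7", "build":1, "port":8}, '
    '"errors":{"gerbicz":0}}'
)

if __name__ == "__main__":
    for name, line in (("gpuowl", gpuowl_v65), ("prime95", prime95_v297)):
        count = gerbicz_error_count(line)
        print(name, "no Gerbicz field" if count is None else f"gerbicz errors: {count}")
```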

LaurV: edited to wrap code tags around the JSON output, which made a mess on screen due to long, unterminated lines. ("Beautifying" it in your editor before posting, if you use pn, n++ or similar, may help, so we can see it nicely indented.)

Last fiddled with by LaurV on 2019-07-06 at 03:00 Reason: as explained
Old 2019-07-03, 20:47   #1260
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
5×11×137 Posts

Quote:
Originally Posted by SELROC
I don't know yet what Preda thinks about this. I have found an array index warning in gpuOwl. FFT 8K.
Since preda has not answered yet, the warning is harmless. I'm sure he'll fix it when he has the time.
Old 2019-07-03, 22:42   #1261
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101011₂ Posts

Quote:
Originally Posted by Prime95
The current version does not output any Gerbicz information in the JSON text. See below. Primenet thus fails to mark the PRP test as "highly reliable". BTW, the test below had several failed Gerbicz checks. The count of such failures would be useful in the JSON output.

{"exponent":"87944903", "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, "timestamp":"2019-07-03 16:55:14 UTC", "user":"gw2", "computer":"radeon2.2", "aid":"8FAA3EF0B7F73F7029BC6154D749FF2D", "fft-length":5242880, "res64":"d730c1a17c8fcd1e", "residue-type":4}
Confirmed here (and also the case for v5.0-9c13870 or V4.3). But earlier versions did include it, for example V1.9, V3.8 and V3.9. Here is a redacted V3.8 result with one EE occurrence on the Gerbicz check, so it resumed from an earlier saved residue and repeated the Gerbicz block, successfully on the second attempt:
Code:
2019-03-03 06:58:57 condorella-rx550 {"exponent":83411351, "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"3.8-91c52fa-OpenCL"}, "timestamp":"2019-03-03 12:58:57 UTC", "user":"kriesel", "computer":"condorella-rx550", "aid":"redacted", "residue-type":1, "fft-length":"4608K", "res64":"redacted", "errors":{"gerbicz":1}}

Last fiddled with by kriesel on 2019-07-03 at 22:53
Old 2019-07-05, 07:27   #1262
SELROC
3·2,659 Posts

Note: I have a script that quickly recovers after a power loss.


https://github.com/valeriob01/Mersen...7ef4ec01174485
Old 2019-07-08, 04:26   #1263
SELROC
2×3×13×97 Posts

Current gpuowl typical performance numbers for 89M exponents:


RX580: 3849 us/sq - ETA 4d 2h
Vega64: 2010 us/sq - ETA 2d
RadeonVII: 910 us/sq - ETA 22h 35m
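
For a rough cross-check of figures like these: a PRP test of 2^p - 1 needs about p squarings, so the run time is roughly exponent times us/sq. A minimal Python sketch, assuming exactly 89,000,000 squarings from a fresh start (the real exponents and any progress already made are not given, so the posted ETAs will differ a little); the function names are hypothetical.

```python
def eta_seconds(exponent: int, us_per_sq: int) -> int:
    """Approximate run time in whole seconds: one squaring per bit of the exponent."""
    return (exponent * us_per_sq) // 1_000_000


def format_eta(seconds: int) -> str:
    """Format a duration as 'Nd Nh Nm', dropping leftover seconds."""
    total_minutes = seconds // 60
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m"


if __name__ == "__main__":
    exponent = 89_000_000  # assumed; the post only says "89M exponents"
    for gpu, us_per_sq in (("RX580", 3849), ("Vega64", 2010), ("RadeonVII", 910)):
        print(f"{gpu}: {format_eta(eta_seconds(exponent, us_per_sq))}")
```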

Last fiddled with by SELROC on 2019-07-08 at 04:30
Old 2019-07-08, 16:50   #1264
GP2
Sep 2003
5×11×47 Posts

mprime can do k*b^n+c

How feasible would it be, in principle, to adapt gpuOwL to be more flexible? Wagstaff in particular: (2^p+1)/3
Old 2019-07-08, 20:10   #1265
paulunderwood
Sep 2002
Database er0rr
3761₁₀ Posts

Quote:
Originally Posted by GP2
mprime can do k*b^n+c

How feasible would it be, in principle, to adapt gpuOwL to be more flexible? Wagstaff in particular: (2^p+1)/3


Working mod 2^p+1 is almost as easy as working mod 2^p-1. A final reduction then gets you to mod (2^p+1)/3. E.g.:

Code:
p=127;Mod(3,2^p+1)^2^p
Mod(9, 170141183460469231731687303715884105729)
Gerbicz error checking can be done too.
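
The same check as the PARI/GP line, as a minimal Python sketch: do the p squarings of 3 modulo 2^p+1, then reduce modulo N = (2^p+1)/3 and compare with 9. The function name is hypothetical, and this only illustrates the arithmetic; it says nothing about what an FFT implementation in gpuowl would need.

```python
def wagstaff_prp3(p: int) -> bool:
    """Base-3 PRP check of the Wagstaff number N = (2^p + 1) / 3, for odd p."""
    modulus = (1 << p) + 1               # 2^p + 1 = 3 * N
    n = modulus // 3
    residue = pow(3, 1 << p, modulus)    # 3^(2^p) mod (2^p + 1), i.e. p squarings
    return residue % n == 9 % n          # final reduction to mod N


if __name__ == "__main__":
    p = 127
    print(pow(3, 1 << p, (1 << p) + 1))  # 9, matching the PARI/GP output
    print(wagstaff_prp3(p))              # True
```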

Last fiddled with by paulunderwood on 2019-07-08 at 20:14

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.