#1255 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts |
Quote:
I've started a reference table available at https://www.mersenneforum.org/showpo...3&postcount=15 including a couple other variables too (like when nonzero offset was available in gpuowl, or Jacobi check available in the LL flavors). It's incomplete and a work in progress. I haven't tested, built, downloaded, or even identified the commits for all the 0.1 increment versions yet.
Some useful versions in my opinion are:
v0.5 LL with pseudorandom offset, no Jacobi check; most efficient near the upper limit of the 4M fft ~70-77M exponent; useful for helping DC past LL first tests
v0.6 LL with Jacobi check for helping DC past LL first tests done with nonzero offset; most efficient near the upper limit of the 4M fft ~70-77M exponent; I think zero offset only
v1.9 PRP DC, 4M is fast, limited to zero offset, type 1 residues. (2, 4, 8M; fastest times for each that I've seen in testing on RX480. Although driver updates necessary for v2.0 support that caused a 5% slowdown affected that.)
v3.8 PRP, 8M for ~150M exponents is fast; type 1 residues, zero offset limitation
V6.2-6.5 PRP type 4 residues, many fft lengths, and speeds I've checked are competitive with the best of the previous versions, latest and greatest, limited to zero offset, separate P-1 (which runs for some but I've had crashes with the P-1 in every attempt)
Iteration timing benchmarks vs. a variety of gpuowl versions and fft lengths run on the same system and RX480 gpu are available at https://www.mersenneforum.org/showpo...35&postcount=2
Switching between versions and supporting multiple versions is easy. I have dozens on one system with 2 AMD gpus. I use a separate directory for each, shortcuts to get there, and simple batch files containing the executable name and the usual command line options (this is on Windows 7 or 10 typically). For example, g65.bat for V6.5 is
Code:
gpuowl-win -device 0 -carry short -fft +0 -use ORIG_X2
:dev 0 rx480, 1 rx550
: -carry long -fft +0 -carry short -use FMA_X2 -use ORIG_X2
Last fiddled with by kriesel on 2019-06-23 at 16:41
#1256 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts |
I don't know how to modify SELROC's make directions in post 1076 to build a git branch such as prp-offset. (I made attempts; the results were not pretty.) https://github.com/preda/gpuowl/tree/prp-offset So I tried building it for Windows after downloading and unzipping a zip file, and editing the makefile a bit to match how I had previously built V3.8, since their commit dates are only days apart:
Code:
$ make openowl-notf
g++ -O2 -DREV=\"ae3be65\" -std=c++14 OpenGpu.cpp NoTF.cpp clwrap.cpp common.cpp gpuowl.cpp -o openowl-notf -lOpenCL -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32 -static
strip openowl-notf.exe
Code:
{"exponent":1398269, "worktype":"PRP-3", "status":"P", "program":{"name":"gpuowl", "version":"3.8-ae3be65-OpenCL"}, "timestamp":"2019-06-23 18:58:54 UTC", "computer":"Ellesmere-36x1266-@28:0.0", "aid":"0", "residue-type":1, "fft-length":"512K", "res64":"0000000000000001", "errors":{"gerbicz":0}}
#1257 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2·3³·109 Posts |
I believe the following should work (untested). It is probably possible to simplify.
Code:
git clone https://github.com/preda/gpuowl
cd gpuowl
git fetch --all
git checkout prp-offset
#1258 |
64708 Posts |
I don't know yet what Preda thinks about this. I have found an array index warning in gpuOwl. FFT 8K.
https://github.com/preda/gpuowl/issues/56 |
#1259 |
P90 years forever!
Aug 2002
Yeehaw, FL
5·11·137 Posts |
The current version does not output any Gerbicz information in the JSON text. See below. Primenet thus fails to mark the PRP test as "highly reliable". BTW, the test below had several failed Gerbicz checks. The count of such failures would be useful in the JSON output.
Code:
{"exponent":"87944903", "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, "timestamp":"2019-07-03 16:55:14 UTC", "user":"gw2", "computer":"radeon2.2", "aid":"8FAA3EF0B7F73F7029BC6154D749FF2D", "fft-length":5242880, "res64":"d730c1a17c8fcd1e", "residue-type":4}
Code:
{"status":"C", "exponent":85527073, "worktype":"PRP-3", "res64":"59AC64DACB6891E4", "residue-type":1, "res2048":"E77683E0E56D070B43DAD890B2957616AE4A6EA891AC9672365B8D3725A17ADC9E82404B0DDB73D9827F2DA3442BE9D111A230DAB332BF7F120A16127AF22768AC2B7A34EA260A772618F53D7D8645CEE444F63F30D95CB453289B3761C05CC67C736A31B99FB65980B48A36A7BAEAEEA354984B2FD8ABE6D664B7B0ADD2005652E8B207FF2E8673804AB8E1DC27A679C760AC9256070F4BAD18A250E52E4FD17A592534D80EEA858B8E69D000CB32A6455E111D3F11576DD30FECE328DD397EF63121DFA6447EA7BF5091636B289192E7FD858035033133ACA6C0A08DAB00DAAAE8A8162254CCCD0B7B69888D19CE66F1E48C6C9013865F59AC64DACB6891E4", "fft-length":4718592, "shift-count":9773447, "error-code":"00000000", "security-code":"728605AD", "program":{"name":"Prime95", "version":"29.7", "build":1, "port":8}, "timestamp":"2019-06-12 23:46:20", "errors":{"gerbicz":0}, "user":"gw_2", "computer":"h110itx1", "aid":"FAFE04EE26AE5DB345E585E8913E1C75"}
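The difference between the two result lines above can be checked mechanically. The following is a minimal sketch, assuming only that a result line is a single JSON object and that a Gerbicz count, when present, sits under `errors.gerbicz` as in the Prime95 line. The helper name `gerbicz_error_count` is hypothetical, the sample lines are abbreviated copies of the ones quoted above, and nothing here describes how PrimeNet itself evaluates results.

```python
import json


def gerbicz_error_count(result_line):
    """Return the Gerbicz error count in a JSON result line, or None if absent."""
    record = json.loads(result_line)
    errors = record.get("errors")
    if not isinstance(errors, dict):
        return None  # no "errors" object at all, as in the gpuowl v6.5 line
    return errors.get("gerbicz")


# Abbreviated copies of the two result lines quoted above (fields dropped for brevity).
gpuowl_line = (
    '{"exponent":"87944903", "worktype":"PRP-3", "status":"C", '
    '"program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, "residue-type":4}'
)
prime95_line = (
    '{"status":"C", "exponent":85527073, "worktype":"PRP-3", "residue-type":1, '
    '"program":{"name":"Prime95", "version":"29.7", "build":1, "port":8}, '
    '"errors":{"gerbicz":0}}'
)

for line in (gpuowl_line, prime95_line):
    name = json.loads(line)["program"]["name"]
    count = gerbicz_error_count(line)
    if count is None:
        print(name, "-> no Gerbicz field")
    else:
        print(name, "-> gerbicz errors:", count)
```

Returning `None` rather than 0 for a missing field keeps "no information reported" distinct from "checked, zero errors", which is the distinction the post is about.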
LaurV: edited to wrap code tags around the JSON text, which made a mess on screen because of its long, unbroken lines. ("Beautifying" it in your editor before posting, if you use pn or n++ or similar, may help, so we can see it nicely indented.)
Last fiddled with by LaurV on 2019-07-06 at 03:00 Reason: as explained |
#1260 |
P90 years forever!
Aug 2002
Yeehaw, FL
5×11×137 Posts |
#1261 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101011₂ Posts |
Quote:
Code:
2019-03-03 06:58:57 condorella-rx550 {"exponent":83411351, "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"3.8-91c52fa-OpenCL"}, "timestamp":"2019-03-03 12:58:57 UTC", "user":"kriesel", "computer":"condorella-rx550", "aid":"redacted", "residue-type":1, "fft-length":"4608K", "res64":"redacted", "errors":{"gerbicz":1}}
Last fiddled with by kriesel on 2019-07-03 at 22:53 |
#1262 |
3·2,659 Posts |
Note: I have a script that quickly recovers after a power loss.
https://github.com/valeriob01/Mersen...7ef4ec01174485 |
#1263 |
2×3×13×97 Posts |
Current gpuowl typical performance numbers for 89M exponents:
RX580: 3849 us/sq - ETA 4d 2h
Vega64: 2010 us/sq - ETA 2d
RadeonVII: 910 us/sq - ETA 22h 35m
Last fiddled with by SELROC on 2019-07-08 at 04:30
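The ETAs follow from the per-squaring times by simple arithmetic. Here is a minimal sketch, assuming a PRP test of 2^p-1 takes about p squarings and ignoring checkpoint and Gerbicz-check overhead. The exponent of exactly 89,000,000 is an assumption for illustration, so the results land near the quoted ETAs rather than exactly on them; the function names are hypothetical.

```python
def eta_seconds(exponent, us_per_squaring):
    """Estimated run time in seconds: about `exponent` squarings at the given speed."""
    return exponent * us_per_squaring / 1_000_000


def format_eta(seconds):
    """Format seconds as 'Dd Hh Mm', truncating to whole minutes."""
    total_minutes = int(seconds // 60)
    days, rest = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m"


# Per-squaring times from the post; the exact exponent is an assumed round value.
ASSUMED_EXPONENT = 89_000_000
for gpu, us_per_sq in (("RX580", 3849), ("Vega64", 2010), ("RadeonVII", 910)):
    print(gpu, format_eta(eta_seconds(ASSUMED_EXPONENT, us_per_sq)))
```

A slightly larger exponent in the 89M range scales each figure up proportionally, which would account for the small gaps from the numbers in the post.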
#1264 |
Sep 2003
5×11×47 Posts |
mprime can do k*b^n+c
How feasible would it be, in principle, to adapt gpuOwL to be more flexible? Wagstaff in particular: (2^p+1)/3 |
#1265 | |
Sep 2002
Database er0rr
3761₁₀ Posts |
Quote:
Working mod 2^p+1 is almost as easy as working mod 2^p-1. Then a final division by 3 gets you to mod (2^p+1)/3. E.g.:
Code:
p=127;Mod(3,2^p+1)^2^p
Mod(9, 170141183460469231731687303715884105729)
Last fiddled with by paulunderwood on 2019-07-08 at 20:14
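The same computation can be reproduced with Python's built-in modular exponentiation. This is a minimal sketch of a plain Fermat-style base-3 check, assuming p is odd so that 3 divides 2^p+1; it is chosen for illustration only and is not a statement of how gpuowl or mprime would implement a Wagstaff test. The function name `wagstaff_prp_base3` is hypothetical.

```python
def wagstaff_prp_base3(p):
    """Base-3 probable-prime check for the Wagstaff number (2^p + 1) / 3, p odd.

    The exponentiation is done mod 2^p + 1 (the "easy" modulus); only the final
    comparison is reduced mod (2^p + 1) / 3.
    """
    n = 2**p + 1           # working modulus
    w = n // 3             # Wagstaff number; exact division because p is odd
    r = pow(3, 2**p, n)    # 3^(2^p) mod (2^p + 1), as in the PARI/GP line
    return (r - 9) % w == 0


p = 127
print(pow(3, 2**p, 2**p + 1))   # residue mod 2^p + 1
print(wagstaff_prp_base3(p))
```

Since 2^p = 3w - 1 = 3(w - 1) + 2, a prime w gives 3^(2^p) ≡ 3^2 = 9 (mod w) by Fermat's little theorem, which is why 9 is the residue to compare against.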