[QUOTE=Prime95;519608]I thought Mihai had agreed to make type-1 residues gpuowl's default. However, it is still producing type-4.[/QUOTE]Gpuowl implemented LL through v0.6, PRP with residue type 4 initially (v0.7 to at least 1.1), switched to type 1 by v1.5 and continued it to at least v3.9, then back to type 4 when PRP-1 was added in v4; when P-1 was separated in v6.0, PRP3 remained type 4 through at least v6.5.
I've started a reference table available at [URL]https://www.mersenneforum.org/showpost.php?p=519603&postcount=15[/URL] including a couple of other variables too (like when nonzero offset was available in gpuowl, or Jacobi check available in the LL flavors). It's incomplete and a work in progress. I haven't tested, built, downloaded, or even identified the commits for all the 0.1 increment versions yet.

Some useful versions in my opinion are:
v0.5 LL with pseudorandom offset, no Jacobi check; most efficient near the upper limit of the 4M fft ~70-77M exponent; useful for helping DC past LL first tests
v0.6 LL with Jacobi check for helping DC past LL first tests done with nonzero offset; most efficient near the upper limit of the 4M fft ~70-77M exponent; I think zero offset only
v1.9 PRP DC, 4M is fast, limited to zero offset, type 1 residues. (2, 4, 8M; fastest times for each that I've seen in testing on RX480. Although driver updates necessary for v2.0 support that caused a 5% slowdown affected that.)
v3.8 PRP, 8M for ~150M exponents is fast; type 1 residues, zero offset limitation
V6.2-6.5 PRP type 4 residues, many fft lengths, and speeds I've checked are competitive with the best of the previous versions, latest and greatest, limited to zero offset, separate P-1 (which runs for some but I've had crashes with the P-1 in every attempt)

Iteration timing benchmarks vs. a variety of gpuowl versions and fft lengths run on the same system and RX480 gpu are available at [URL]https://www.mersenneforum.org/showpost.php?p=488535&postcount=2[/URL]

Switching between versions and supporting multiple versions is easy. I have dozens on one system with 2 AMD gpus. I use a separate directory for each, shortcuts to get there, and simple batch files containing the executable name and the usual command line options (this is on Windows 7 or 10 typically).
For example, g65.bat for V6.5 is
[CODE]gpuowl-win -device 0 -carry short -fft +0 -use ORIG_X2
:dev 0 rx480, 1 rx550
: -carry long -fft +0 -carry short -use FMA_X2 -use ORIG_X2[/CODE]
I find it handy to have a reminder in comments there of which gpu model is which device number on each system, especially with 3 or more per system, and to keep alternative options there in comments for fast, convenient copy/paste into the command on line one. |
PRP offset branch
I don't know how to modify SELROC's make directions in post 1076 to build a git branch such as prp-offset. (Attempts made, results not pretty.) [URL]https://github.com/preda/gpuowl/tree/prp-offset[/URL]
So I tried building it for Windows after downloading and unzipping a zip file, and editing the makefile a bit to correspond to how I had previously built V3.8, since their commit dates are only days apart:
[CODE]$ make openowl-notf
g++ -O2 -DREV=\"ae3be65\" -std=c++14 OpenGpu.cpp NoTF.cpp clwrap.cpp common.cpp gpuowl.cpp -o openowl-notf -lOpenCL -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32 -static
strip openowl-notf.exe[/CODE]
It compiles. It runs, apparently correctly. But there's no indication of a nonzero offset in the console output, gpuowl.log, help output, or results.
[CODE]{"exponent":1398269, "worktype":"PRP-3", "status":"P", "program":{"name":"gpuowl", "version":"3.8-ae3be65-OpenCL"}, "timestamp":"2019-06-23 18:58:54 UTC", "computer":"Ellesmere-36x1266-@28:0.0", "aid":"0", "residue-type":1, "fft-length":"512K", "res64":"0000000000000001", "errors":{"gerbicz":0}}[/CODE] |
I believe the following should work (untested). It can probably be simplified.
[CODE]git clone https://github.com/preda/gpuowl
cd gpuowl
git fetch --all
git checkout prp-offset[/CODE] |
I don't know yet what Preda thinks about this. I have found an array index warning in gpuOwl. FFT 8K.
[url]https://github.com/preda/gpuowl/issues/56[/url] |
The current version does not output any Gerbicz information in the JSON text. See below. Primenet thus fails to mark the PRP test as "highly reliable". BTW, the test below had several failed Gerbicz checks. The count of such failures would be useful in the JSON output.
[CODE]{"exponent":"87944903", "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, "timestamp":"2019-07-03 16:55:14 UTC", "user":"gw2", "computer":"radeon2.2", "aid":"8FAA3EF0B7F73F7029BC6154D749FF2D", "fft-length":5242880, "res64":"d730c1a17c8fcd1e", "residue-type":4}[/CODE]
For comparison, a prime95 JSON output:
[CODE]{"status":"C", "exponent":85527073, "worktype":"PRP-3", "res64":"59AC64DACB6891E4", "residue-type":1, "res2048":"E77683E0E56D070B43DAD890B2957616AE4A6EA891AC9672365B8D3725A17ADC9E82404B0DDB73D9827F2DA3442BE9D111A230DAB332BF7F120A16127AF22768AC2B7A34EA260A772618F53D7D8645CEE444F63F30D95CB453289B3761C05CC67C736A31B99FB65980B48A36A7BAEAEEA354984B2FD8ABE6D664B7B0ADD2005652E8B207FF2E8673804AB8E1DC27A679C760AC9256070F4BAD18A250E52E4FD17A592534D80EEA858B8E69D000CB32A6455E111D3F11576DD30FECE328DD397EF63121DFA6447EA7BF5091636B289192E7FD858035033133ACA6C0A08DAB00DAAAE8A8162254CCCD0B7B69888D19CE66F1E48C6C9013865F59AC64DACB6891E4", "fft-length":4718592, "shift-count":9773447, "error-code":"00000000", "security-code":"728605AD", "program":{"name":"Prime95", "version":"29.7", "build":1, "port":8}, "timestamp":"2019-06-12 23:46:20", "errors":{"gerbicz":0}, "user":"gw_2", "computer":"h110itx1", "aid":"FAFE04EE26AE5DB345E585E8913E1C75"}[/CODE]
LaurV: edited to wrap code tags around the JSON results; they created a mess on screen due to long, unterminated lines. ("Beautify"-ing it in your editor before posting, if you use pn or n++ or similar, may help, so we can see it nicely indented :razz:) |
[QUOTE=SELROC;520435]I don't know yet what Preda thinks about this. I have found an array index warning in gpuOwl. FFT 8K.[/QUOTE]
Preda has not answered yet, so I assume the warning is harmless. I'm sure he'll fix it when he has the time. |
[QUOTE=Prime95;520693]The current version does not output any Gerbicz information in the JSON text. See below. Primenet thus fails to mark the PRP test as "highly reliable". BTW, the test below had several failed Gerbicz checks. The count of such failures would be useful in the JSON output.
{"exponent":"87944903", "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"v6.5-82-g77b45a4"}, "timestamp":"2019-07-03 16:55:14 UTC", "user":"gw2", "computer":"radeon2.2", "aid":"8FAA3EF0B7F73F7029BC6154D749FF2D", "fft-length":5242880, "res64":"d730c1a17c8fcd1e", "residue-type":4}[/QUOTE]
Confirmed here (and it is also the case for v5.0-9c13870 and V4.3). But earlier versions did output it, for example V1.9, V3.8, and V3.9. Here is a redacted V3.8 result with one EE occurrence on a Gerbicz check, so it resumed from an earlier saved residue and repeated the Gerbicz block, successfully on the second attempt:
[CODE]2019-03-03 06:58:57 condorella-rx550 {"exponent":83411351, "worktype":"PRP-3", "status":"C", "program":{"name":"gpuowl", "version":"3.8-91c52fa-OpenCL"}, "timestamp":"2019-03-03 12:58:57 UTC", "user":"kriesel", "computer":"condorella-rx550", "aid":"redacted", "residue-type":1, "fft-length":"4608K", "res64":"redacted", "errors":{"gerbicz":1}}[/CODE] |
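To tell at a glance which result lines carry the Gerbicz count, something like the following Python sketch could be used. It relies only on the "errors":{"gerbicz":N} field seen in the results posted in this thread. The function name and the shortened sample lines are made up for illustration, and how Primenet itself decides on "highly reliable" is not shown here.

```python
import json

def gerbicz_error_count(result_line):
    """Return the Gerbicz error count reported in a result line,
    or None if the JSON carries no "errors"/"gerbicz" field.

    Older gpuowl result lines have a timestamp and computer name in
    front of the JSON, so parsing starts at the first '{'.
    """
    start = result_line.find("{")
    if start < 0:
        raise ValueError("no JSON object found in result line")
    record = json.loads(result_line[start:])
    errors = record.get("errors")
    if not isinstance(errors, dict):
        return None
    return errors.get("gerbicz")

# Shortened versions of two results from this thread (fields dropped for brevity).
v65_line = ('{"exponent":"87944903", "worktype":"PRP-3", "status":"C", '
            '"res64":"d730c1a17c8fcd1e", "residue-type":4}')
v38_line = ('2019-03-03 06:58:57 condorella-rx550 {"exponent":83411351, '
            '"worktype":"PRP-3", "status":"C", "residue-type":1, '
            '"errors":{"gerbicz":1}}')

for line in (v65_line, v38_line):
    print(gerbicz_error_count(line))
```

A None return means the field is missing altogether, which is different from a reported count of 0.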
Note: I have a script that quickly recovers after a power loss.
[url]https://github.com/valeriob01/Mersenne-gpu-computing-node/commit/e90de7d656c60ddbb9eac294977ef4ec01174485[/url] |
Current gpuowl typical performance numbers for 89M exponents:
RX580: 3849 us/sq - ETA 4d 2h
Vega64: 2010 us/sq - ETA 2d
RadeonVII: 910 us/sq - ETA 22h 35m |
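For anyone wanting to estimate their own card: a PRP test of exponent p needs about p squarings, so the ETA is roughly exponent times us/sq. Below is a rough Python sketch of that arithmetic. The function names are made up, and the round figure of 89,000,000 is an assumption; the ETAs posted above come out slightly higher, presumably because the actual exponents are a bit above 89M and there is some checking overhead.

```python
def eta_seconds(exponent, us_per_sq):
    """Rough run time: about one squaring per bit of the exponent."""
    return exponent * us_per_sq / 1_000_000

def format_eta(seconds):
    """Format a duration as days, hours, minutes (rounded to the minute)."""
    minutes = round(seconds / 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}d {hours}h {mins}m"

# us/sq figures from the post above; 89,000,000 is an assumed round exponent.
for gpu, us_per_sq in (("RX580", 3849), ("Vega64", 2010), ("RadeonVII", 910)):
    print(gpu, format_eta(eta_seconds(89_000_000, us_per_sq)))
```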
mprime can do k*b^n+c
How feasible would it be, in principle, to adapt gpuOwL to be more flexible? Wagstaff in particular: (2^p+1)/3 |
[QUOTE=GP2;521019]mprime can do k*b^n+c
How feasible would it be, in principle, to adapt gpuOwL to be more flexible? Wagstaff in particular: (2^p+1)/3[/QUOTE]
:goodposting: Working mod 2^p+1 is almost as easy as mod 2^p-1. Then a final division by 3 gets you to mod (2^p+1)/3. E.g.:
[code]p=127;Mod(3,2^p+1)^2^p
Mod(9, 170141183460469231731687303715884105729)[/code]
Gerbicz error checking can be done too. |
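The same check can be sketched in Python with plain big integers. This is only an illustration of the arithmetic, not how gpuowl or mprime would implement it. The function name is made up, and p is assumed odd so that 3 divides 2^p+1. If N = (2^p+1)/3 is a prime other than 3, then 3^(2^p) = 3^(3N-1) is congruent to 9 mod N, which is what the residue 9 in the PARI/GP output above reflects.

```python
def wagstaff_prp(p):
    """Base-3 probable-prime check of N = (2^p + 1) / 3 for odd p.

    The powering is done modulo 2^p + 1, and only the final residue
    is reduced modulo N and compared with 9.
    """
    modulus = 2**p + 1            # divisible by 3 when p is odd
    n = modulus // 3              # the Wagstaff number itself
    residue = pow(3, 2**p, modulus)
    return residue % n == 9 % n

print(wagstaff_prp(127))   # same exponent as the PARI/GP example above
```

Passing only means N is a base-3 probable prime, as with any PRP test.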