mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GpuOwl (https://www.mersenneforum.org/forumdisplay.php?f=171)
-   -   gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

preda 2018-07-17 22:26

[QUOTE=SELROC;492013]
1) I tried -block 200, but it seems to stay stuck at blockSize 400.

2) I have noticed that for the 300M exponent the best FFT is 18M, even though it is slower. With 16M the ETA was going back by a considerable amount of time (hours), which I suppose means a lot of retries were being done. No error was reported, I mean no EE lines, but the timing was varying considerably.[/QUOTE]
"block size" can only be changed when an exponent is started for the first time. Once ongoing, the block is fixed to completion.

If there are any errors, you should see EE lines, plus a number like "(3 errors)" on every line. But the check is done every 160K iterations, thus not very often. The time variations you see most likely come from tiny variations in iteration time, multiplied by the large number of remaining iterations.
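For intuition about what that periodic check verifies, here is a toy sketch of a Gerbicz-style check on the PRP squaring chain. It is not gpuOwl's code: the function name, the tiny exponent and the small block size are made up for illustration, and it assumes the check runs every blockSize squared iterations, which is at least consistent with blockSize 400 and the 160K figure.

```python
def prp_chain_with_check(p, block, flip_at=None):
    """Square x = 3 repeatedly mod 2^p - 1, with a Gerbicz-style check.

    Returns (x, bad_at): bad_at is the iteration at which a check failed,
    or None if every check passed. flip_at (hypothetical test hook)
    corrupts x at that iteration to simulate a hardware error.
    """
    n = (1 << p) - 1
    x0 = 3
    x = x0
    d = x0        # running product of x taken at every block boundary
    d_prev = d
    check_every = block * block
    for i in range(1, p + 1):
        x = x * x % n
        if i == flip_at:
            x ^= 1  # simulated bit flip
        if i % block == 0:
            d_prev, d = d, d * x % n
            if i % check_every == 0:
                # Invariant: d == x0 * d_prev^(2^block) mod n
                expected = x0 * pow(d_prev, 1 << block, n) % n
                if d != expected:
                    return x, i
    return x, None


print(prp_chain_with_check(127, 4))              # clean run
print(prp_chain_with_check(127, 4, flip_at=50))  # corrupted run
```

In this toy a bit flip at iteration 50 goes unnoticed until the next check at iteration 64, which is the same reason a real error only shows up at the next 160K boundary.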

SELROC 2018-07-18 06:54

[QUOTE=preda;492019]"block size" can only be changed when an exponent is started for the first time. Once ongoing, the block is fixed to completion.
[/QUOTE]

I see, so I should restart that exponent.

[QUOTE]If there are any errors, you should see EE lines, plus a number like "(3 errors)" on every line. But the check is done every 160K iterations, thus not very often. The time variations you see most likely come from tiny variations in iteration time, multiplied by the large number of remaining iterations.[/QUOTE]

No, I didn't see any errors, so maybe ETA fluctuations are normal and I shouldn't worry much about them.
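To put a number on "tiny variations multiplied by the remaining iterations", here is a rough sketch. It assumes the ETA is simply remaining iterations times the current ms/it (the function names are mine, not gpuOwl's) and uses the exponent 332412937 with the 16.41 and 16.48 ms/it figures from an RX 580 log line in this thread.

```python
def eta_seconds(total_iters, done_iters, ms_per_iter):
    """Naive ETA: remaining iterations times time per iteration."""
    return (total_iters - done_iters) * ms_per_iter / 1000.0


def format_eta(seconds):
    """Format seconds as 'Dd HH:MM', similar to the ETA field in the logs."""
    minutes = int(seconds // 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}d {hours:02d}:{mins:02d}"


total, done = 332412937, 800
fast = eta_seconds(total, done, 16.41)
slow = eta_seconds(total, done, 16.48)

print(format_eta(fast), format_eta(slow))
print(f"swing: {(slow - fast) / 3600:.1f} hours")
```

A difference of only 0.07 ms/it moves the naive ETA by several hours at this exponent size, with no retries involved.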

SELROC 2018-07-20 19:07

[QUOTE=preda;491952]Pushing the GPU fan up a bit:
[CODE]
amdgpu-pci-6700
Adapter: PCI adapter
vddgfx: +1.06 V
fan1: 3276 RPM
temp1: +70.0°C (crit = +89.0°C, hyst = -273.1°C)
power1: 206.00 W (cap = 220.00 W)
[/CODE]
I get just under 9ms/it for "100M digits" exponents; Vega64, 205W, temperature 70C.
[CODE]
vega0 16570000/332193109 [ 4.99%], 8.95 ms/it [8.94, 8.95]; ETA 32d 16:31; f6b94760b829ddec
[/CODE]This is with the amdgpu-pro 18.20 driver. There is hope that ROCm may be a bit better still (when I can install it).[/QUOTE]

On Radeon RX580:

OK 2018-07-20 20:56:51 0 800/332412937 [ 0.00%], 16.44 ms/it [16.41, 16.48]; ETA 63d 06:15; d83282a03d036688 (check 9.09s) (saved)

SELROC 2018-07-21 09:22

[QUOTE=SELROC;492204]on Radeon RX580:

OK 2018-07-20 20:56:51 0 800/332412937 [ 0.00%], 16.44 ms/it [16.41, 16.48]; ETA 63d 06:15; d83282a03d036688 (check 9.09s) (saved)[/QUOTE]

OK 2018-07-21 11:19:09 1 800/76693901 [ 0.00%], 3.10 ms/it [3.09, 3.11]; ETA 2d 18:02; eac585079aeaa455 (check 1.80s) (saved)

SELROC 2018-07-22 05:55

Error:

gpuowl-OpenCL 3.4--mod
FFT 512K: Width 512 (64x8), Height 512 (64x8); 0.00 bits/word
Note: using long carry kernels
Ellesmere-36x1360-@a:0.0 Radeon RX 580 Series
OpenCL compilation in 952 ms, with " -DEXP=521u -DWIDTH=512u -DSMALL_HEIGHT=512u -DMIDDLE=1u -I. -cl-fast-relaxed-math "
[2018-07-21 17:23:47 CEST] PRP M(521), FFT 512K, 0.00 bits/word
openowl: LowGpu.h:67: ....... failed.
Aborted

preda 2018-07-22 11:33

[QUOTE=SELROC;492275]Error:
gpuowl-OpenCL 3.4--mod
FFT 512K: Width 512 (64x8), Height 512 (64x8); 0.00 bits/word
Note: using long carry kernels
Ellesmere-36x1360-@a:0.0 Radeon RX 580 Series
OpenCL compilation in 952 ms, with " -DEXP=521u -DWIDTH=512u -DSMALL_HEIGHT=512u -DMIDDLE=1u -I. -cl-fast-relaxed-math "
[2018-07-21 17:23:47 CEST] PRP M(521), FFT 512K, 0.00 bits/word
openowl: LowGpu.h:67: ....... failed.
Aborted[/QUOTE]

It seems you have an exponent of 521, which is too small (the hint is the 0.00 bits/word).
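A quick way to see where the 0.00 comes from, as a sketch: it assumes "bits/word" is just the exponent divided by the FFT size in words, and that "512K" means 512 * 1024 words. The helper name is made up for illustration.

```python
def bits_per_word(exponent, fft_words):
    """Average number of bits of the Mersenne number stored per FFT word."""
    return exponent / fft_words


FFT_512K = 512 * 1024  # assumption: "512K" means 512 * 1024 words

for exp in (521, 1257787):
    print(f"M({exp}): {bits_per_word(exp, FFT_512K):.2f} bits/word")
```

With 521 bits spread over 524288 words, almost every word holds no bits at all, whereas a larger exponent such as 1257787 gives about 2.40 bits/word at the same FFT size.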

SELROC 2018-07-22 11:48

[QUOTE=preda;492288]it seems you have an exponent of 521, which is too small. (hint in the 0.00 bits/word)[/QUOTE]


Yes. I wanted to see what the output looks like for a known prime, but I think I chose one that is too small. So, just for testing, what is the smallest exponent that works?

SELROC 2018-07-22 12:47

[QUOTE=preda;492288]it seems you have an exponent of 521, which is too small. (hint in the 0.00 bits/word)[/QUOTE]


Thank you. I tested the known Mersenne prime exponent 1257787:

[url]https://www.mersenne.org/report_exponent/?exp_lo=1257787[/url]

***

gpuowl-OpenCL 3.4--mod
FFT 512K: Width 512 (64x8), Height 512 (64x8); 2.40 bits/word
Note: using long carry kernels
Ellesmere-36x1360-@a:0.0 Radeon RX 580 Series
OpenCL compilation in 951 ms, with " -DEXP=1257787u -DWIDTH=512u -DSMALL_HEIGHT=512u -DMIDDLE=1u -I. -cl-fast-relaxed-math "
[2018-07-22 13:58:59 CEST] PRP M(1257787), FFT 512K, 2.40 bits/word
OK loaded: 0/1257787, blockSize 400, 0000000000000003
OK initial check: 0000000000000003
OK 2018-07-22 13:59:04 5 800/1257787 [ 0.06%], 2.14 ms/it [2.14, 2.15]; ETA 0d 00:45; a0cb6d4276e3bf46 (check 0.93s) (saved)
OK 2018-07-22 14:04:47 5 160000/1257787 [12.72%], 2.15 ms/it [2.14, 2.15]; ETA 0d 00:39; 940b346c388557b6 (check 0.93s) (saved)
OK 2018-07-22 14:10:31 5 320000/1257787 [25.44%], 2.14 ms/it [2.14, 2.14]; ETA 0d 00:34; 5fd48808861cb91c (check 0.93s) (saved)
OK 2018-07-22 14:16:16 5 480000/1257787 [38.16%], 2.15 ms/it [2.14, 2.15]; ETA 0d 00:28; a3bb5fbfc26254b3 (check 0.93s) (saved)
OK 2018-07-22 14:22:01 5 640000/1257787 [50.87%], 2.15 ms/it [2.15, 2.15]; ETA 0d 00:22; b9c3ee7234803894 (check 0.93s) (saved)
OK 2018-07-22 14:27:46 5 800000/1257787 [63.59%], 2.15 ms/it [2.15, 2.16]; ETA 0d 00:16; 27e2b993451c6138 (check 0.93s) (saved)
OK 2018-07-22 14:33:31 5 960000/1257787 [76.31%], 2.15 ms/it [2.15, 2.15]; ETA 0d 00:11; 609355f6740c820a (check 0.93s) (saved)
OK 2018-07-22 14:39:16 5 1120000/1257787 [89.03%], 2.15 ms/it [2.15, 2.15]; ETA 0d 00:05; 5a404a29545347a7 (check 0.93s) (saved)
PP 1257787 / 1257787, 0000000000000001 (raw 0000000000000009)
OK 2018-07-22 14:44:13 5 1258000/1257787 [100.00%], 2.15 ms/it [2.15, 2.23]; ETA 0d 00:00; f4d273818ecfa167 (check 0.86s)
{"exponent":1257787, "worktype":"PRP-3", "status":"P", "residue-type":1, "fft-length":"512K", "res64":"0000000000000001", "program":{"name":"gpuowl", "version":"3.4--mod-OpenCL"}, "timestamp":"2018-07-22 12:44:13 UTC", "errors":{"gerbicz":0}, "user":"selroc", "computer":"5"}

Bye
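Since the final result line is plain JSON, it is easy to read back with a script. A minimal sketch using only the standard library follows; the interpretation of "status":"P" as "probable prime" is my assumption, based on this being a known Mersenne prime.

```python
import json

# The result line exactly as printed in the log.
result_line = (
    '{"exponent":1257787, "worktype":"PRP-3", "status":"P", '
    '"residue-type":1, "fft-length":"512K", "res64":"0000000000000001", '
    '"program":{"name":"gpuowl", "version":"3.4--mod-OpenCL"}, '
    '"timestamp":"2018-07-22 12:44:13 UTC", "errors":{"gerbicz":0}, '
    '"user":"selroc", "computer":"5"}'
)

result = json.loads(result_line)

# Assumption: status "P" marks a probable prime.
is_probable_prime = result["status"] == "P"
gerbicz_errors = result["errors"]["gerbicz"]

print(f'M({result["exponent"]}): probable prime = {is_probable_prime}, '
      f'res64 = {result["res64"]}, Gerbicz errors = {gerbicz_errors}')
```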

ET_ 2018-07-22 13:14

[QUOTE=SELROC;492291]Thank you, tested known mersenne prime 1257787
[...]
PP 1257787 / 1257787, 0000000000000001 (raw 0000000000000009)
[...]
{"exponent":1257787, "worktype":"PRP-3", "status":"P", "residue-type":1, "fft-length":"512K", "res64":"0000000000000001", "program":{"name":"gpuowl", "version":"3.4--mod-OpenCL"}, "timestamp":"2018-07-22 12:44:13 UTC", "errors":{"gerbicz":0}, "user":"selroc", "computer":"5"}[/QUOTE]

Shouldn't RES64 be 0?

axn 2018-07-22 13:19

[QUOTE=ET_;492293]Shouldn't RES64 be 0?[/QUOTE]

For LL, yes. For PRP, no.
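A small sketch of the difference, on an exponent small enough for plain Python integers (521, itself a Mersenne prime exponent). The function names are mine, and reading the log's "raw" value as 3^(2^p) mod Mp and the reported residue as 3^(Mp-1) mod Mp is an assumption, though it matches the "0000000000000001 (raw 0000000000000009)" line.

```python
def ll_residue(p):
    """Lucas-Lehmer: s = 4, then s -> s^2 - 2 (mod 2^p - 1), p - 2 times."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s


def prp3_residues(p):
    """Base-3 PRP: return (3^(Mp-1) mod Mp, 3^(2^p) mod Mp)."""
    m = (1 << p) - 1
    fermat = pow(3, m - 1, m)  # 1 when Mp is a probable prime
    raw = pow(3, 1 << p, m)    # 3^(Mp+1), i.e. 9 when Mp is a probable prime
    return fermat, raw


print(ll_residue(521))     # LL residue of a Mersenne prime
print(prp3_residues(521))  # PRP residues of the same number
```

For a Mersenne prime the LL residue is 0, while the base-3 PRP residue is 1 (or 9 for the raw 3^(2^p) form), so a nonzero res64 is expected for a PRP result.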

preda 2018-07-27 11:13

On Ubuntu 18.04 (kernel 4.15), I finally managed to install ROCm 1.8.2 (previously I was using amdgpu-pro 18.20). The performance for "100M digits" exponents improved from 9.17 ms/it to 8.36 ms/it (on an otherwise identical setup).
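As a back-of-the-envelope sketch of what that difference means for a full test: this assumes a PRP test takes roughly as many iterations as the exponent and uses 332193109 (a "100M digits" exponent seen earlier in the thread) as the example. The helper name is hypothetical.

```python
def total_days(iterations, ms_per_iter):
    """Wall-clock days for a run at a constant time per iteration."""
    return iterations * ms_per_iter / 1000.0 / 86400.0


EXPONENT = 332193109         # example "100M digits" exponent
OLD_MS, NEW_MS = 9.17, 8.36  # amdgpu-pro 18.20 vs ROCm 1.8.2

old_days = total_days(EXPONENT, OLD_MS)
new_days = total_days(EXPONENT, NEW_MS)
speedup = OLD_MS / NEW_MS

print(f"{old_days:.1f} days -> {new_days:.1f} days "
      f"({speedup:.3f}x, saves {old_days - new_days:.1f} days)")
```

Under those assumptions the driver change is worth roughly three days per test.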

A warning for others trying ROCm 1.8.2: there may be new errors when running wavefront exponents (80M). If there are, they are detected correctly and the result is still reliable. I need to keep an eye on these; I'm not sure yet who's to blame (me or ROCm).

