mersenneforum.org

gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

preda 2020-05-22 03:20

[QUOTE=kriesel;546172]So, totals,
[CODE]677277  130312  101680    2181
83.86%  16.14%  97.90%   2.10%
    LL     PRP   LL-DC  PRP-DC[/CODE]Not enough PRP-DC for statistics yet.[/QUOTE]

I count about 3000 PRP-DC, that's enough for some statistics. What is the error count in those PRP-DC?

kriesel 2020-05-22 03:36

[QUOTE=Prime95;546145]
Can you try getting past the bad iteration with "ULTRA_TRIG=1"?
Can you get past with "MM_CHAIN=3"?[/QUOTE]
MM_CHAIN=3 not tried; ULTRA_TRIG=1 got past both of the 154M stops seen earlier with the 8M FFT.

[CODE]2020-05-21 22:16:45 gpuowl v6.11-288-g20c4213
2020-05-21 22:16:45 config: -user kriesel -cpu asr2/radeonvii2 -d 2 -use NO_ASM,ULTRA_TRIG=1 -maxAlloc 15000
2020-05-21 22:16:45 device 2, unique id ''
2020-05-21 22:16:45 asr2/radeonvii2 154155713 FFT: 8M 1K:8:512 (18.38 bpw)
2020-05-21 22:16:45 asr2/radeonvii2 Expected maximum carry32: 70B40000
2020-05-21 22:16:49 asr2/radeonvii2 OpenCL args "-DEXP=154155713u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=8u -DWEIGHT_STEP=0xc.528658a63b438p-3 -DIWEIGHT_STEP=0xa.633af6ee9bb58p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DPM1=0 -DAMDGPU=1 -DCARRY64=1 -DMM_CHAIN=2u -DMM2_CHAIN=3u -DNO_ASM=1 -DULTRA_TRIG=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-05-21 22:16:59 asr2/radeonvii2 OpenCL compilation in 9.73 s
2020-05-21 22:17:00 asr2/radeonvii2 154155713 LL 87952000 loaded: 64a16470a62c62a6
2020-05-21 22:18:07 asr2/radeonvii2 154155713 LL 88000000 57.09%; 1404 us/it; ETA 1d 01:48; 503baed50ef43276
2020-05-21 22:21:34 asr2/radeonvii2 154155713 LL 88100000 57.15%; [U]2065[/U] us/it; ETA 1d 13:53; 109820c89c99e3ce
2020-05-21 22:21:34 asr2/radeonvii2 154155713 OK 88000000 (jacobi == -1)
2020-05-21 22:23:54 asr2/radeonvii2 154155713 LL 88200000 57.21%; 1403 us/it; ETA 1d 01:42; 6669371a0075ba93
2020-05-21 22:26:14 asr2/radeonvii2 154155713 LL 88300000 57.28%; 1402 us/it; ETA 1d 01:38; 4245f339fae6d84a
2020-05-21 22:27:13 asr2/radeonvii2 Stopping, please wait..
2020-05-21 22:27:13 asr2/radeonvii2 154155713 LL 88342000 57.31%; 1406 us/it; ETA 1d 01:42; ad659b2e1c5725b0
2020-05-21 22:27:13 asr2/radeonvii2 waiting for the Jacobi check to finish..
2020-05-21 22:29:29 asr2/radeonvii2 154155713 OK 88342000 (jacobi == -1)
2020-05-21 22:29:29 asr2/radeonvii2 Exiting because "stop requested"
2020-05-21 22:29:29 asr2/radeonvii2 Bye[/CODE]The underlined timing is anomalous and due to my scrolling the console window.
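
The progress lines in the log above follow a regular layout, so an anomalous us/it figure can be picked out mechanically rather than by eye. Here is a minimal Python sketch, assuming only the line format visible in this post. The regex and the helper names `parse_progress`, `flag_slow` and `eta_hours` are illustrative and not part of gpuowl. It also recomputes the ETA from the remaining iterations as a sanity check.

```python
import re
from statistics import median

# Matches progress lines such as:
# "... 154155713 LL 88000000 57.09%; 1404 us/it; ETA 1d 01:48; 503baed50ef43276"
# The "ETA <days> <hh:mm>" layout is assumed from the lines in this thread.
PROGRESS = re.compile(
    r"(?P<exp>\d+) (?:LL|OK|EE)\s+(?P<iter>\d+)\s+[\d.]+%; "
    r"(?P<us>\d+) us/it; ETA \S+ \S+; (?P<res>[0-9a-f]{16})"
)

def strip_bbcode(line):
    """Remove forum markup such as [U]...[/U] before parsing."""
    return re.sub(r"\[/?[A-Za-z]+\]", "", line)

def parse_progress(lines):
    """Return (exponent, iteration, us_per_it, res64) for each progress line."""
    rows = []
    for line in lines:
        m = PROGRESS.search(strip_bbcode(line))
        if m:
            rows.append((int(m["exp"]), int(m["iter"]), int(m["us"]), m["res"]))
    return rows

def flag_slow(rows, tolerance=1.2):
    """Iterations whose us/it exceeds tolerance times the median."""
    typical = median(us for _, _, us, _ in rows)
    return [it for _, it, us, _ in rows if us > tolerance * typical]

def eta_hours(exponent, iteration, us_per_it):
    """Remaining time in hours at the given speed."""
    return (exponent - iteration) * us_per_it / 1e6 / 3600

sample = [
    "2020-05-21 22:18:07 asr2/radeonvii2 154155713 LL 88000000 57.09%; 1404 us/it; ETA 1d 01:48; 503baed50ef43276",
    "2020-05-21 22:21:34 asr2/radeonvii2 154155713 LL 88100000 57.15%; [U]2065[/U] us/it; ETA 1d 13:53; 109820c89c99e3ce",
    "2020-05-21 22:23:54 asr2/radeonvii2 154155713 LL 88200000 57.21%; 1403 us/it; ETA 1d 01:42; 6669371a0075ba93",
]
rows = parse_progress(sample)
print(flag_slow(rows))
print(round(eta_hours(*rows[0][:3]), 1))
```

On these three lines only the 88100000 line is flagged, and the recomputed ETA for the first line is about 25.8 hours, which agrees with the logged "1d 01:48".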

ATH 2020-05-22 03:49

I can find 19 "PRP Bad" which is a lot more than I thought, but it is only 4 different users:

[M]77230663[/M]
[M]78410041[/M]
[M]79062083[/M]
[M]79075979[/M]
[M]79078529[/M]
[M]79087627[/M]
[M]79109357[/M]
[M]79143199[/M]
[M]81062081[/M]
[M]81069221[/M]
[M]81085727[/M]
[M]81705373[/M]
[M]82054421[/M]
[M]82561769[/M]
[M]82821223[/M]
[M]83018863[/M]
[M]83404301[/M]
[M]83509891[/M]
[M]86670349[/M]


First-time PRP tests are called "[B]PRP Unverified (Reliable)[/B]". Does anyone know what they are called if they are not reliable? "PRP Unverified (Unreliable)"? "PRP Unverified (Suspect)"? I cannot find any variant other than Reliable for first-time PRP tests.


Edit: I found "[B]PRP Unverified[/B]" without any parentheses. Does that mean they are not reliable? I hope not, because there are thousands of those; I guess it is an older way of writing the result?

kriesel 2020-05-22 04:14

[QUOTE=preda;546174]I count about 3000 PRP-DC, that's enough for some statistics. What is the error count in those PRP-DC?[/QUOTE]
Totals I posted are from a spreadsheet summation of what ATH posted, 50M thru 110M exponents. Copy/paste; create sums. Copy/paste results to thread.

19 / 2181 from ATH's posts is a much higher PRP error rate than expected.
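
For reference, the arithmetic behind these figures can be reproduced in a few lines. This is only a sketch in Python using the totals posted above. Whether 2181 is the right denominator is itself under discussion in this thread, so the resulting rate should be read as provisional.

```python
# Totals from the spreadsheet summation posted above (exponents 50M through 110M).
first_time = {"LL": 677277, "PRP": 130312}
double_check = {"LL-DC": 101680, "PRP-DC": 2181}
bad_prp = 19  # "PRP Bad" results listed by ATH

def shares(counts):
    """Percentage share of each category within its group."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}

bad_rate = 100 * bad_prp / double_check["PRP-DC"]

print(shares(first_time))
print(shares(double_check))
print(f"bad PRP rate: {bad_rate:.2f}%")
```

The shares come out at 83.86% / 16.14% for first-time tests and 97.90% / 2.10% for double checks, matching the table, and 19 out of 2181 is a little under 0.9%.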

This one, [M]655685803[/M], was reported with an error count of 20 from a flaky Radeon VII running gpuowl, and it is called "Unverified (Reliable)".

preda 2020-05-22 07:26

[QUOTE=ATH;546176]I can find 19 "PRP Bad" which is a lot more than I thought, but it is only 4 different users:
[/QUOTE]

Thanks for the list!

It would be useful to have the software and version for all the bad PRP results (it would be good for the software to be displayed directly in the "exponent status" table). Right now I know that when offset!=0 the software is most likely mprime, and for sure not gpuowl. When offset==0 I don't know -- was there an early version of mprime/PRP that was producing offset==0? (Otherwise it's gpuowl.)

It seems to me that all bad results with offset!=0 originate from a single user, "Sir Rutherford J. Pinkerton III" (many of them). This raises the questions: did mprime generate the bad results, which version, and do we understand why it happened (i.e. was there a bug affecting that version)? Why are all of them from this single user (if it was a bug, why was nobody else affected)?

For the bad results with offset==0, I would like to know the software and version. Was it gpuowl producing those? That would be a bit of a surprise to me, but it is better to know if there's a bug than to stay in blissful ignorance.

Bad results with non-zero offset, *all* originating from Sir Rutherford J. Pinkerton III
[QUOTE]
86670349 2019-01-08
79143199 2019-01-24
79109357 2019-01-19
79087627 2019-01-18
79078529 2019-01-16
79075979 2019-01-15
79062083 2019-01-09
78410041 2019-02-09
[/QUOTE]

Bad results with zero offset:
[QUOTE]
83509891 2018-04-16 Milwizzle
83404301 2018-04-03 Milwizzle
83018863 2018-03-14 Milwizzle
82821223 2018-03-05 Milwizzle
82561769 2018-02-23 Milwizzle
82054421 2018-02-05 Milwizzle
81705373 2018-05-12 Milwizzle
81085727 2018-08-22 George Woltman
81069221 2018-08-20 George Woltman
81062081 2018-08-20 George Woltman
77230663 2019-09-29 nokno
[/QUOTE]
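
The two lists above are small enough to tally by hand, but a short script makes the grouping explicit. This is only a sketch in Python: the tuples are copied from the lists in this post, and `guess_software` is a hypothetical helper that encodes nothing more than the offset heuristic described above. It is not something the server reports.

```python
from collections import Counter

PINKERTON = "Sir Rutherford J. Pinkerton III"

# (exponent, date, user, offset_is_zero), copied from the two lists above.
bad = [
    (86670349, "2019-01-08", PINKERTON, False),
    (79143199, "2019-01-24", PINKERTON, False),
    (79109357, "2019-01-19", PINKERTON, False),
    (79087627, "2019-01-18", PINKERTON, False),
    (79078529, "2019-01-16", PINKERTON, False),
    (79075979, "2019-01-15", PINKERTON, False),
    (79062083, "2019-01-09", PINKERTON, False),
    (78410041, "2019-02-09", PINKERTON, False),
    (83509891, "2018-04-16", "Milwizzle", True),
    (83404301, "2018-04-03", "Milwizzle", True),
    (83018863, "2018-03-14", "Milwizzle", True),
    (82821223, "2018-03-05", "Milwizzle", True),
    (82561769, "2018-02-23", "Milwizzle", True),
    (82054421, "2018-02-05", "Milwizzle", True),
    (81705373, "2018-05-12", "Milwizzle", True),
    (81085727, "2018-08-22", "George Woltman", True),
    (81069221, "2018-08-20", "George Woltman", True),
    (81062081, "2018-08-20", "George Woltman", True),
    (77230663, "2019-09-29", "nokno", True),
]

def guess_software(offset_is_zero):
    """Heuristic from this post only; the real software is not known here."""
    return "unknown (gpuowl or early mprime?)" if offset_is_zero else "mprime (likely)"

by_user = Counter(user for _, _, user, _ in bad)
by_guess = Counter(guess_software(zero) for _, _, _, zero in bad)

def reported_since(cutoff):
    """Exponents whose bad result is dated on or after cutoff (ISO date string)."""
    return [exp for exp, date, _, _ in bad if date >= cutoff]

print(by_user.most_common())
print(by_guess)
print(reported_since("2019-03-01"))
```

ISO dates compare correctly as plain strings, which is why `reported_since` needs no date parsing.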

R. Gerbicz 2020-05-22 07:28

[QUOTE=ATH;546176]I can find 19 "PRP Bad" which is a lot more than I thought, but it is only 4 different users:
[/QUOTE]

Look also at the date of each bad(!) residue: as far as I can see, only one of them came in March 2019 or later: [url]https://www.mersenne.org/report_exponent/?exp_lo=77230663&full=1[/url] .

And what happened in February: [url]https://mersenneforum.org/showthread.php?p=508163#post508163[/url]
With a proper implementation in which all iterations are error checked, you should never see an error.

PhilF 2020-05-22 14:11

[QUOTE=ATH;546176]Edit: I found "[B]PRP Unverified[/B]" without any parentheses. Does that mean they are not reliable? I hope not, because there are thousands of those; I guess it is an older way of writing the result?[/QUOTE]

I have noticed my older CPU-based PRP tests are listed as (Reliable). But my more recent GPU-based PRP tests (using gpuOwl and a shift of 0), which were submitted manually, are simply listed as Unverified.

kriesel 2020-05-22 14:26

Any ideas what's up with this one? It used to run fine in v6.11-257 with the 5M FFT:[CODE]
2020-04-21 16:06:06 roa/rx550 94955227 OK 45000000 47.39%; 14196 us/it; ETA 8d 04:59; cf193f810b74508f (check 5.90s)
2020-04-21 16:53:31 roa/rx550 94955227 OK 45200000 47.60%; 14195 us/it; ETA 8d 04:11; 5e5d2f79fcd4c510 (check 5.90s)
2020-04-21 17:40:56 roa/rx550 94955227 OK 45400000 47.81%; 14195 us/it; ETA 8d 03:24; d5fd846bbdc9b33b (check 5.92s)
2020-04-21 18:28:21 roa/rx550 94955227 OK 45600000 48.02%; 14195 us/it; ETA 8d 02:37; 6ce4060b3179317f (check 5.90s)
2020-04-21 19:15:46 roa/rx550 94955227 OK 45800000 48.23%; 14195 us/it; ETA 8d 01:49; ae609bd1a5f3fea0 (check 5.92s)
2020-04-21 20:03:11 roa/rx550 94955227 OK 46000000 48.44%; 14195 us/it; ETA 8d 01:02; cc6f06c61df1792d (check 5.92s)
2020-04-21 20:50:36 roa/rx550 94955227 OK 46200000 48.65%; 14195 us/it; ETA 8d 00:15; bab3f41612444053 (check 5.90s)
2020-04-21 21:38:00 roa/rx550 94955227 OK 46400000 48.86%; 14194 us/it; ETA 7d 23:27; f02f6569aecceee9 (check 5.90s)
2020-04-21 22:25:25 roa/rx550 94955227 OK 46600000 49.08%; 14195 us/it; ETA 7d 22:41; 28ae28232765506f (check 5.91s)
2020-04-21 23:12:50 roa/rx550 94955227 OK 46800000 49.29%; 14194 us/it; ETA 7d 21:52; d9dd697f81777087 (check 5.92s)
2020-04-22 00:00:15 roa/rx550 94955227 OK 47000000 49.50%; 14195 us/it; ETA 7d 21:06; c89bd4f70a72fa7a (check 5.93s)
[/CODE]Maybe the RX550 2GB GPU got damaged somewhat? The system went back to the seller after failing to boot on April 22; two motherboard VRMs and the CPU were bad. Upon return after repair, still on v6.11-257:[CODE]2020-05-18 20:20:59 roa/rx550 94955227 FFT: 5M 1K:10:256 (18.11 bpw)
2020-05-18 20:20:59 roa/rx550 Expected maximum carry32: 48210000
2020-05-18 20:21:01 roa/rx550 OpenCL args "-DEXP=94955227u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xe.cff594044f17p-3 -DIWEIGHT_STEP=0x8.a4359b7df0158p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-05-18 20:21:12 roa/rx550 OpenCL compilation in 11.02 s
2020-05-18 20:21:20 roa/rx550 94955227 OK 47000000 loaded: blockSize 400, c89bd4f70a72fa7a
2020-05-18 20:21:37 roa/rx550 94955227 OK 47000800 49.50%; 14000 us/it; ETA 7d 18:29; dc7a27da26e7abc8 (check 5.83s)
2020-05-18 20:36:09 roa/rx550 Stopping, please wait..
2020-05-18 20:36:20 roa/rx550 94955227 EE 47063200 49.56%; 14070 us/it; ETA 7d 19:11; d7d7ac6572a3fc43 (check 5.83s)
2020-05-18 20:36:27 roa/rx550 94955227 OK 47000800 loaded: blockSize 400, dc7a27da26e7abc8
2020-05-18 20:36:27 roa/rx550 Exiting because "stop requested"
2020-05-18 20:36:27 roa/rx550 Bye
2020-05-18 20:39:01 config: -device 0 -user kriesel -cpu roa/rx550 -use NO_ASM
2020-05-18 20:39:01 config:
2020-05-18 20:39:01 config: ;UNROLL_HEIGHT,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT2,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_MIDDLE,CARRY32,ORIGINAL_METHOD,LESS_ACCURATE
2020-05-18 20:39:01 device 0, unique id ''
2020-05-18 20:39:01 roa/rx550 94955227 FFT: 5M 1K:10:256 (18.11 bpw)
2020-05-18 20:39:01 roa/rx550 Expected maximum carry32: 48210000
2020-05-18 20:39:03 roa/rx550 OpenCL args "-DEXP=94955227u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xe.cff594044f17p-3 -DIWEIGHT_STEP=0x8.a4359b7df0158p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-05-18 20:39:13 roa/rx550 OpenCL compilation in 9.63 s
2020-05-18 20:39:20 roa/rx550 94955227 EE 47000800 loaded: blockSize 400, 45c7eadb491ccd1d (expected dc7a27da26e7abc8)
2020-05-18 20:39:20 roa/rx550 Exiting because "error on load"
2020-05-18 20:39:20 roa/rx550 Bye[/CODE]Updated to v6.11-288, tried a larger FFT and ULTRA_TRIG=1; still trouble:[CODE]
2020-05-21 21:44:18 gpuowl v6.11-288-g20c4213
2020-05-21 21:44:18 config: -d 1 -user kriesel -cpu roa/rx550 -use NO_ASM,ULTRA_TRIG=1 -maxAlloc 1500 -fft 1K:11:256
2020-05-21 21:44:18 device 1, unique id ''
2020-05-21 21:44:18 roa/rx550 94955227 FFT: 5.50M 1K:11:256 (16.46 bpw)
2020-05-21 21:44:18 roa/rx550 Expected maximum carry32: 184C0000
2020-05-21 21:44:20 roa/rx550 OpenCL args "-DEXP=94955227u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xb.97dc04c1d123p-3 -DIWEIGHT_STEP=0xb.0a7bf824a91fp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -DULTRA_TRIG=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-05-21 21:44:29 roa/rx550 OpenCL compilation in 8.65 s
2020-05-21 21:44:37 roa/rx550 94955227 OK 51550000 loaded: blockSize 400, 9c860c40de33a4cf
2020-05-21 21:44:58 roa/rx550 94955227 OK 51550800 54.29%; 17109 us/it; ETA 8d 14:17; 0ce53d8d9ae59540 (check 7.07s) 24 errors
2020-05-21 21:59:05 roa/rx550 94955227 EE 51600000 54.34%; 17081 us/it; ETA 8d 13:42; f789a22e21e9407b (check 7.07s) 24 errors
2020-05-21 21:59:13 roa/rx550 94955227 OK 51550800 loaded: blockSize 400, 0ce53d8d9ae59540
2020-05-21 22:13:39 roa/rx550 94955227 OK 51600000 54.34%; 17458 us/it; ETA 8d 18:15; f789a22e21e9407b (check 7.07s) 25 errors
2020-05-21 22:28:00 roa/rx550 94955227 OK 51650000 54.39%; 17071 us/it; ETA 8d 13:21; 843af679abee43e9 (check 7.07s) 25 errors
2020-05-21 22:42:20 roa/rx550 94955227 OK 51700000 54.45%; 17071 us/it; ETA 8d 13:07; 5fe9ccd16267b99b (check 7.08s) 25 errors
2020-05-21 22:56:41 roa/rx550 94955227 OK 51750000 54.50%; 17071 us/it; ETA 8d 12:52; 208e56e41e0a40ee (check 7.09s) 25 errors
2020-05-21 23:11:02 roa/rx550 94955227 EE 51800000 54.55%; 17083 us/it; ETA 8d 12:47; 670a56bd99ed976f (check 7.05s) 25 errors
2020-05-21 23:11:10 roa/rx550 94955227 OK 51750000 loaded: blockSize 400, 208e56e41e0a40ee
2020-05-21 23:25:31 roa/rx550 94955227 EE 51800000 54.55%; 17072 us/it; ETA 8d 12:39; 670a56bd99ed976f (check 7.07s) 26 errors
2020-05-21 23:25:38 roa/rx550 94955227 OK 51750000 loaded: blockSize 400, 208e56e41e0a40ee
2020-05-21 23:39:59 roa/rx550 94955227 OK 51800000 54.55%; 17071 us/it; ETA 8d 12:39; 670a56bd99ed976f (check 7.10s) 27 errors
2020-05-21 23:54:20 roa/rx550 94955227 EE 51850000 54.60%; 17071 us/it; ETA 8d 12:24; a65c3c95ed2aec74 (check 7.06s) 27 errors
2020-05-21 23:54:27 roa/rx550 94955227 OK 51800000 loaded: blockSize 400, 670a56bd99ed976f
2020-05-22 00:08:48 roa/rx550 94955227 OK 51850000 54.60%; 17072 us/it; ETA 8d 12:25; a65c3c95ed2aec74 (check 7.07s) 28 errors
2020-05-22 00:23:09 roa/rx550 94955227 OK 51900000 54.66%; 17083 us/it; ETA 8d 12:19; 6b998180a02fdb4e (check 7.07s) 28 errors
2020-05-22 00:37:30 roa/rx550 94955227 OK 51950000 54.71%; 17071 us/it; ETA 8d 11:56; 156ed12844fefe6e (check 7.08s) 28 errors
2020-05-22 00:51:50 roa/rx550 94955227 OK 52000000 54.76%; 17071 us/it; ETA 8d 11:41; 53dfaf72c1f11375 (check 7.07s) 28 errors
2020-05-22 01:06:11 roa/rx550 94955227 EE 52050000 54.82%; 17071 us/it; ETA 8d 11:27; 5244891010efead5 (check 7.05s) 28 errors
2020-05-22 01:06:19 roa/rx550 94955227 OK 52000000 loaded: blockSize 400, 53dfaf72c1f11375
2020-05-22 01:20:39 roa/rx550 94955227 EE 52050000 54.82%; 17072 us/it; ETA 8d 11:28; 5244891010efead5 (check 7.05s) 29 errors
2020-05-22 01:20:47 roa/rx550 94955227 OK 52000000 loaded: blockSize 400, 53dfaf72c1f11375
2020-05-22 01:35:08 roa/rx550 94955227 OK 52050000 54.82%; 17072 us/it; ETA 8d 11:28; 5244891010efead5 (check 7.09s) 30 errors
2020-05-22 01:49:28 roa/rx550 94955227 EE 52100000 54.87%; 17071 us/it; ETA 8d 11:13; 9fa1556aabca1139 (check 7.05s) 30 errors
2020-05-22 01:49:36 roa/rx550 94955227 OK 52050000 loaded: blockSize 400, 5244891010efead5
2020-05-22 02:03:57 roa/rx550 94955227 OK 52100000 54.87%; 17071 us/it; ETA 8d 11:13; 9fa1556aabca1139 (check 7.09s) 31 errors
2020-05-22 02:18:18 roa/rx550 94955227 OK 52150000 54.92%; 17071 us/it; ETA 8d 10:59; 124a2bbd1257aa91 (check 7.07s) 31 errors
2020-05-22 02:32:40 roa/rx550 94955227 EE 52200000 54.97%; 17071 us/it; ETA 8d 10:45; af8a0d4f2c184d08 (check 7.05s) 31 errors
2020-05-22 02:32:48 roa/rx550 94955227 EE 52150000 loaded: blockSize 400, 10993d18dd3e57da (expected 124a2bbd1257aa91)
2020-05-22 02:32:48 roa/rx550 Exiting because "error on load"
2020-05-22 02:32:48 roa/rx550 Bye[/CODE]
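
To put the "larger fft" attempt in numbers, the bits-per-word figures in the logs can be recomputed from the exponent and the FFT spec. This is a minimal Python sketch. The factor of 2 in `fft_size` is an assumption inferred from the logged sizes (1K:10:256 is reported as 5M), not taken from the gpuowl source, and both function names are illustrative.

```python
def fft_size(spec):
    """Word count for a spec such as '1K:10:256' (width:middle:height)."""
    def parse(token):
        # "1K" means 1024; plain numbers are taken as-is.
        return int(token[:-1]) * 1024 if token.upper().endswith("K") else int(token)
    width, middle, height = (parse(t) for t in spec.split(":"))
    return 2 * width * middle * height  # factor 2 assumed from the logged sizes

def bits_per_word(exponent, spec):
    """Average number of exponent bits carried per FFT word."""
    return exponent / fft_size(spec)

for exponent, spec in [
    (94955227, "1K:10:256"),   # 5M FFT, logged as 18.11 bpw
    (94955227, "1K:11:256"),   # 5.50M FFT, logged as 16.46 bpw
    (154155713, "1K:8:512"),   # 8M FFT, logged as 18.38 bpw
]:
    print(spec, fft_size(spec), round(bits_per_word(exponent, spec), 2))
```

Going from 1K:10:256 to 1K:11:256 lowers the load from about 18.11 to about 16.46 bits per word, so the errors persisting at the larger FFT would seem to point away from a pure precision problem.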

kriesel 2020-05-22 15:47

gpuowl-win v6.11-292-gecab9ae build
 
2 Attachment(s)
Here it is, in the usual form.

ATH 2020-05-22 16:50

[QUOTE=PhilF;546205]I have noticed my older CPU-based PRP tests are listed as (Reliable). But my more recent GPU-based PRP tests (using gpuOwl and a shift of 0), which were submitted manually, are simply listed as Unverified.[/QUOTE]

Of course, manual submissions do not carry enough information to tell whether the results are reliable or not. Thanks.

Prime95 2020-05-22 17:27

[QUOTE=ATH;546219]Of course, manual submissions do not carry enough information to tell whether the results are reliable or not. Thanks.[/QUOTE]

The "(reliable)" indicates Gerbicz checking was used.

Issues:
1) The first prime95 Gerbicz implementation had a bug which could miss an error.
2) Gpuowl JSON was not reporting "Gerbicz:0" in some versions a few months ago.
3) If you report manually with the screen output rather than the results.json.txt file then the web page will not know that prime95 used Gerbicz checking.

Summary: The PRPDC data needs to be looked at very carefully. I'll look at it myself later today.
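
For readers unfamiliar with the check being discussed, here is a toy illustration of the idea behind Gerbicz checking. It is a simplified Python sketch on a tiny exponent, not the prime95 or gpuowl implementation. The function name, the `corrupt_at` fault injection and the choice to verify after every block are all assumptions made for illustration. Real programs verify far less often and roll back to a saved state instead of stopping.

```python
def prp_with_gerbicz(p, block=10, corrupt_at=None, base=3):
    """Squaring chain mod 2^p - 1 with a Gerbicz-style consistency check.

    Returns ("ok", residue) or ("error", iteration_where_detected).
    corrupt_at flips one bit of the residue at that iteration to
    simulate a hardware fault (hypothetical fault model).
    """
    n = (1 << p) - 1
    u = base   # u_i = base^(2^i) mod n
    d = base   # running product u_0 * u_block * u_(2*block) * ...
    for i in range(1, p + 1):
        u = u * u % n
        if i == corrupt_at:
            u ^= 1  # injected error
        if i % block == 0:
            d_prev, d = d, d * u % n
            # Identity: d_new == base * d_prev^(2^block) (mod n).
            # It holds whenever every squaring in the block was correct.
            if d != base * pow(d_prev, 1 << block, n) % n:
                return ("error", i)
    # Iterations after the last full block are not covered in this sketch.
    return ("ok", u)

print(prp_with_gerbicz(127))
print(prp_with_gerbicz(127, corrupt_at=55))
```

With p = 127 (a Mersenne prime exponent) the clean run ends at residue 9, since 3^(2^p) = 9 * 3^(N-1) for N = 2^p - 1, and the corrupted run is caught at the first block boundary after the fault.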


All times are UTC.