mersenneforum.org

gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

preda 2018-02-09 05:15

[QUOTE=preda;479585]GpuOwl has acquired a new FFT size: 5000K.

i.e. 625 * 4096 * 2

The performance at the new FFT size is about 50% slower compared to 4M FFT. As a data-point, on my Vega64 I get 2.55 ms/it.
[/QUOTE]
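
As a quick check of the size arithmetic quoted above, here is a minimal Python sketch. It is not part of gpuOwL, and the function names are made up for illustration. It assumes the word count is WIDTH * HEIGHT * 2 and that bits per word is simply the exponent divided by the word count, which agrees with the "bits/word" figures gpuOwL prints in the startup lines posted later in this thread.

```python
def fft_words(width: int, height: int) -> int:
    """Number of words for a WIDTH * HEIGHT * 2 layout (illustrative helper)."""
    return width * height * 2


def bits_per_word(exponent: int, words: int) -> float:
    """Average number of bits of the exponent carried by each word."""
    return exponent / words


words_5000k = fft_words(625, 4096)
# 625 * 4096 * 2 words, expressed in units of K = 1024 words
print(words_5000k, "words =", words_5000k // 1024, "K")

# Exponent reported in this thread as tested with the 5000K FFT
print(round(bits_per_word(78330167, words_5000k), 2), "bits/word")
```

The same formula gives 18.36 bits/word for 154000001 at 8M (2048 * 2048 * 2), matching the program's own output.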

Some perf tuning brings the time per iteration from 2.55 ms to 2.43 ms, an improvement of about 5%.

"gpuowl -time" can be used to get some profiling information for the kernels:
[QUOTE]
OK 310000 / 77936861 [ 0.40%], 2.69 ms/it [2.66, 2.74], check 2.19s; ETA 2d 09:57; 73cec4a2f7878ba4 [16:11:02]
39.4% autoConv : 1041 [ 1025, 1131] us/call x 10500 calls
34.7% carryFused : 919 [ 912, 953] us/call x 10479 calls
14.2% transposeH : 375 [ 369, 383] us/call x 10521 calls
11.3% transposeW : 298 [ 293, 322] us/call x 10542 calls
[/QUOTE]

As you see, in a normal "iteration" of the PRP test there are just 4 kernels invoked. Two of them do "real work" and take ~75% of the time, and two just transpose the data (and also multiply it with some factors) for 25% of iteration time.
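
For anyone who wants to tally these numbers from their own "-time" output, here is a small Python sketch. The regular expression is only fitted to the four lines pasted above, so treat the line format as an assumption rather than a documented interface; the function name is hypothetical.

```python
import re

PROFILE = """\
39.4% autoConv : 1041 [ 1025, 1131] us/call x 10500 calls
34.7% carryFused : 919 [ 912, 953] us/call x 10479 calls
14.2% transposeH : 375 [ 369, 383] us/call x 10521 calls
11.3% transposeW : 298 [ 293, 322] us/call x 10542 calls
"""

# share%, kernel name, mean us/call, [min, max], call count
LINE = re.compile(
    r"\s*([\d.]+)%\s+(\w+)\s*:\s*(\d+)\s*\[.*?\]\s*us/call x (\d+) calls"
)


def parse_profile(text):
    """Return {kernel: (share_percent, mean_us, calls)} for matching lines."""
    rows = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            share, name, mean_us, calls = m.groups()
            rows[name] = (float(share), int(mean_us), int(calls))
    return rows


rows = parse_profile(PROFILE)
transpose_share = sum(s for n, (s, _, _) in rows.items() if n.startswith("transpose"))
work_share = sum(s for n, (s, _, _) in rows.items() if not n.startswith("transpose"))
kernel_us_per_iteration = sum(mean for _, mean, _ in rows.values())

print(f"work kernels: {work_share:.1f}%, transposes: {transpose_share:.1f}%")
print(f"sum of mean kernel times: {kernel_us_per_iteration} us per iteration")
```

The sum of the four mean kernel times comes out slightly below the reported ms/it figure; the remainder is presumably time spent outside these kernels.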

GP2 2018-02-09 21:59

At what threshold does gpuOwL start using the new 5K FFT size?

Did you use it for [M]M77936861[/M]? I notice you seem to have a self-double-check assigned for this exponent, and same for several exponents in the 77.94M range.

I can double-check some 5K FFT results using mprime. The automated DC assignments are still in the 76.6M range, and were presumably done with 4K FFT.

preda 2018-02-10 01:45

[QUOTE=GP2;479680]At what threshold does gpuOwL start using the new 5K FFT size?

Did you use it for [M]M77936861[/M]? I notice you seem to have a self-double-check assigned for this exponent, and same for several exponents in the 77.94M range.

I can double-check some 5K FFT results using mprime. The automated DC assignments are still in the 76.6M range, and were presumably done with 4K FFT.[/QUOTE]

All the existing results are 4M FFT. From now on I'll be using 5000K. The assignment you see is not a self-DC, but a bug where manual PRP assignments are not released on manual submit. I reported this bug already.

I'll post my first 5000K result here for DC, thanks!

preda 2018-02-10 06:55

[QUOTE=preda;479693]
I'll post my first 5000K result here for DC, thanks![/QUOTE]

This was done with the new 5000K FFT: 78330167
[url]https://www.mersenne.org/report_exponent/?exp_lo=78330167&full=1[/url]

kriesel 2018-02-10 16:01

looping on error
 
It would be useful if gpuOwL gave up on an exponent that keeps failing. When it encounters errors detected by the Gerbicz check, reduces its check interval down to 1000 iterations, and still gets errors, then after some limiting count of successive repeated errors for the same exponent, starting iteration, and minimum iteration interval, it should quit the current exponent and go on to the next entry in the worktodo file. The current behavior in V1.9 seems to be continual retry of the iterations that produced an error, apparently indefinitely (or at least longer than this user's patience for it). Putting the maximum successive identical retry count under the user's control would be ideal.

[CODE]OK 300000 / 154000001 [ 0.19%], 27.03 ms/it [27.02, 27.30] CV 0.1%, check 16.50s; ETA 48d 02:14; 27458918ebc4797f [07:31:05]
EE 350000 / 154000001 [ 0.23%], 27.00 ms/it [26.99, 27.42] CV 0.2%, check 16.54s; ETA 48d 00:23; 0ea3c0d18b7bb6a5 [07:53:51]
OK 320000 / 154000001 [ 0.21%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.49s; ETA 48d 02:14; 1b0eb65bd9e0c4bf [08:03:08] (1 errors)
EE 340000 / 154000001 [ 0.22%], 27.01 ms/it [26.96, 27.27] CV 0.3%, check 16.21s; ETA 48d 00:48; 220e3b25ea454b27 [08:12:25] (1 errors)
OK 330000 / 154000001 [ 0.21%], 27.03 ms/it [27.02, 27.08] CV 0.1%, check 16.50s; ETA 48d 01:59; da6fa4d784199dcc [08:17:12] (2 errors)
EE 340000 / 154000001 [ 0.22%], 27.02 ms/it [27.02, 27.08] CV 0.1%, check 16.44s; ETA 48d 01:26; 220e3b25ea454b27 [08:21:58] (2 errors)
EE 340000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.58s; ETA 48d 02:06; 220e3b25ea454b27 [08:26:45] (3 errors)
EE 340000 / 154000001 [ 0.22%], 27.06 ms/it [27.02, 27.52] CV 0.4%, check 16.60s; ETA 48d 03:02; 220e3b25ea454b27 [08:31:32] (4 errors)
EE 340000 / 154000001 [ 0.22%], 27.03 ms/it [27.02, 27.11] CV 0.1%, check 16.36s; ETA 48d 01:50; 220e3b25ea454b27 [08:36:19] (5 errors)
EE 340000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.60s; ETA 48d 01:58; 220e3b25ea454b27 [08:41:06] (6 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.11] CV 0.1%, check 16.33s; ETA 48d 02:04; 5b972f21b3814a0b [08:43:38] (7 errors)
EE 335000 / 154000001 [ 0.22%], 27.05 ms/it [27.02, 27.11] CV 0.1%, check 16.36s; ETA 48d 02:29; 5b972f21b3814a0b [08:46:09] (8 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.11] CV 0.1%, check 16.27s; ETA 48d 02:04; 5b972f21b3814a0b [08:48:41] (9 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.25s; ETA 48d 02:13; 5b972f21b3814a0b [08:51:12] (10 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.11] CV 0.1%, check 16.25s; ETA 48d 02:21; 5b972f21b3814a0b [08:53:44] (11 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.43s; ETA 48d 02:04; 5b972f21b3814a0b [08:56:15] (12 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.46s; ETA 48d 02:13; 5b972f21b3814a0b [08:58:47] (13 errors)
EE 335000 / 154000001 [ 0.22%], 27.07 ms/it [27.02, 27.27] CV 0.3%, check 16.58s; ETA 48d 03:17; 5b972f21b3814a0b [09:01:19] (14 errors)
EE 335000 / 154000001 [ 0.22%], 27.00 ms/it [26.96, 27.08] CV 0.1%, check 16.65s; ETA 48d 00:22; 5b972f21b3814a0b [09:03:50] (15 errors)
EE 335000 / 154000001 [ 0.22%], 27.00 ms/it [26.99, 27.08] CV 0.1%, check 16.40s; ETA 48d 00:30; 5b972f21b3814a0b [09:06:22] (16 errors)
EE 335000 / 154000001 [ 0.22%], 27.03 ms/it [27.02, 27.08] CV 0.1%, check 16.25s; ETA 48d 01:48; 5b972f21b3814a0b [09:08:53] (17 errors)
EE 335000 / 154000001 [ 0.22%], 27.03 ms/it [27.02, 27.08] CV 0.1%, check 16.60s; ETA 48d 01:48; 5b972f21b3814a0b [09:11:25] (18 errors)
EE 335000 / 154000001 [ 0.22%], 27.07 ms/it [27.02, 27.30] CV 0.3%, check 16.22s; ETA 48d 03:17; 5b972f21b3814a0b [09:13:57] (19 errors)
EE 335000 / 154000001 [ 0.22%], 26.99 ms/it [26.96, 27.02] CV 0.1%, check 16.36s; ETA 47d 23:58; 5b972f21b3814a0b [09:16:28] (20 errors)
EE 335000 / 154000001 [ 0.22%], 27.00 ms/it [26.99, 27.05] CV 0.1%, check 16.29s; ETA 48d 00:23; 5b972f21b3814a0b [09:18:59] (21 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.47s; ETA 48d 02:13; 5b972f21b3814a0b [09:21:31] (22 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.44s; ETA 48d 02:04; 5b972f21b3814a0b [09:24:02] (23 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.72s; ETA 48d 02:04; 5b972f21b3814a0b [09:26:34] (24 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.27s; ETA 48d 02:04; 5b972f21b3814a0b [09:29:06] (25 errors)
EE 335000 / 154000001 [ 0.22%], 26.99 ms/it [26.99, 27.05] CV 0.1%, check 16.41s; ETA 48d 00:15; 5b972f21b3814a0b [09:31:37] (26 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.11] CV 0.1%, check 16.47s; ETA 48d 02:21; 5b972f21b3814a0b [09:34:09] (27 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.11] CV 0.1%, check 16.65s; ETA 48d 02:13; 5b972f21b3814a0b [09:36:41] (28 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.36s; ETA 48d 02:21; 5b972f21b3814a0b [09:39:12] (29 errors)
EE 335000 / 154000001 [ 0.22%], 27.00 ms/it [26.99, 27.08] CV 0.1%, check 16.58s; ETA 48d 00:30; 5b972f21b3814a0b [09:41:44] (30 errors)
EE 335000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.11] CV 0.1%, check 16.33s; ETA 48d 02:13; 5b972f21b3814a0b [09:44:15] (31 errors)
OK 332000 / 154000001 [ 0.22%], 27.02 ms/it [26.99, 27.08] CV 0.2%, check 16.49s; ETA 48d 01:18; 59bcd6a5dccc40c7 [09:45:26] (32 errors)
EE 335000 / 154000001 [ 0.22%], 27.01 ms/it [26.99, 27.05] CV 0.1%, check 16.65s; ETA 48d 00:51; 5b972f21b3814a0b [09:47:04] (32 errors)
EE 334000 / 154000001 [ 0.22%], 27.06 ms/it [27.02, 27.11] CV 0.1%, check 16.38s; ETA 48d 02:57; d6030fba97b08960 [09:48:14] (33 errors)
EE 334000 / 154000001 [ 0.22%], 27.05 ms/it [27.02, 27.11] CV 0.2%, check 16.19s; ETA 48d 02:37; d6030fba97b08960 [09:49:24] (34 errors)
EE 334000 / 154000001 [ 0.22%], 27.05 ms/it [27.02, 27.11] CV 0.2%, check 16.54s; ETA 48d 02:37; d6030fba97b08960 [09:50:35] (35 errors)
EE 334000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.18s; ETA 48d 02:16; d6030fba97b08960 [09:51:45] (36 errors)
EE 334000 / 154000001 [ 0.22%], 27.05 ms/it [27.02, 27.11] CV 0.2%, check 16.54s; ETA 48d 02:37; d6030fba97b08960 [09:52:56] (37 errors)
EE 334000 / 154000001 [ 0.22%], 27.05 ms/it [27.02, 27.11] CV 0.2%, check 16.66s; ETA 48d 02:37; d6030fba97b08960 [09:54:07] (38 errors)
EE 334000 / 154000001 [ 0.22%], 27.04 ms/it [27.02, 27.08] CV 0.1%, check 16.58s; ETA 48d 02:16; d6030fba97b08960 [09:55:17] (39 errors)
Stopping, please wait..
OK 332500 / 154000001 [ 0.22%], 27.08 ms/it; ETA 48d 03:56; e8cb32be9a443cbe [09:55:47] (40 errors)

Bye[/CODE]
It also seems to get stuck at interval sizes larger than 1000. A manual stop and restart is a workaround for that. Sometimes it takes multiple restarts.

[CODE]
gpuOwL v1.9- GPU Mersenne primality checker
Radeon 500 Series 8 @f:0.0, gfx804 1203MHz

OpenCL compilation in 2531 ms, with "-I. -cl-fast-relaxed-math -cl-std=CL2.0 -DEXP=154000001u -DWIDTH=2048u -DHEIGHT=2048u -DLOG_NWORDS=23u -DFP_DP=1 -save-temps=df/DP_8M"
PRP-3: FFT 8M (2048 * 2048 * 2) of 154000001 (18.36 bits/word) [2018-01-31 10:00:10 Central Standard Time]
Starting at iteration 332500
OK 332500 / 154000001 [ 0.22%], 0.00 ms/it; ETA 0d 00:00; e8cb32be9a443cbe [10:00:28] (40 errors)
EE 333000 / 154000001 [ 0.22%], 27.30 ms/it; ETA 48d 13:24; c7b42ff699a5423f [10:00:59] (40 errors)
EE 333000 / 154000001 [ 0.22%], 27.26 ms/it; ETA 48d 11:37; c7b42ff699a5423f [10:01:30] (41 errors)
EE 333000 / 154000001 [ 0.22%], 27.32 ms/it; ETA 48d 14:00; c7b42ff699a5423f [10:02:00] (42 errors)
EE 333000 / 154000001 [ 0.22%], 27.31 ms/it; ETA 48d 13:50; c7b42ff699a5423f [10:02:31] (43 errors)
EE 333000 / 154000001 [ 0.22%], 27.29 ms/it; ETA 48d 12:58; c7b42ff699a5423f [10:03:02] (44 errors)
EE 333000 / 154000001 [ 0.22%], 27.31 ms/it; ETA 48d 13:39; c7b42ff699a5423f [10:03:32] (45 errors)[/CODE]
Or maybe it takes a lot of errors at a long, slow interval before the program drops to a shorter interval (over a dozen in this example):
[CODE]OK 2400000 / 153900001 [ 1.56%], 27.21 ms/it [27.17, 27.27] CV 0.0%, check 17.24s; ETA 47d 17:01; f2ee1bf9b0a1384b [16:42:31]
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 16.97s; ETA 47d 19:23; f4f616188d6b306b [17:28:16]
OK 2420000 / 153900001 [ 1.57%], 27.28 ms/it [27.27, 27.36] CV 0.1%, check 17.28s; ETA 47d 19:49; 727561fc0dc6203b [17:37:39] (1 errors)
OK 2440000 / 153900001 [ 1.59%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 16.68s; ETA 47d 19:52; f903829b5363daa9 [17:47:01] (1 errors)
OK 2460000 / 153900001 [ 1.60%], 27.24 ms/it [27.21, 27.30] CV 0.1%, check 17.14s; ETA 47d 17:45; 88d6c65bc840aa34 [17:56:23] (1 errors)
OK 2480000 / 153900001 [ 1.61%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 17.35s; ETA 47d 19:38; a3024380ed38a2b0 [18:05:46] (1 errors)
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 16.88s; ETA 47d 19:11; f4f616188d6b306b [18:15:09] (1 errors)
EE 2500000 / 153900001 [ 1.62%], 27.21 ms/it [27.21, 27.30] CV 0.1%, check 16.75s; ETA 47d 16:28; f4f616188d6b306b [18:24:30] (2 errors)
EE 2500000 / 153900001 [ 1.62%], 27.27 ms/it [27.24, 27.33] CV 0.1%, check 16.47s; ETA 47d 18:49; f4f616188d6b306b [18:33:52] (3 errors)
EE 2500000 / 153900001 [ 1.62%], 27.27 ms/it [27.24, 27.33] CV 0.1%, check 16.69s; ETA 47d 18:47; f4f616188d6b306b [18:43:14] (4 errors)
OK 2490000 / 153900001 [ 1.62%], 27.24 ms/it [27.24, 27.30] CV 0.1%, check 17.00s; ETA 47d 17:51; 2eb96cb4c549b613 [18:48:03] (5 errors)
EE 2500000 / 153900001 [ 1.62%], 27.25 ms/it [27.24, 27.27] CV 0.1%, check 17.11s; ETA 47d 17:50; f4f616188d6b306b [18:52:53] (5 errors)
EE 2500000 / 153900001 [ 1.62%], 27.25 ms/it [27.24, 27.33] CV 0.1%, check 16.57s; ETA 47d 18:02; f4f616188d6b306b [18:57:42] (6 errors)
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.36] CV 0.1%, check 17.17s; ETA 47d 19:19; f4f616188d6b306b [19:02:32] (7 errors)
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.36] CV 0.1%, check 16.82s; ETA 47d 19:23; f4f616188d6b306b [19:07:22] (8 errors)
EE 2500000 / 153900001 [ 1.62%], 27.21 ms/it [27.21, 27.27] CV 0.1%, check 17.13s; ETA 47d 16:18; f4f616188d6b306b [19:12:11] (9 errors)
EE 2500000 / 153900001 [ 1.62%], 27.27 ms/it [27.27, 27.33] CV 0.1%, check 17.08s; ETA 47d 19:03; f4f616188d6b306b [19:17:01] (10 errors)
EE 2500000 / 153900001 [ 1.62%], 27.25 ms/it [27.24, 27.33] CV 0.1%, check 17.13s; ETA 47d 17:58; f4f616188d6b306b [19:21:50] (11 errors)
EE 2500000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.36] CV 0.1%, check 17.25s; ETA 47d 19:31; f4f616188d6b306b [19:26:40] (12 errors)
EE 2500000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.33] CV 0.1%, check 17.02s; ETA 47d 19:31; f4f616188d6b306b [19:31:30] (13 errors)
EE 2500000 / 153900001 [ 1.62%], 27.25 ms/it [27.24, 27.30] CV 0.1%, check 16.88s; ETA 47d 17:54; f4f616188d6b306b [19:36:20] (14 errors)
EE 2500000 / 153900001 [ 1.62%], 27.30 ms/it [27.27, 27.52] CV 0.2%, check 17.11s; ETA 47d 20:06; f4f616188d6b306b [19:41:10] (15 errors)
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 17.08s; ETA 47d 19:27; f4f616188d6b306b [19:46:00] (16 errors)
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 17.05s; ETA 47d 19:23; f4f616188d6b306b [19:50:50] (17 errors)
OK 2495000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.33] CV 0.1%, check 17.03s; ETA 47d 19:45; 108cb3e4f23362a4 [19:53:23] (18 errors)
EE 2500000 / 153900001 [ 1.62%], 27.24 ms/it [27.21, 27.30] CV 0.1%, check 16.75s; ETA 47d 17:35; f4f616188d6b306b [19:55:56] (18 errors)
EE 2500000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.36] CV 0.1%, check 16.78s; ETA 47d 19:35; f4f616188d6b306b [19:58:29] (19 errors)
EE 2500000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.36] CV 0.1%, check 16.89s; ETA 47d 19:43; f4f616188d6b306b [20:01:03] (20 errors)
EE 2500000 / 153900001 [ 1.62%], 27.22 ms/it [27.21, 27.30] CV 0.1%, check 17.05s; ETA 47d 16:34; f4f616188d6b306b [20:03:36] (21 errors)
EE 2500000 / 153900001 [ 1.62%], 27.22 ms/it [27.21, 27.27] CV 0.1%, check 17.13s; ETA 47d 16:41; f4f616188d6b306b [20:06:09] (22 errors)
EE 2500000 / 153900001 [ 1.62%], 27.22 ms/it [27.21, 27.27] CV 0.1%, check 17.19s; ETA 47d 16:33; f4f616188d6b306b [20:08:42] (23 errors)
EE 2500000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.36] CV 0.1%, check 17.07s; ETA 47d 19:43; f4f616188d6b306b [20:11:16] (24 errors)
EE 2500000 / 153900001 [ 1.62%], 27.28 ms/it [27.27, 27.33] CV 0.1%, check 17.13s; ETA 47d 19:27; f4f616188d6b306b [20:13:49] (25 errors)
EE 2500000 / 153900001 [ 1.62%], 27.25 ms/it [27.24, 27.33] CV 0.1%, check 16.69s; ETA 47d 17:58; f4f616188d6b306b [20:16:22] (26 errors)
EE 2500000 / 153900001 [ 1.62%], 27.25 ms/it [27.24, 27.33] CV 0.1%, check 16.82s; ETA 47d 18:06; f4f616188d6b306b [20:18:55] (27 errors)
EE 2500000 / 153900001 [ 1.62%], 27.29 ms/it [27.27, 27.33] CV 0.1%, check 16.54s; ETA 47d 19:51; f4f616188d6b306b [20:21:28] (28 errors)
EE 2500000 / 153900001 [ 1.62%], 27.22 ms/it [27.21, 27.30] CV 0.1%, check 16.89s; ETA 47d 16:34; f4f616188d6b306b [20:24:01] (29 errors)
[/CODE]
Sometimes it should just give up on a worktodo entry and go on to the next one, preferably sooner rather than later. All stops in this post, including this one, were manual interventions (Ctrl-C).
[CODE]gpuOwL v1.9- GPU Mersenne primality checker
Radeon 500 Series 8 @f:0.0, gfx804 1203MHz

OpenCL compilation in 8533 ms, with "-I. -cl-fast-relaxed-math -cl-std=CL2.0 -DEXP=87969467u -DWIDTH=1024u -DHEIGHT=2048u -DLOG_NWORDS=22u -DFGT_61=1 -DLOG_ROOT2=49u "
Warning: high word size of 20.97 bits may result in errors
PRP-3: FFT 4M (1024 * 2048 * 2) of 87969467 (20.97 bits/word) [2018-02-05 16:20:48 Central Standard Time]
Starting at iteration 0
OK 0 / 87969467 [ 0.00%], 0.00 ms/it; ETA 0d 00:00; 0000000000000003 [16:20:59]
EE 1000 / 87969467 [ 0.00%], 18.92 ms/it [18.91, 18.94] CV 0.1%, check 10.82s; ETA 19d 06:22; c4af4f3b372f69aa [16:21:29]
EE 1000 / 87969467 [ 0.00%], 19.06 ms/it [18.94, 19.19] CV 0.9%, check 10.90s; ETA 19d 09:49; c4af4f3b372f69aa [16:21:59] (1 errors)
EE 1000 / 87969467 [ 0.00%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.94s; ETA 19d 06:44; c4af4f3b372f69aa [16:22:29] (2 errors)
EE 1000 / 87969467 [ 0.00%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.94s; ETA 19d 06:44; c4af4f3b372f69aa [16:22:59] (3 errors)
EE 1000 / 87969467 [ 0.00%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.83s; ETA 19d 06:44; c4af4f3b372f69aa [16:23:28] (4 errors)
EE 1000 / 87969467 [ 0.00%], 19.45 ms/it [18.97, 19.94] CV 3.5%, check 10.94s; ETA 19d 19:20; c4af4f3b372f69aa [16:23:59] (5 errors)
EE 1000 / 87969467 [ 0.00%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.92s; ETA 19d 06:44; c4af4f3b372f69aa [16:24:29] (6 errors)
EE 1000 / 87969467 [ 0.00%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.86s; ETA 19d 06:44; c4af4f3b372f69aa [16:24:58] (7 errors)

Stopping, please wait..
EE 1000 / 87969467 [ 0.00%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.89s; ETA 19d 06:44; c4af4f3b372f69aa [16:25:28] (8 errors)

Bye
[/CODE]
Manual intervention, with restarts and well-timed Ctrl-C, seems to be required to determine at what point (to a 500-iteration resolution) the computation goes from OK to EE.
[CODE]gpuOwL v1.9- GPU Mersenne primality checker
Radeon 500 Series 8 @f:0.0, gfx804 1203MHz

OpenCL compilation in 8314 ms, with "-I. -cl-fast-relaxed-math -cl-std=CL2.0 -DEXP=83873509u -DWIDTH=1024u -DHEIGHT=2048u -DLOG_NWORDS=22u -DFGT_61=1 -DLOG_ROOT2=49u "
Warning: high word size of 20.00 bits may result in errors
PRP-3: FFT 4M (1024 * 2048 * 2) of 83873509 (20.00 bits/word) [2018-02-08 16:05:44 Central Standard Time]
Starting at iteration 0
OK 0 / 83873509 [ 0.00%], 0.00 ms/it; ETA 0d 00:00; 0000000000000003 [16:05:55]
OK 1000 / 83873509 [ 0.00%], 18.94 ms/it [18.94, 18.94] CV 0.0%, check 10.98s; ETA 18d 09:13; a45d18b389042c04 [16:06:25]
OK 5000 / 83873509 [ 0.01%], 18.92 ms/it [18.91, 18.94] CV 0.1%, check 10.95s; ETA 18d 08:49; 486f0a4bac31a4c1 [16:07:51]
OK 10000 / 83873509 [ 0.01%], 19.03 ms/it [18.91, 19.97] CV 1.7%, check 11.04s; ETA 18d 11:25; c37cc58a6ab272f9 [16:09:37]
EE 20000 / 83873509 [ 0.02%], 19.00 ms/it [18.91, 19.97] CV 1.3%, check 10.92s; ETA 18d 10:32; 8e0d89f0ea163aed [16:12:58]
OK 12000 / 83873509 [ 0.01%], 18.94 ms/it [18.91, 18.97] CV 0.1%, check 11.08s; ETA 18d 09:09; fc573167796b6eec [16:13:47] (1 errors)
OK 15000 / 83873509 [ 0.02%], 19.10 ms/it [18.91, 19.94] CV 2.2%, check 11.01s; ETA 18d 12:53; fa3ec3b940dc9020 [16:14:56] (1 errors)
EE 20000 / 83873509 [ 0.02%], 18.97 ms/it [18.91, 19.34] CV 0.7%, check 10.92s; ETA 18d 09:50; 8e0d89f0ea163aed [16:16:41] (1 errors)
OK 16000 / 83873509 [ 0.02%], 18.95 ms/it [18.94, 18.97] CV 0.1%, check 10.94s; ETA 18d 09:29; 6baac2e16df6b1b4 [16:17:11] (2 errors)
EE 18000 / 83873509 [ 0.02%], 19.19 ms/it [18.91, 20.00] CV 2.8%, check 10.97s; ETA 18d 14:56; e8e7634fe0e7d4e6 [16:18:01] (2 errors)
EE 18000 / 83873509 [ 0.02%], 18.93 ms/it [18.91, 18.97] CV 0.2%, check 11.03s; ETA 18d 08:56; e8e7634fe0e7d4e6 [16:18:49] (3 errors)

Stopping, please wait..
OK 17000 / 83873509 [ 0.02%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 11.00s; ETA 18d 09:07; daed40b448693c25 [16:19:19] (4 errors)

Bye
gpuOwL v1.9- GPU Mersenne primality checker
Radeon 500 Series 8 @f:0.0, gfx804 1203MHz

OpenCL compilation in 8299 ms, with "-I. -cl-fast-relaxed-math -cl-std=CL2.0 -DEXP=83873509u -DWIDTH=1024u -DHEIGHT=2048u -DLOG_NWORDS=22u -DFGT_61=1 -DLOG_ROOT2=49u "
Warning: high word size of 20.00 bits may result in errors
PRP-3: FFT 4M (1024 * 2048 * 2) of 83873509 (20.00 bits/word) [2018-02-08 16:19:40 Central Standard Time]
Starting at iteration 17000
OK 17000 / 83873509 [ 0.02%], 0.00 ms/it; ETA 0d 00:00; daed40b448693c25 [16:19:52] (4 errors)
EE 18000 / 83873509 [ 0.02%], 18.94 ms/it [18.91, 18.97] CV 0.2%, check 10.86s; ETA 18d 09:06; e8e7634fe0e7d4e6 [16:20:22] (4 errors)

Stopping, please wait..
EE 17500 / 83873509 [ 0.02%], 18.97 ms/it; ETA 18d 09:50; 1db1b693a4cf5a3f [16:20:42] (5 errors)

Bye[/CODE]
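
To make the request concrete, here is a minimal Python sketch of a cap on successive identical errors. It is not gpuOwL code; the tuple layout, the function name, and the `max_identical` parameter are all hypothetical. It assumes an "identical" failure means an EE check at the same iteration with the same residue, and that any OK check resets the count.

```python
from typing import Iterable, Optional, Tuple

# (status, iteration, residue): status is "OK" or "EE" as in the logs above
Check = Tuple[str, int, str]


def first_give_up_index(checks: Iterable[Check], max_identical: int) -> Optional[int]:
    """Index of the check at which a capped program would abandon the
    exponent, or None if the cap is never reached."""
    last_failure = None
    repeats = 0
    for index, (status, iteration, residue) in enumerate(checks):
        if status != "EE":
            # A passing check resets the streak (an assumption of this sketch).
            last_failure = None
            repeats = 0
            continue
        failure = (iteration, residue)
        if failure == last_failure:
            repeats += 1
        else:
            last_failure = failure
            repeats = 1
        if repeats >= max_identical:
            return index
    return None


# Shape of the 87969467 run above: one OK at iteration 0, then the same
# EE at iteration 1000 over and over.
checks = [("OK", 0, "0000000000000003")] + [("EE", 1000, "c4af4f3b372f69aa")] * 9
give_up_at = first_give_up_index(checks, max_identical=5)
print(give_up_at)
```

With a cap of 5, that run would have been abandoned at the fifth identical failure instead of waiting for a Ctrl-C.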

kriesel 2018-02-10 16:39

iteration count cap feature request
 
It would be useful for certain purposes to be able to specify that the program perform a given number of iteration attempts on a worktodo entry or command line input.

Note that the iteration attempt count is distinct from the iteration number reached. (If errors occur, for example, the attempt count may increase by hundreds of thousands while the iteration number reached climbs and then retreats upon error detection, so the iteration number reached without error during the run does not change.)

Some uses:
Produce a residue after a specified number of iterations for a given exponent. Such residues could be useful in validating the output of other PRP programs. Total iterations performed would equal the iteration number reached if no errors were encountered and no retries were run.

Limit the amount of time spent in on-error repeated retry by imposing a cap. This aids efficient use of computing resources.

Terminate a test of whether an exponent will run reliably at a given FFT length once a specified total count of iterations has been run. Combined with a list of sample exponents in the worktodo file, this would partly automate testing each FFT length for its usable upper exponent limit.
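
The difference between attempts and iteration reached can be shown with a toy Python model. This is only an illustration under simplifying assumptions: work proceeds in fixed-size blocks, a failed check discards exactly one block, and the names `run_capped` and `max_attempts` are hypothetical rather than existing gpuOwL options.

```python
def run_capped(block_outcomes, block_size, max_attempts):
    """Toy model of a capped run.

    block_outcomes: iterable of bools, True if the check after a block passed.
    Returns (iteration_reached, attempts). A failed block still costs
    block_size attempts but does not advance the iteration reached.
    """
    reached = 0
    attempts = 0
    for passed in block_outcomes:
        if attempts + block_size > max_attempts:
            break  # the cap would be exceeded by starting another block
        attempts += block_size
        if passed:
            reached += block_size
    return reached, attempts


# One good block, two failed retries, then a good block.
outcomes = [True, False, False, True]
print(run_capped(outcomes, block_size=1000, max_attempts=10000))
print(run_capped(outcomes, block_size=1000, max_attempts=3000))
```

With no errors the two numbers are equal, which is the residue-at-iteration-N use; with errors, the cap bounds the time spent regardless of how little progress was made.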

GP2 2018-02-10 17:18

[QUOTE=preda;479700]This was done with the new 5000K FFT: 78330167
[url]https://www.mersenne.org/report_exponent/?exp_lo=78330167&full=1[/url][/QUOTE]

OK, I started this with mprime. Using one core, so it will take a while. I can do others in parallel.

kriesel 2018-02-19 04:08

V1.9 terminated on exponent completion
 
On Windows 7, the gpuOwL V1.9 program terminated with "Bye" after logging and printing out a completed primality test, instead of continuing to additional work that was present in the worktodo file. The entry for the just-completed exponent remained in the worktodo file afterward. Maybe that's the intended behavior, but it was not expected.

preda 2018-02-22 03:10

[QUOTE=kriesel;480411]On Windows 7, the gpuOwL V1.9 program terminated with "Bye" after logging and printing out a completed primality test, instead of continuing to additional work that was present in the worktodo file. The entry for the just-completed exponent remained in the worktodo file afterward. Maybe that's the intended behavior, but it was not expected.[/QUOTE]

No, this is a bug I need to look into.

There was a problem reported concerning how worktodo was updated on Windows (where files have text vs. binary modes). I attempted to fix that problem, and instead introduced this new one. AFAIK it seems to affect Windows only.

kriesel 2018-02-24 15:04

gpuowl stalled
 
This occurred while I was in another zip code, no user interaction.

[CODE]...
OK 24000000 / 77973559 [30.78%], 10.84 ms/it [10.73, 11.57] CV 1.4%, check 7.10s; ETA 6d 18:31; 315ac83054e9dd41 [22:04:13]
OK 24500000 / 77973559 [31.42%], 10.84 ms/it [10.73, 11.89] CV 1.5%, check 7.00s; ETA 6d 17:03; 4d55c763f8d09f46 [23:34:41][/CODE]

(Upon return and checking, I noted that the last timestamp was almost a day old, much longer than the usual 90-minute log interval, and applied Ctrl-C.)

[CODE]Stopping, please wait..
OK 24813000 / 77973559 [31.82%], 521.14 ms/it [10.73, 311278.47] CV 2387.2%, check 20.64s; ETA 320d 15:40; a1e4c8d37a60d4dc [20:53:39]

Bye[/CODE]
A stroll through the Windows system event log indicates Windows has been putting itself to sleep periodically, mostly at 15-minute intervals. That one was a 1.8-day nap, despite GPU-intensive software running on both installed GPUs, plus prime95 on all CPU cores. I thought I had set it weeks ago never to take naps.
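
A stall like this can be spotted from the log itself. Below is a minimal Python sketch that reads the trailing [HH:MM:SS] stamps and reports the gap between consecutive lines. The function names are made up for illustration. Since the stamps carry no date, the sketch assumes each gap is under 24 hours and wraps around midnight, so any whole days lost to sleep would go unnoticed.

```python
import re
from datetime import timedelta

# Matches only a bare [HH:MM:SS] stamp, not "[30.78%]" or "[10.73, 11.57]"
STAMP = re.compile(r"\[(\d\d):(\d\d):(\d\d)\]")


def stamp_seconds(line):
    """Seconds since midnight of the line's time stamp, or None if absent."""
    m = STAMP.search(line)
    if m is None:
        return None
    hours, minutes, seconds = map(int, m.groups())
    return hours * 3600 + minutes * 60 + seconds


def gaps(lines):
    """Elapsed time between consecutive stamped lines, modulo 24 hours."""
    times = [t for t in map(stamp_seconds, lines) if t is not None]
    return [timedelta(seconds=(b - a) % 86400) for a, b in zip(times, times[1:])]


log = [
    "OK 24000000 / 77973559 [30.78%], 10.84 ms/it [10.73, 11.57] CV 1.4%, check 7.10s; ETA 6d 18:31; 315ac83054e9dd41 [22:04:13]",
    "OK 24500000 / 77973559 [31.42%], 10.84 ms/it [10.73, 11.89] CV 1.5%, check 7.00s; ETA 6d 17:03; 4d55c763f8d09f46 [23:34:41]",
    "OK 24813000 / 77973559 [31.82%], 521.14 ms/it [10.73, 311278.47] CV 2387.2%, check 20.64s; ETA 320d 15:40; a1e4c8d37a60d4dc [20:53:39]",
]

# Flag anything well beyond the usual 90-minute log interval.
stalled = [g for g in gaps(log) if g > timedelta(hours=2)]
for gap in gaps(log):
    print(gap)
```

On the three lines above, the first gap is the normal interval of about 90 minutes and the second is over 21 hours.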

kriesel 2018-02-28 14:40

worktodo.add feature request
 
Please add this functionality, which mfaktc and mfakto already have.
A file of new work called worktodo.add can be dropped in the program folder. The GPU worker program notices it later, appends its contents to the end of the work file, then removes the worktodo.add file. This approach is also used by prime95 and mprime, and allows queued work to be replenished safely without stopping and restarting the program.
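
For illustration, the behavior described above could look roughly like this in Python. This is a sketch, not how mfaktc, mfakto, or prime95 actually implement it. The work file name "worktodo.txt" and the function name are assumptions, the entries are placeholders rather than real assignment lines, and the sketch ignores the case where the .add file is still being written when it is noticed.

```python
import os
import tempfile


def merge_added_work(folder, work_name="worktodo.txt", add_name="worktodo.add"):
    """Append the entries in add_name to work_name, then delete add_name.
    Returns the number of entries appended (0 if there is no add file)."""
    add_path = os.path.join(folder, add_name)
    work_path = os.path.join(folder, work_name)
    if not os.path.exists(add_path):
        return 0
    with open(add_path) as f:
        new_lines = [line.strip() for line in f if line.strip()]
    # Avoid gluing the first new entry onto an unterminated last line.
    needs_newline = False
    if os.path.exists(work_path) and os.path.getsize(work_path) > 0:
        with open(work_path, "rb") as f:
            f.seek(-1, os.SEEK_END)
            needs_newline = f.read(1) != b"\n"
    with open(work_path, "a") as f:
        if needs_newline:
            f.write("\n")
        for line in new_lines:
            f.write(line + "\n")
    os.remove(add_path)
    return len(new_lines)


with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "worktodo.txt"), "w") as f:
        f.write("existing entry")  # note: no trailing newline
    with open(os.path.join(folder, "worktodo.add"), "w") as f:
        f.write("new entry 1\nnew entry 2\n")
    added = merge_added_work(folder)
    with open(os.path.join(folder, "worktodo.txt")) as f:
        merged = f.read().splitlines()
    add_file_left = os.path.exists(os.path.join(folder, "worktodo.add"))

print(added, merged, add_file_left)
```

The worker would call something like this between assignments, or at each checkpoint, so that new work is picked up without a restart.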


All times are UTC.