|
|
#859 | |
|
2·1,523 Posts |
Quote:
Thank you, now it is clear. This is a good attempt to get a stable measure: in the latest test the time is 0.18 ms/sq everywhere except for the last iteration, which is 0.19 ms/sq. The introduction of smaller FFT sizes is also good. Now the program can be validated on new hardware with a quick test against the smallest prime. |
|
|
|
|
#860 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
RX550, AMD Adrenaline 18.10.2 driver for Win7 x64
m89000167, 5000K for v2.0, 5120K for others (iterations 10000-20000)
Ver  ms/it (no P-1 or TF)
2.0  17.38
3.3  16.90
3.5  16.42 <--min
3.6  16.44
3.8  16.43
3.9  17.22
4.3  17.35
4.6  17.25
4.7  NA
5.0  17.25
Note the more exactly comparable methodology in this report than the RX480 timings reported earlier, and the different location of the minimum. Difference v3.5-3.8 is +-1 least digit, so may be insignificant |
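To put the "may be insignificant" remark in numbers, the timings above can be expressed relative to the fastest version. This is only a small Python sketch; the dictionary restates the posted figures and `rel_to_min` is a made-up helper name.

```python
# ms/it for m89000167 on the RX550, copied from the timings above (v4.7 omitted: NA)
timings = {"2.0": 17.38, "3.3": 16.90, "3.5": 16.42, "3.6": 16.44, "3.8": 16.43,
           "3.9": 17.22, "4.3": 17.35, "4.6": 17.25, "5.0": 17.25}

def rel_to_min(t):
    """Return {version: percent slower than the fastest version}."""
    best = min(t.values())
    return {v: 100.0 * (ms - best) / best for v, ms in t.items()}

fastest = min(timings, key=timings.get)
slowdown = rel_to_min(timings)
for ver, pct in slowdown.items():
    print(f"v{ver}: {timings[ver]:.2f} ms/it, +{pct:.2f}% vs v{fastest}")
```

On these figures v3.6 and v3.8 land within about 0.15% of v3.5, while v5.0 is roughly 5% slower.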
|
|
|
|
|
#861 | |
|
16724₈ Posts |
Quote:
The numbers are not exactly comparable by now:
Code:
2018-11-04 16:24:42 gpuowl 5.0--mod
2018-11-04 16:24:42 RX580 -user selroc -cpu RX580 -device 0
2018-11-04 16:24:42 RX580 89000167 FFT 5120K: Width 256x4, Height 64x8, Middle 5; 16.98 bits/word
2018-11-04 16:24:42 RX580 using short carry kernels
2018-11-04 16:24:43 RX580 gfx803-36x1360-@4a:0.0 Ellesmere [Radeon RX 470/480]
2018-11-04 16:24:44 RX580 OpenCL compilation in 1076 ms, with "-DEXP=89000167u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=5u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
2018-11-04 16:24:44 RX580 89000167.owl not found, starting from the beginning.
2018-11-04 16:24:52 RX580 89000167 OK 800 0.00%; 4.44 ms/sq, 0 MULs; ETA 4d 13:50; 2744231e7051f3fe (check 1.95s)
2018-11-04 16:25:33 RX580 89000167 10000 0.01%; 4.46 ms/sq, 0 MULs; ETA 4d 14:20; 2a55d51cdf0d91cb
2018-11-04 16:26:18 RX580 89000167 20000 0.02%; 4.47 ms/sq, 0 MULs; ETA 4d 14:35; 8dcb0029e791db2a
2018-11-04 16:27:03 RX580 89000167 30000 0.03%; 4.48 ms/sq, 0 MULs; ETA 4d 14:37; 2fbd246d68f86f29
2018-11-04 16:27:47 RX580 89000167 40000 0.04%; 4.48 ms/sq, 0 MULs; ETA 4d 14:47; d85f84a6744d7090
2018-11-04 16:28:32 RX580 89000167 50000 0.06%; 4.49 ms/sq, 0 MULs; ETA 4d 14:50; afa46f7cdc5ffb7d
2018-11-04 16:29:17 RX580 89000167 60000 0.07%; 4.49 ms/sq, 0 MULs; ETA 4d 14:49; 98906e9529e4667f
2018-11-04 16:30:02 RX580 89000167 70000 0.08%; 4.49 ms/sq, 0 MULs; ETA 4d 14:49; 90b5e67934fcdcff
2018-11-04 16:30:07 RX580 Stopping, please wait..
2018-11-04 16:30:09 RX580 89000167 OK 71200 0.08%; 4.49 ms/sq, 0 MULs; ETA 4d 14:51; 14f11cfb55a43415 (check 1.96s)
2018-11-04 16:30:09 RX580 Exiting because "stop requested"
2018-11-04 16:30:09 RX580 Bye |
|
|
|
|
#862 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
I claim the numbers are very comparable for the same gpu, versus gpuowl version, despite the ms/it vs. ms/sq difference in labeling, as long as all versions were run with no P-1 activity, as my recent benchmark results for RX550 and RX480 posted for m89000167 were. The speed difference between the RX550 and the RX480 or RX580 is expected to be a considerable ratio; the RX550 is a low-wattage, slow card. Running the same iteration span for each gpuowl version made the RX550 timings more comparable than my earlier m89m benchmarking for the RX480, which used successive iteration ranges (not the same iteration span for different gpuowl versions).
Last fiddled with by kriesel on 2018-11-04 at 16:03 |
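Comparing versions over the same iteration span can be automated by reading the progress lines from each log. A rough Python sketch follows, assuming the gpuowl 5.0 line layout seen in this thread (date, time, optional name, exponent, optional OK, iteration, percent, then the ms/sq figure). Older versions label the figure ms/it and would need a different pattern, and `mean_ms_per_sq` is a made-up helper name.

```python
import re

# Matches progress lines such as:
# 2018-11-04 16:25:33 RX580 89000167 10000 0.01%; 4.46 ms/sq, 0 MULs; ETA ...
# Groups: exponent, iteration, ms/sq.
LINE = re.compile(r"\s(\d+)\s+(?:OK\s+)?(\d+)\s+[\d.]+%;\s+([\d.]+) ms/sq")

def mean_ms_per_sq(log_text, first_iter, last_iter):
    """Average the reported ms/sq over lines with first_iter <= iteration <= last_iter."""
    vals = []
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m and first_iter <= int(m.group(2)) <= last_iter:
            vals.append(float(m.group(3)))
    return sum(vals) / len(vals) if vals else None

# A few lines copied from the RX580 log posted earlier in the thread.
sample = """\
2018-11-04 16:24:52 RX580 89000167 OK 800 0.00%; 4.44 ms/sq, 0 MULs; ETA 4d 13:50; 2744231e7051f3fe (check 1.95s)
2018-11-04 16:25:33 RX580 89000167 10000 0.01%; 4.46 ms/sq, 0 MULs; ETA 4d 14:20; 2a55d51cdf0d91cb
2018-11-04 16:26:18 RX580 89000167 20000 0.02%; 4.47 ms/sq, 0 MULs; ETA 4d 14:35; 8dcb0029e791db2a
2018-11-04 16:27:03 RX580 89000167 30000 0.03%; 4.48 ms/sq, 0 MULs; ETA 4d 14:37; 2fbd246d68f86f29
"""
avg = mean_ms_per_sq(sample, 10000, 20000)
print(avg)
```

Feeding each version's log through the same `first_iter`/`last_iter` window gives figures taken over an identical span, which is the methodology described above.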
|
|
|
|
|
#863 | |
|
2³×5×11² Posts |
Quote:
Our numbers come from different GPUs, different operating systems, and different drivers, so a difference must be accounted for. |
|
|
|
|
#864 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Code:
C:\msys64\home\ken\gpuowl-compile\v5x>openowl -h
2018-11-04 10:20:41 gpuowl 5.0-df2bdf2
Command line options:
-user <name> : specify the user name.
-cpu <name> : specify the hardware name.
-time : display kernel profiling information.
-fft <size> : specify FFT size, such as: 5000K, 4M, +2, -1.
-block <value> : PRP GEC block size. Default 400. Smaller block is slower but detects errors sooner.
-carry long|short : force carry type. Short carry may be faster, but requires high bits/word.
-list fft : display a list of available FFT configurations.
-tf <bit-offset> : enable auto trial factoring before PRP. Pass 0 to bit-offset for default TF depth.
-device <N> : select a specific device:
0 : Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
1 : gfx804-8x1203-@3:0.0 Radeon 550 Series
C:\msys64\home\ken\gpuowl-compile\v5x>openowl -list fft
2018-11-04 10:20:56 gpuowl 5.0-df2bdf2
2018-11-04 10:20:56 -list fft
2018-11-04 10:20:56 Can't open 'worktodo.txt' (mode 'rb')
2018-11-04 10:20:56 Bye
Last fiddled with by kriesel on 2018-11-04 at 16:45 |
|
|
|
|
|
#865 |
|
"Mihai Preda"
Apr 2015
3·457 Posts |
Yes it makes sense. I'll look into implementing that.
|
|
|
|
|
|
#866 | |
|
3²·7·59 Posts |
Quote:
I concur. |
|
|
|
|
#867 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12475₈ Posts |
A quick check shows it will run at least a few known primes m216091 and up apparently correctly.
-list fft output:
Code:
C:\msys64\home\ken\gpuowl-compile\v5x>openowl -list fft
2018-11-04 10:20:56 gpuowl 5.0-df2bdf2
2018-11-04 10:20:56 -list fft
2018-11-04 10:20:56 Can't open 'worktodo.txt' (mode 'rb')
2018-11-04 10:20:56 Bye
C:\msys64\home\ken\gpuowl-compile\v5x>openowl -list fft
2018-11-04 10:28:00 gpuowl 5.0-df2bdf2
2018-11-04 10:28:00 -list fft
2018-11-04 10:28:00 FFT maxExp W H M
2018-11-04 10:28:00 0.1M 2.6M 256 256 1
2018-11-04 10:28:00 0.2M 5.2M 256 512 1
2018-11-04 10:28:00 0.2M 5.2M 512 256 1
2018-11-04 10:28:00 0.5M 10.2M 1024 256 1
2018-11-04 10:28:00 0.5M 10.2M 256 1024 1
2018-11-04 10:28:00 0.5M 10.2M 512 512 1
2018-11-04 10:28:00 0.6M 12.7M 256 256 5
2018-11-04 10:28:00 1.0M 20.0M 1024 512 1
2018-11-04 10:28:00 1.0M 20.0M 256 2048 1
2018-11-04 10:28:00 1.0M 20.0M 512 1024 1
2018-11-04 10:28:00 1.0M 20.0M 2048 256 1
2018-11-04 10:28:00 1.1M 22.5M 256 256 9
2018-11-04 10:28:00 1.2M 24.9M 256 512 5
2018-11-04 10:28:00 1.2M 24.9M 512 256 5
2018-11-04 10:28:00 2.0M 39.3M 1024 1024 1
2018-11-04 10:28:00 2.0M 39.3M 512 2048 1
2018-11-04 10:28:00 2.0M 39.3M 2048 512 1
2018-11-04 10:28:00 2.0M 39.3M 4096 256 1
2018-11-04 10:28:00 2.2M 44.1M 256 512 9
2018-11-04 10:28:00 2.2M 44.1M 512 256 9
2018-11-04 10:28:00 2.5M 48.9M 1024 256 5
2018-11-04 10:28:00 2.5M 48.9M 256 1024 5
2018-11-04 10:28:00 2.5M 48.9M 512 512 5
2018-11-04 10:28:00 4.0M 77.3M 1024 2048 1
2018-11-04 10:28:00 4.0M 77.3M 2048 1024 1
2018-11-04 10:28:00 4.0M 77.3M 4096 512 1
2018-11-04 10:28:00 4.5M 86.7M 1024 256 9
2018-11-04 10:28:00 4.5M 86.7M 256 1024 9
2018-11-04 10:28:00 4.5M 86.7M 512 512 9
2018-11-04 10:28:00 5.0M 96.1M 1024 512 5
2018-11-04 10:28:00 5.0M 96.1M 256 2048 5
2018-11-04 10:28:00 5.0M 96.1M 512 1024 5
2018-11-04 10:28:00 5.0M 96.1M 2048 256 5
2018-11-04 10:28:00 8.0M 151.8M 2048 2048 1
2018-11-04 10:28:00 8.0M 151.8M 4096 1024 1
2018-11-04 10:28:00 9.0M 170.3M 1024 512 9
2018-11-04 10:28:00 9.0M 170.3M 256 2048 9
2018-11-04 10:28:00 9.0M 170.3M 512 1024 9
2018-11-04 10:28:00 9.0M 170.3M 2048 256 9
2018-11-04 10:28:00 10.0M 188.7M 1024 1024 5
2018-11-04 10:28:00 10.0M 188.7M 512 2048 5
2018-11-04 10:28:00 10.0M 188.7M 2048 512 5
2018-11-04 10:28:00 10.0M 188.7M 4096 256 5
2018-11-04 10:28:00 16.0M 298.1M 4096 2048 1
2018-11-04 10:28:00 18.0M 334.3M 1024 1024 9
2018-11-04 10:28:00 18.0M 334.3M 512 2048 9
2018-11-04 10:28:00 18.0M 334.3M 2048 512 9
2018-11-04 10:28:00 18.0M 334.3M 4096 256 9
2018-11-04 10:28:00 20.0M 370.4M 1024 2048 5
2018-11-04 10:28:00 20.0M 370.4M 2048 1024 5
2018-11-04 10:28:00 20.0M 370.4M 4096 512 5
2018-11-04 10:28:00 36.0M 656.2M 1024 2048 9
2018-11-04 10:28:00 36.0M 656.2M 2048 1024 9
2018-11-04 10:28:00 36.0M 656.2M 4096 512 9
2018-11-04 10:28:00 40.0M 727.0M 2048 2048 5
2018-11-04 10:28:00 40.0M 727.0M 4096 1024 5
2018-11-04 10:28:00 72.0M 1287.5M 2048 2048 9
2018-11-04 10:28:00 72.0M 1287.5M 4096 1024 9
2018-11-04 10:28:00 80.0M 1426.4M 4096 2048 5
2018-11-04 10:28:00 144.0M 2525.2M 4096 2048 9
(not surprising due to low # of bits/word, and not a request for yet smaller fft lengths, just observations)
Code:
C:\msys64\home\ken\gpuowl-compile\v5x>openowl
2018-11-04 10:33:00 gpuowl 5.0-df2bdf2
2018-11-04 10:33:00 110503 FFT 128K: Width 64x4, Height 64x4; 0.84 bits/word
2018-11-04 10:33:00 using long carry kernels
2018-11-04 10:33:00 Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
2018-11-04 10:33:03 OpenCL compilation in 2391 ms, with "-DEXP=110503u -DWIDTH=256u -DSMALL_HEIGHT=256u -DMIDDLE=1u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
2018-11-04 10:33:03 110503.owl not found, starting from the beginning.
2018-11-04 10:33:03 powerSmooth(110503, 10000) has 14484 bits
Assertion failed!
Program: C:\msys64\home\ken\gpuowl-compile\v5x\openowl.exe
File: state.cpp, Line 24
Expression: 0 <= w && w < (1 << nBits)
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
C:\msys64\home\ken\gpuowl-compile\v5x>openowl
2018-11-04 10:33:18 gpuowl 5.0-df2bdf2
2018-11-04 10:33:18 132049 FFT 128K: Width 64x4, Height 64x4; 1.01 bits/word
2018-11-04 10:33:18 using long carry kernels
2018-11-04 10:33:19 Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
2018-11-04 10:33:21 OpenCL compilation in 2432 ms, with "-DEXP=132049u -DWIDTH=256u -DSMALL_HEIGHT=256u -DMIDDLE=1u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
2018-11-04 10:33:21 132049.owl not found, starting from the beginning.
2018-11-04 10:33:21 powerSmooth(132049, 10000) has 14484 bits
2018-11-04 10:33:23 132049 P-1 10000 69.04%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; d8645cee5574c284
2018-11-04 10:33:24 132049.owl loaded: k 0, B1 10000, block 400, res64 6379d2d731e5e48e, stage 1, baseBits 0
2018-11-04 10:33:24 132049 B1=10000 B2=70000 (effective B2=70000) selected 4142 P-1 points in 0.01s
2018-11-04 10:33:24 132049 EE 800 0.60%; 0.16 ms/sq, 1 MULs; ETA 0d 00:00; da4be711d7cf309d (check 0.08s)
2018-11-04 10:33:24 132049.owl loaded: k 0, B1 10000, block 400, res64 6379d2d731e5e48e, stage 1, baseBits 0
2018-11-04 10:33:24 132049 EE 800 0.60%; 0.28 ms/sq, 1 MULs; ETA 0d 00:01; da4be711d7cf309d (check 0.08s)
2018-11-04 10:33:24 132049.owl loaded: k 0, B1 10000, block 400, res64 6379d2d731e5e48e, stage 1, baseBits 0
2018-11-04 10:33:25 132049 EE 800 0.60%; 0.28 ms/sq, 1 MULs; ETA 0d 00:01; da4be711d7cf309d (check 0.08s)
2018-11-04 10:33:25 3 sequential errors, will stop.
2018-11-04 10:33:25 Exiting because "too many errors"
2018-11-04 10:33:25 Bye
C:\msys64\home\ken\gpuowl-compile\v5x>openowl
2018-11-04 10:33:42 gpuowl 5.0-df2bdf2
2018-11-04 10:33:42 216091 FFT 128K: Width 64x4, Height 64x4; 1.65 bits/word
2018-11-04 10:33:42 using long carry kernels
2018-11-04 10:33:43 Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
2018-11-04 10:33:46 OpenCL compilation in 2434 ms, with "-DEXP=216091u -DWIDTH=256u -DSMALL_HEIGHT=256u -DMIDDLE=1u -I. -cl-fast-relaxed-math -cl-std=CL2.0 "
2018-11-04 10:33:46 216091.owl not found, starting from the beginning.
2018-11-04 10:33:46 powerSmooth(216091, 10000) has 14484 bits
2018-11-04 10:33:48 216091 P-1 10000 69.04%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; 9e7518aa03950b26
2018-11-04 10:33:48 216091.owl loaded: k 0, B1 10000, block 400, res64 d8a71ba2415f2773, stage 1, baseBits 0
2018-11-04 10:33:48 216091 B1=10000 B2=130000 (effective B2=130000) selected 7611 P-1 points in 0.02s
2018-11-04 10:33:49 216091 OK 800 0.37%; 0.17 ms/sq, 1 MULs; ETA 0d 00:01; 5646ce8634d76602 (check 0.08s)
2018-11-04 10:33:49 216091 GCD no factor (0.07s)
2018-11-04 10:33:50 216091 10000 4.62%; 0.16 ms/sq, 287 MULs; ETA 0d 00:01; 6d0028f1a3744d15
2018-11-04 10:33:52 216091 20000 9.24%; 0.16 ms/sq, 1067 MULs; ETA 0d 00:01; e4c865e7e023a233
2018-11-04 10:33:54 216091 30000 13.86%; 0.16 ms/sq, 1053 MULs; ETA 0d 00:01; 52c18ca42b2e5a40
2018-11-04 10:33:55 216091 40000 18.48%; 0.16 ms/sq, 990 MULs; ETA 0d 00:01; 003716dc307b3768
2018-11-04 10:33:57 216091 50000 23.11%; 0.16 ms/sq, 882 MULs; ETA 0d 00:00; f19c985e00f9ab66
2018-11-04 10:33:59 216091 60000 27.73%; 0.16 ms/sq, 794 MULs; ETA 0d 00:00; 6679d5a415aece9e
2018-11-04 10:34:01 216091 70000 32.35%; 0.16 ms/sq, 566 MULs; ETA 0d 00:00; da83220b76e8b55b
2018-11-04 10:34:02 216091 80000 36.97%; 0.16 ms/sq, 339 MULs; ETA 0d 00:00; 3700d4ccd97a326a
2018-11-04 10:34:04 216091 90000 41.59%; 0.16 ms/sq, 346 MULs; ETA 0d 00:00; c0472c1f976aa2c1
2018-11-04 10:34:05 216091 100000 46.21%; 0.16 ms/sq, 329 MULs; ETA 0d 00:00; d3f402b7fb9adb65
2018-11-04 10:34:07 216091 110000 50.83%; 0.16 ms/sq, 304 MULs; ETA 0d 00:00; a625b1471a8e6481
2018-11-04 10:34:09 216091 120000 55.45%; 0.16 ms/sq, 328 MULs; ETA 0d 00:00; fe8081748d1a088e
2018-11-04 10:34:10 216091 130000 60.07%; 0.16 ms/sq, 325 MULs; ETA 0d 00:00; b38dbc5b3d73c584
2018-11-04 10:34:12 216091 140000 64.70%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; a71c266b43cc9171
2018-11-04 10:34:14 216091 150000 69.32%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; a6a2e15e86701788
2018-11-04 10:34:15 216091 OK 160000 73.94%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; 6cc0b6cbc453946a (check 0.09s)
2018-11-04 10:34:17 216091 170000 78.56%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; 16838b06bd004e23
2018-11-04 10:34:18 216091 180000 83.18%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; 38a44921f392a2fc
2018-11-04 10:34:20 216091 190000 87.80%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; 63580cfe1f80b303
2018-11-04 10:34:22 216091 200000 92.42%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; 7c3f2446e5e6fd09
2018-11-04 10:34:23 216091 210000 97.04%; 0.16 ms/sq, 0 MULs; ETA 0d 00:00; b6e9bb0a7c8ede6b
2018-11-04 10:34:24 PP 216090 / 216091, d8a71ba2415f2773 (base d8a71ba2415f2773)
2018-11-04 10:34:24 216091 OK 216400 100.00%; 0.17 ms/sq, 0 MULs; ETA 0d 00:00; e898188ce32335d4 (check 0.09s)
2018-11-04 10:34:24 {"exponent":"216091", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-df2bdf2"}, "timestamp":"2018-11-04 16:34:24 UTC", "aid":"0", "fft-length":131072, "res64":"d8a71ba2415f2773", "b2":"130000", "base":{"b1":"10000", "bias":{"2":19}, "res64":"d8a71ba2415f2773"}}
Code:
{"exponent":"216091", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-df2bdf2"}, "timestamp":"2018-11-04 16:34:24 UTC", "aid":"0", "fft-length":131072, "res64":"d8a71ba2415f2773", "b2":"130000", "base":{"b1":"10000", "bias":{"2":19}, "res64":"d8a71ba2415f2773"}}
{"exponent":"756839", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-df2bdf2"}, "timestamp":"2018-11-04 16:36:55 UTC", "aid":"0", "fft-length":131072, "res64":"0e12589efe2be6c5", "b2":"500000", "base":{"b1":"20000", "bias":{"2":19}, "res64":"0e12589efe2be6c5"}}
{"exponent":"859433", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-df2bdf2"}, "timestamp":"2018-11-04 16:39:23 UTC", "aid":"0", "fft-length":131072, "res64":"ac86e7a51cecadb0", "b2":"580000", "base":{"b1":"20000", "bias":{"2":19}, "res64":"ac86e7a51cecadb0"}}
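When validating a build against several known primes, the result lines above can be checked mechanically. The Python sketch below parses two of the posted lines with the standard `json` module. Treating "status":"P" as "probable prime" is an assumption based on the PP line in the log, and `summarize` is a made-up helper name.

```python
import json

# Two of the result lines posted above, verbatim.
RESULTS = [
    '{"exponent":"216091", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-df2bdf2"}, "timestamp":"2018-11-04 16:34:24 UTC", "aid":"0", "fft-length":131072, "res64":"d8a71ba2415f2773", "b2":"130000", "base":{"b1":"10000", "bias":{"2":19}, "res64":"d8a71ba2415f2773"}}',
    '{"exponent":"756839", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-df2bdf2"}, "timestamp":"2018-11-04 16:36:55 UTC", "aid":"0", "fft-length":131072, "res64":"0e12589efe2be6c5", "b2":"500000", "base":{"b1":"20000", "bias":{"2":19}, "res64":"0e12589efe2be6c5"}}',
]

def summarize(result_lines):
    """Return (exponent, status, fft_length) tuples from gpuowl JSON result lines."""
    rows = []
    for line in result_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        rows.append((int(rec["exponent"]), rec["status"], rec["fft-length"]))
    return rows

rows = summarize(RESULTS)
for exponent, status, fft_len in rows:
    print(exponent, status, fft_len)
```

A validation script could then compare the exponents reported with status "P" against a list of known Mersenne prime exponents.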
|
|
|
|
|
|
#868 |
|
"Mihai Preda"
Apr 2015
1371₁₀ Posts |
I just added an FFT-3 "middle" step.
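The effect of the extra "middle" value on the set of available FFT lengths can be sketched in a few lines of Python. This is inferred from the -list fft listings and logs in this thread, not from the gpuowl source: it assumes the length in words is 2 * W * H * M, with W and H limited to the values that appear in the listings.

```python
from itertools import product

WIDTHS = (256, 512, 1024, 2048, 4096)   # W values seen in the -list fft output
HEIGHTS = (256, 512, 1024, 2048)        # H values seen in the -list fft output

def fft_sizes(middles):
    """Return sorted distinct FFT lengths in words, assuming length = 2 * W * H * M."""
    return sorted({2 * w * h * m for w, h, m in product(WIDTHS, HEIGHTS, middles)})

old = fft_sizes((1, 5, 9))      # middle steps available before this change
new = fft_sizes((1, 3, 5, 9))   # with the FFT-3 middle step added
added = [n for n in new if n not in old]
print([f"{n // 1024}K" for n in added])
```

Under these assumptions the middle-3 step contributes eight new lengths, from 384K up to 48M, filling gaps such as the one between 256K and 512K.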
|
|
|
|
|
|
#869 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12475₈ Posts |
Code:
$ make openowl-win
g++ -std=c++17 -O2 -DREV=\"9c13870\" -Wall Worktodo.cpp Result.cpp common.cpp gpuowl.cpp Gpu.cpp clwrap.cpp Task.cpp checkpoint.cpp timeutil.cpp Args.cpp GCD.cpp Primes.cpp Stats.cpp state.cpp Signal.cpp -o openowl -lOpenCL -lgmp -pthread -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32 -L. -static
Gpu.cpp: In member function 'PRPState Gpu::loadPRP(u32, u32, u32)':
Gpu.cpp:557:9: warning: unknown conversion type character 'l' in format [-Wformat=]
log("%u EE loaded: %d, B1 %u, blockSize %d, %016llx (expected %016llx)\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Gpu.cpp:557:9: warning: unknown conversion type character 'l' in format [-Wformat=]
Gpu.cpp:557:9: warning: too many arguments for format [-Wformat-extra-args]
Gpu.cpp: In member function 'PRPResult Gpu::isPrimePRP(u32, const Args&, u32, u32)':
Gpu.cpp:690:11: warning: unknown conversion type character 'l' in format [-Wformat=]
log("%s %8d / %d, %016llx (base %016llx)\n", isPrime ? "PP" : "CC", kEnd, E, finalRes64, residue(base));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Gpu.cpp:690:11: warning: unknown conversion type character 'l' in format [-Wformat=]
Gpu.cpp:690:11: warning: too many arguments for format [-Wformat-extra-args]
checkpoint.cpp: In member function 'void PRPState::loadInt(u32, u32, u32)':
checkpoint.cpp:167:7: warning: unknown conversion type character 'l' in format [-Wformat=]
log("%s loaded: k %u, B1 %u, block %u, res64 %016llx, stage %u, baseBits %u\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
checkpoint.cpp:167:7: warning: format '%u' expects argument of type 'unsigned int', but argument 6 has type 'u64' {aka 'long long unsigned int'} [-Wformat=]
checkpoint.cpp:167:7: warning: too many arguments for format [-Wformat-extra-args]
Code:
C:\msys64\home\ken\gpuowl-compile\v5.0-9c13870>openowl -h
2018-11-04 16:07:53 gpuowl 5.0-9c13870
Command line options:
-user <name> : specify the user name.
-cpu <name> : specify the hardware name.
-time : display kernel profiling information.
-fft <size> : specify FFT size, such as: 5000K, 4M, +2, -1.
-block <value> : PRP GEC block size. Default 400. Smaller block is slower but detects errors sooner.
-carry long|short : force carry type. Short carry may be faster, but requires high bits/word.
-list fft : display a list of available FFT configurations.
-tf <bit-offset> : enable auto trial factoring before PRP. Pass 0 to bit-offset for default TF depth.
-device <N> : select a specific device:
0 : Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
1 : gfx804-8x1203-@3:0.0 Radeon 550 Series
C:\msys64\home\ken\gpuowl-compile\v5.0-9c13870>openowl -list fft
2018-11-04 16:08:03 gpuowl 5.0-9c13870
2018-11-04 16:08:03 -list fft
2018-11-04 16:08:03 Can't open 'worktodo.txt' (mode 'rb')
2018-11-04 16:08:03 Bye
C:\msys64\home\ken\gpuowl-compile\v5.0-9c13870>copy ..\v3.8\worktodo.txt .
1 file(s) copied.
C:\msys64\home\ken\gpuowl-compile\v5.0-9c13870>openowl -list fft
2018-11-04 16:08:34 gpuowl 5.0-9c13870
2018-11-04 16:08:34 -list fft
2018-11-04 16:08:34 FFT maxExp W H M
2018-11-04 16:08:34 0.1M 2.6M 256 256 1
2018-11-04 16:08:34 0.2M 5.2M 256 512 1
2018-11-04 16:08:34 0.2M 5.2M 512 256 1
2018-11-04 16:08:34 0.4M 7.7M 256 256 3
2018-11-04 16:08:34 0.5M 10.2M 1024 256 1
2018-11-04 16:08:34 0.5M 10.2M 256 1024 1
2018-11-04 16:08:34 0.5M 10.2M 512 512 1
2018-11-04 16:08:34 0.6M 12.7M 256 256 5
2018-11-04 16:08:34 0.8M 15.1M 256 512 3
2018-11-04 16:08:34 0.8M 15.1M 512 256 3
2018-11-04 16:08:34 1.0M 20.0M 1024 512 1
2018-11-04 16:08:34 1.0M 20.0M 256 2048 1
2018-11-04 16:08:34 1.0M 20.0M 512 1024 1
2018-11-04 16:08:34 1.0M 20.0M 2048 256 1
2018-11-04 16:08:34 1.1M 22.5M 256 256 9
2018-11-04 16:08:34 1.2M 24.9M 256 512 5
2018-11-04 16:08:34 1.2M 24.9M 512 256 5
2018-11-04 16:08:34 1.5M 29.7M 1024 256 3
2018-11-04 16:08:34 1.5M 29.7M 256 1024 3
2018-11-04 16:08:35 1.5M 29.7M 512 512 3
2018-11-04 16:08:35 2.0M 39.3M 1024 1024 1
2018-11-04 16:08:35 2.0M 39.3M 512 2048 1
2018-11-04 16:08:35 2.0M 39.3M 2048 512 1
2018-11-04 16:08:35 2.0M 39.3M 4096 256 1
2018-11-04 16:08:35 2.2M 44.1M 256 512 9
2018-11-04 16:08:35 2.2M 44.1M 512 256 9
2018-11-04 16:08:35 2.5M 48.9M 1024 256 5
2018-11-04 16:08:35 2.5M 48.9M 256 1024 5
2018-11-04 16:08:35 2.5M 48.9M 512 512 5
2018-11-04 16:08:35 3.0M 58.4M 1024 512 3
2018-11-04 16:08:35 3.0M 58.4M 256 2048 3
2018-11-04 16:08:35 3.0M 58.4M 512 1024 3
2018-11-04 16:08:35 3.0M 58.4M 2048 256 3
2018-11-04 16:08:35 4.0M 77.3M 1024 2048 1
2018-11-04 16:08:35 4.0M 77.3M 2048 1024 1
2018-11-04 16:08:35 4.0M 77.3M 4096 512 1
2018-11-04 16:08:35 4.5M 86.7M 1024 256 9
2018-11-04 16:08:35 4.5M 86.7M 256 1024 9
2018-11-04 16:08:35 4.5M 86.7M 512 512 9
2018-11-04 16:08:35 5.0M 96.1M 1024 512 5
2018-11-04 16:08:35 5.0M 96.1M 256 2048 5
2018-11-04 16:08:35 5.0M 96.1M 512 1024 5
2018-11-04 16:08:35 5.0M 96.1M 2048 256 5
2018-11-04 16:08:35 6.0M 114.7M 1024 1024 3
2018-11-04 16:08:35 6.0M 114.7M 512 2048 3
2018-11-04 16:08:35 6.0M 114.7M 2048 512 3
2018-11-04 16:08:35 6.0M 114.7M 4096 256 3
2018-11-04 16:08:35 8.0M 151.8M 2048 2048 1
2018-11-04 16:08:35 8.0M 151.8M 4096 1024 1
2018-11-04 16:08:35 9.0M 170.3M 1024 512 9
2018-11-04 16:08:35 9.0M 170.3M 256 2048 9
2018-11-04 16:08:35 9.0M 170.3M 512 1024 9
2018-11-04 16:08:35 9.0M 170.3M 2048 256 9
2018-11-04 16:08:35 10.0M 188.7M 1024 1024 5
2018-11-04 16:08:35 10.0M 188.7M 512 2048 5
2018-11-04 16:08:35 10.0M 188.7M 2048 512 5
2018-11-04 16:08:35 10.0M 188.7M 4096 256 5
2018-11-04 16:08:35 12.0M 225.3M 1024 2048 3
2018-11-04 16:08:35 12.0M 225.3M 2048 1024 3
2018-11-04 16:08:35 12.0M 225.3M 4096 512 3
2018-11-04 16:08:35 16.0M 298.1M 4096 2048 1
2018-11-04 16:08:35 18.0M 334.3M 1024 1024 9
2018-11-04 16:08:35 18.0M 334.3M 512 2048 9
2018-11-04 16:08:35 18.0M 334.3M 2048 512 9
2018-11-04 16:08:35 18.0M 334.3M 4096 256 9
2018-11-04 16:08:35 20.0M 370.4M 1024 2048 5
2018-11-04 16:08:35 20.0M 370.4M 2048 1024 5
2018-11-04 16:08:35 20.0M 370.4M 4096 512 5
2018-11-04 16:08:35 24.0M 442.3M 2048 2048 3
2018-11-04 16:08:35 24.0M 442.3M 4096 1024 3
2018-11-04 16:08:35 36.0M 656.2M 1024 2048 9
2018-11-04 16:08:35 36.0M 656.2M 2048 1024 9
2018-11-04 16:08:35 36.0M 656.2M 4096 512 9
2018-11-04 16:08:35 40.0M 727.0M 2048 2048 5
2018-11-04 16:08:35 40.0M 727.0M 4096 1024 5
2018-11-04 16:08:35 48.0M 868.1M 4096 2048 3
2018-11-04 16:08:35 72.0M 1287.5M 2048 2048 9
2018-11-04 16:08:35 72.0M 1287.5M 4096 1024 9
2018-11-04 16:08:35 80.0M 1426.4M 4096 2048 5
2018-11-04 16:08:35 144.0M 2525.2M 4096 2048 9
Code:
...
2018-11-04 17:05:29 6972593 5960000 85.47%; 0.38 ms/sq, 0 MULs; ETA 0d 00:06; 9192684b7c1359cd
2018-11-04 17:05:33 6972593 5970000 85.62%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; c2f9539990824bd3
2018-11-04 17:05:36 6972593 5980000 85.76%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; 0e55f43c273e071f
2018-11-04 17:05:40 6972593 5990000 85.91%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; 1b238cbce00977ec
2018-11-04 17:05:44 6972593 6000000 86.05%; 0.38 ms/sq, 0 MULs; ETA 0d 00:06; 226f17e463c15782
2018-11-04 17:05:48 6972593 6010000 86.19%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; 37cb92ee936c55d2
2018-11-04 17:05:51 6972593 6020000 86.34%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; c1966294670fbb2f
2018-11-04 17:05:55 6972593 6030000 86.48%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; f03d90475b5f5672
2018-11-04 17:05:59 6972593 6040000 86.62%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; 3130d3e8833a08d3
2018-11-04 17:06:03 6972593 6050000 86.77%; 0.38 ms/sq, 0 MULs; ETA 0d 00:06; fd70900ff37a05c4
2018-11-04 17:06:06 6972593 6060000 86.91%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; a0ececf155185dba
2018-11-04 17:06:10 6972593 6070000 87.05%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; e869127794b701de
2018-11-04 17:06:14 6972593 OK 6080000 87.20%; 0.37 ms/sq, 0 MULs; ETA 0d 00:06; d11c156803bb5922 (check 0.19s)
...
Code:
{"exponent":"6972593", "worktype":"PRP,P-1", "status":"P", "program":{"name":"gpuowl", "version":"5.0-9c13870"}, "timestamp":"2018-11-04 23:11:49 UTC", "aid":"0", "fft-length":393216, "res64":"bc16906ca9e08ff7", "b2":"1440000", "base":{"b1":"80000", "bias":{"2":19}, "res64":"bc16906ca9e08ff7"}}
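The bits/word figures in the logs are consistent with the exponent divided by the FFT length in words, so the load on a given FFT size is easy to estimate. The Python sketch below rests on that reading of the logs, not on the gpuowl source; it takes W, H and M from the -DWIDTH, -DSMALL_HEIGHT and -DMIDDLE values where the logs show them. For 6972593 it uses 256, 256, 3, which is the only 0.4M row in the listing and matches the reported fft-length of 393216. Both function names are made up.

```python
def fft_words(width, height, middle):
    """FFT length in words, assuming length = 2 * W * H * M (inferred from the logs)."""
    return 2 * width * height * middle

def bits_per_word(exponent, fft_len):
    """Average number of bits of the exponent carried by each FFT word."""
    return exponent / fft_len

# (exponent, W, H, M); the last row's W, H, M are inferred from the -list fft table.
cases = [(89000167, 1024, 512, 5), (216091, 256, 256, 1), (6972593, 256, 256, 3)]
for exp, w, h, m in cases:
    n = fft_words(w, h, m)
    print(exp, n, f"{bits_per_word(exp, n):.2f} bits/word")
```

The first two cases reproduce the 16.98 and 1.65 bits/word printed in the logs, which is why the very small exponents tried above run at such a low bits/word on the 128K FFT.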
|
|
|
|