#144 |
"Mihai Preda"
Apr 2015
2513₈ Posts |
#145 |
"Mihai Preda"
Apr 2015
5·271 Posts |
For those who can, it may be a good idea to use -proof 9, which enables validation of the proof. The cost of the validation is about 0.2%, which is small (on the order of 2-3 minutes on R7), and in return it makes sure that the proof is good before beaming it up to the server.
|
#146 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1001110101101₂ Posts |
Quote:
123M PRP, v7.0-40, ok
150M PRP, v6.11-364, ok
177.8M LL, v6.11-364, ok
181M LL, v6.11-380, ok
190M LL, v6.11-364, ok
320M PRP, v7.0-40, ok
480M PRP, v7.1-1, ok
554M PRP, v7.1-1, ok
558M PRP/P-1, v7.1-1, ok
560M PRP/P-1, v7.1-1, ok
561M PRP/P-1, v7.1-1, out of resources error
562.6M PRP/P-1, v7.1-1, out of resources error
600M PRP/P-1, v7.1-1, out of resources error
642M PRP, v6.11-364, out of resources error
764M PRP, v6.11-364, out of resources error
(previously reported below:)
843M PRP/P-1, v7.0-35, out of resources error
957M PRP/P-1, v7.0-35, out of resources error
Last fiddled with by Prime95 on 2020-10-25 at 16:54 |
|
#147 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×23×73 Posts |
Gpuowl FFT limits from the help.txt files
v7.1-1:
FFT 30M [ 47.19M - 560.64M] 1K:15:1K 4K:15:256
FFT 32M [ 50.33M - 599.62M] 4K:8:512 4K:4:1K
V6.11-364:
FFT 30M [ 47.19M - 560.64M] 1K:15:1K 4K:15:256
FFT 32M [ 50.33M - 599.62M] 4K:8:512 4K:4:1K

560.63M B1=3200000,B2=140000000;PRP=0,1,2,560630051,-1,83,2 STATS ok at 30M fft, maxalloc 14G, Radeon VII
560.65M B1=3200000,B2=140000000;PRP=0,1,2,560650067,-1,83,2 STATS at 32M fail with OUT_OF_RESOURCES error
560.63M forced to 32M with -fft 32M in config.txt: fail with OUT_OF_RESOURCES error

-use STATS OUT_OF_RESOURCES fatal error appears to relate to fft size 32M or larger |
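As a rough illustration of why 560.63M and 560.65M land on different FFT lengths, here is a minimal Python sketch. It assumes the default selection is simply the smallest listed FFT whose upper exponent limit covers the exponent. That rule is a guess consistent with the two runs reported above, not something taken from the gpuowl source, and the limits are the rounded values from help.txt.

```python
# Upper exponent limits copied from the help.txt excerpt (rounded to 0.01M there).
# Each entry: (FFT label, upper exponent limit).
FFT_LIMITS = [
    ("30M", 560_640_000),  # FFT 30M [ 47.19M - 560.64M]
    ("32M", 599_620_000),  # FFT 32M [ 50.33M - 599.62M]
]


def default_fft(exponent):
    """Return the first FFT label whose upper limit covers the exponent.

    ASSUMPTION: 'smallest FFT that fits' is a hypothetical selection rule
    used for illustration only.
    """
    for label, upper in FFT_LIMITS:
        if exponent <= upper:
            return label
    return None  # larger than anything in this two-row excerpt


if __name__ == "__main__":
    for exponent in (560_630_051, 560_650_067):
        print(exponent, "->", default_fft(exponent))
```

Under that assumption the two exponents differ by only about 20,000, yet they sit on opposite sides of the 30M limit, which matches the ok/fail split reported.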
#148 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
13AD₁₆ Posts |
... and that behavior transition coincides with the change to 4K head, since I was testing with default fft selection.
Code:
FFT 30M [ 47.19M - 560.64M] 1K:15:1K 4K:15:256
FFT 32M [ 50.33M - 599.62M] 4K:8:512 4K:4:1K
FFT 36M [ 56.62M - 671.04M] 4K:9:512
FFT 40M [ 62.91M - 743.74M] 4K:10:512 4K:5:1K
FFT 44M [ 69.21M - 816.39M] 4K:11:512
FFT 48M [ 75.50M - 889.11M] 4K:12:512 4K:6:1K
FFT 52M [ 81.79M - 961.97M] 4K:13:512
FFT 56M [ 88.08M - 1033.20M] 4K:14:512 4K:7:1K
FFT 60M [ 94.37M - 1103.74M] 4K:15:512
FFT 64M [100.66M - 1177.31M] 4K:8:1K
FFT 72M [113.25M - 1321.02M] 4K:9:1K
FFT 80M [125.83M - 1464.31M] 4K:10:1K
FFT 88M [138.41M - 1607.03M] 4K:11:1K
FFT 96M [150.99M - 1751.79M] 4K:12:1K
FFT 104M [163.58M - 1893.52M] 4K:13:1K
FFT 112M [176.16M - 2035.14M] 4K:14:1K
FFT 120M [188.74M - 2172.36M] 4K:15:1K

123M 1K:13:256
150M 1K:8:512
177M, 181M, 190M 1K:10:512
320M 1K:9:1K
480M 1K:13:1K
554M, 556M, 560M 1K:15:1K
1K head; 8, 9, 10, 13, 15 middle; 256, 512, or 1K tail combinations tried ok;

561M, 562.6M 4K:8:512
600M, 642M 4K:9:512
764M 4K:11:512
843M 4K:12:512
957M 4K:13:512
4K head; 8, 9, 11, 12 or 13 middle; 512 tail combinations tried failed.

The common factor seems to be the 4K head. Some middles fall in both lists, as does the 512 tail. That would mean all fft lengths from 32M to 120M could have the -use STATS OUT_OF_RESOURCES issue.
(edit) But it also means some lengths lower than 32M could have it. This is confirmed by running a single test with 4K:15:256 on the same exponent that was ok with the default fft:
Code:
2020-10-26 12:16:31 gpuowl v7.1-1-g0f73d04
2020-10-26 12:16:31 config: -user kriesel -cpu asr2/radeonvii0 -d 1 -maxAlloc 14G -proof 9 -log 100000 -use NO_ASM,STATS -fft 4K:15:256
2020-10-26 12:16:31 device 1, unique id ''
2020-10-26 12:16:31 asr2/radeonvii0 560630051 FFT: 30M 4K:15:256 (17.82 bpw)
2020-10-26 12:16:36 asr2/radeonvii0 560630051 OpenCL args "-DEXP=560630051u -DWIDTH=4096u -DSMALL_HEIGHT=256u -DMIDDLE=15u -DAMDGPU=1 -DCARRY64=1 -DCARRYM64=1 -DMM_CHAIN=3u -DMM2_CHAIN=3u -DMAX_ACCURACY=1 -DULTRA_TRIG=1 -DWEIGHT_STEP_MINUS_1=0x8.681b5a84b24dp-6 -DIWEIGHT_STEP_MINUS_1=-0xe.dc7abc0d7b388p-7 -DNO_ASM=1 -DSTATS=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-26 12:16:41 asr2/radeonvii0 560630051 OpenCL compilation in 4.72 s
2020-10-26 12:16:43 asr2/radeonvii0 560630051 maxAlloc: 14.0 GB
2020-10-26 12:16:43 asr2/radeonvii0 560630051 P1(3.2M) 4617012 bits
2020-10-26 12:16:44 asr2/radeonvii0 560630051 Acquired memory lock 'memlock-1'
2020-10-26 12:16:44 asr2/radeonvii0 560630051 P1(3.2M) using 100 buffers
2020-10-26 12:16:46 asr2/radeonvii0 560630051 P1(3.2M) releasing 100 buffers
2020-10-26 12:16:46 asr2/radeonvii0 560630051 Released memory lock 'memlock-1'
2020-10-26 12:16:47 asr2/radeonvii0 Exception gpu_error: OUT_OF_RESOURCES carryFused at clwrap.cpp:325 run
2020-10-26 12:16:47 asr2/radeonvii0 Bye

It appears from extrapolation of FFT lengths' exponent limits that a modified gpuowl to attack P-1 or testing of F33 would require at least 480M length (perhaps as 4K:15:4K), and that would benefit from -use STATS checks being available both in development and end use.
Last fiddled with by kriesel on 2020-10-26 at 17:27 |
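To make the head:middle:tail notation easier to check against the logs, here is a small Python sketch. It assumes the FFT length in words is 2 × head × middle × tail, a relation inferred from log lines such as "FFT: 30M 4K:15:256 (17.82 bpw)" rather than from the gpuowl source. The RESULTS list is a subset of the combinations reported in this post, and all function names are made up for illustration.

```python
from collections import defaultdict

UNITS = {"K": 1024, "M": 1024 * 1024}


def parse_size(text):
    """Turn '4K', '1K' or '512' into an integer."""
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def fft_words(spec):
    """FFT length in words for a 'head:middle:tail' spec.

    ASSUMPTION: length = 2 * head * middle * tail, inferred from the logs.
    """
    head, middle, tail = spec.split(":")
    return 2 * parse_size(head) * int(middle) * parse_size(tail)


def bits_per_word(exponent, spec):
    return exponent / fft_words(spec)


# (spec, ran ok with -use STATS), a subset of the reports above.
RESULTS = [
    ("1K:13:256", True), ("1K:8:512", True), ("1K:10:512", True),
    ("1K:9:1K", True), ("1K:15:1K", True),
    ("4K:8:512", False), ("4K:9:512", False), ("4K:11:512", False),
    ("4K:12:512", False), ("4K:13:512", False), ("4K:15:256", False),
]


def outcomes_by_head(results):
    """Group the ok/fail outcomes by FFT head (the first field of the spec)."""
    grouped = defaultdict(set)
    for spec, ok in results:
        grouped[spec.split(":")[0]].add(ok)
    return dict(grouped)


if __name__ == "__main__":
    print(fft_words("4K:15:256") // UNITS["M"], "M words")
    print(f"{bits_per_word(560630051, '4K:15:256'):.2f} bpw")
    print(outcomes_by_head(RESULTS))
```

Grouping by head gives only successes for 1K and only failures for 4K in this subset, which is the pattern described above. Grouping by middle or tail instead would mix both outcomes.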
#149 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
11655₈ Posts |
Same asr2 system as in some previous reports, gpuowl-win 7.1-1
Code:
2020-10-28 10:15:50 asr2/radeonvii0 937156667 P1 Jacobi OK @ 12451200 d8514e45158c1e7d
2020-10-28 10:16:00 asr2/radeonvii0 937156667 OK 12500800 1.33% 9f520cdb7c997859 16645 us/it + check 6.66s + save 2.89s; ETA 178d 03:15
2020-10-28 10:16:00 asr2/radeonvii0 937156667 P2(8630000,258.9M) D=210, nBuf=22
Assertion failed: nBuf >= minBufsFor(D), file Pm1Plan.cpp, line 154

Resolved by changing to -maxAlloc 15G from 14G, at least for now. Larger exponents are likely to run into trouble.
The "Assertion failed" line is not present in gpuowl.log. It was captured from the console window.
If practical, it would be useful to have minBufs as a user option. Or switch to the next worktodo line when a roadblock is hit. Or both. |
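For readers checking the progress line in the log, the percentage and ETA can be roughly reproduced with simple arithmetic. The Python sketch below assumes the total iteration count is approximately the exponent and that the ETA is remaining iterations times the per-iteration time. Both are assumptions for illustration, not gpuowl's actual formula, and the function names are made up.

```python
def percent_done(exponent, iteration):
    """Progress in percent, assuming roughly `exponent` iterations in total."""
    return 100.0 * iteration / exponent


def eta_seconds(exponent, iteration, us_per_it):
    """Remaining time in seconds, ignoring check and save overhead."""
    remaining_iterations = exponent - iteration
    return remaining_iterations * us_per_it / 1_000_000


def split_days(seconds):
    """Split a duration in seconds into whole days, hours and minutes."""
    days, rest = divmod(int(seconds), 86_400)
    hours, rest = divmod(rest, 3_600)
    return days, hours, rest // 60


if __name__ == "__main__":
    # Values taken from the log line: exponent 937156667, iteration 12500800, 16645 us/it.
    print(f"{percent_done(937156667, 12500800):.2f}%")
    days, hours, minutes = split_days(eta_seconds(937156667, 12500800, 16645))
    print(f"about {days}d {hours:02d}h {minutes:02d}m")
```

With the logged values this lands at about 1.33% and roughly 178 days 3 hours, in line with the logged "ETA 178d 03:15".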
#150 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3·23·73 Posts |
How does one actually attempt to use the 2xSP?
There's nothing in the help output about it, or in the readme. (It seems a bit early to expect it to be selected automatically depending on gpu model...)
Last fiddled with by kriesel on 2020-10-28 at 19:07 |
#151 | |
"Mihai Preda"
Apr 2015
1355₁₀ Posts |
Quote:
|
|
#152 |
"Mihai Preda"
Apr 2015
5×271 Posts |
No, 2xSP is an experiment and can't be used for anything yet. Still a long way to go. (I was just measuring the precision that could be achieved *if* it were implemented.)
Last fiddled with by preda on 2020-10-28 at 19:51 |
#153 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
13AD₁₆ Posts |
2 things:
V7.1-11 P-1 hiccup, so says Bye. (Sure would be nice if it would progress to the next worktodo entry instead of quitting.)
Code:
2020-10-30 13:18:44 GpuOwl VERSION v7.1-11-g97cfbd2
2020-10-30 13:18:44 config: -user kriesel -cpu asr2/radeonvii2 -d 2 -maxAlloc 15G -proof 9 -log 100000 -use NO_ASM
2020-10-30 13:18:44 device 2, unique id ''
2020-10-30 13:18:44 asr2/radeonvii2 153021377 FFT: 8M 1K:8:512 (18.24 bpw)
2020-10-30 13:18:46 asr2/radeonvii2 153021377 OpenCL args "-DEXP=153021377u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=8u -DAMDGPU=1 -DCARRY64=1 -DCARRYM64=1 -DMM_CHAIN=1u -DMM2_CHAIN=1u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xb.10feac5431868p-4 -DIWEIGHT_STEP_MINUS_1=-0xd.156361ac01fe8p-5 -DNO_ASM=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-30 13:18:53 asr2/radeonvii2 153021377 OpenCL compilation in 7.71 s
2020-10-30 13:18:54 asr2/radeonvii2 153021377 maxAlloc: 15.0 GB
2020-10-30 13:18:54 asr2/radeonvii2 153021377 P1(1M) 1442134 bits
2020-10-30 13:18:54 asr2/radeonvii2 153021377 PRP starting from beginning
2020-10-30 13:18:54 asr2/radeonvii2 153021377 Acquired memory lock 'memlock-2'
2020-10-30 13:18:54 asr2/radeonvii2 153021377 P1(1M) using 460 buffers
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [0] 2d87ce26 != fffffffb
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [1] 7581b6da != 00000019
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [2] efca9779 != ffffff83
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [3] 03fe031c != 00000271
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [4] 21d014f0 != fffff3cb
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [5] 2100996a != 00003d09
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [6] 18280ed1 != fffeced3
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [7] fdd2f6a2 != 0005f5e1
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [8] 563a16a6 != ffe2329b
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [9] 1c97ee6a != 009502f9
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [10] 44fb8d20 != fd16f123
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [11] 0de14ac0 != 0e8d4a51
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [12] fd818931 != b73d8c6b
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [13] 058c6909 != 6bcc41e9
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [14] c4dfa66e != e502b673
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [15] 8739ef6f != 86f26fc1
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [16] 3f8bbf6e != 5d43d13b
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [17] 19ad23d9 != 2dace9d9
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [18] 6b9d9ac6 != 1b9f6ec3
2020-10-30 13:18:58 asr2/radeonvii2 153021377 [19] 729bd6ae != 75e2d631
2020-10-30 13:18:58 asr2/radeonvii2 153021377 fold() does not roundtrip
2020-10-30 13:18:58 asr2/radeonvii2 153021377 P1(1M) releasing 460 buffers
2020-10-30 13:18:59 asr2/radeonvii2 153021377 Released memory lock 'memlock-2'
2020-10-30 13:18:59 asr2/radeonvii2 Exiting because "fold roundtrip"
2020-10-30 13:18:59 asr2/radeonvii2 Bye

It's confirmed by a quick test that the -use STATS issue for the 4K fft head extends down to the minimum length that offers it, 6M.
Code:
2020-10-30 12:52:38 gpuowl v6.11-364-g36f4e2a
2020-10-30 12:52:38 config: -user kriesel -cpu asr2/radeonvii -d 1 -maxAlloc 15000 -use NO_ASM,STATS -fft 4K:3:256
2020-10-30 12:52:38 device 1, unique id ''
2020-10-30 12:52:38 asr2/radeonvii 100759339 FFT: 6M 4K:3:256 (16.02 bpw)
2020-10-30 12:52:38 asr2/radeonvii Expected maximum carry32: 12AD0000
2020-10-30 12:52:39 asr2/radeonvii OpenCL args "-DEXP=100759339u -DWIDTH=4096u -DSMALL_HEIGHT=256u -DMIDDLE=3u -DPM1=1 -DAMDGPU=1 -DWEIGHT_STEP_MINUS_1=0xf.a9c658667f95p-4 -DIWEIGHT_STEP_MINUS_1=-0xf.d46dc4b3339dp-5 -DNO_ASM=1 -DSTATS=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-30 12:52:44 asr2/radeonvii OpenCL compilation in 4.50 s
2020-10-30 12:52:45 asr2/radeonvii 100759339 P1 B1=1000000, B2=30000000; 1442134 bits; starting at 1001
2020-10-30 12:52:45 asr2/radeonvii Exception gpu_error: OUT_OF_RESOURCES carryFused at clwrap.cpp:325 run
2020-10-30 12:52:45 asr2/radeonvii Bye

2020-10-30 12:53:07 gpuowl v6.11-364-g36f4e2a
2020-10-30 12:53:07 config: -user kriesel -cpu asr2/radeonvii -d 1 -maxAlloc 15000 -use NO_ASM -fft 4K:3:256
2020-10-30 12:53:07 device 1, unique id ''
2020-10-30 12:53:07 asr2/radeonvii 100759339 FFT: 6M 4K:3:256 (16.02 bpw)
2020-10-30 12:53:07 asr2/radeonvii Expected maximum carry32: 12AD0000
2020-10-30 12:53:09 asr2/radeonvii OpenCL args "-DEXP=100759339u -DWIDTH=4096u -DSMALL_HEIGHT=256u -DMIDDLE=3u -DPM1=1 -DAMDGPU=1 -DWEIGHT_STEP_MINUS_1=0xf.a9c658667f95p-4 -DIWEIGHT_STEP_MINUS_1=-0xf.d46dc4b3339dp-5 -DNO_ASM=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-10-30 12:53:13 asr2/radeonvii OpenCL compilation in 4.26 s
2020-10-30 12:53:14 asr2/radeonvii 100759339 P1 B1=1000000, B2=30000000; 1442134 bits; starting at 1001
2020-10-30 12:53:24 asr2/radeonvii 100759339 P1 10000 0.69%; 1161 us/it; ETA 0d 00:28; 849bb19b9a9f4ce3
2020-10-30 12:53:36 asr2/radeonvii 100759339 P1 20000 1.39%; 1158 us/it; ETA 0d 00:27; 556f71d2a8cf201c
Last fiddled with by kriesel on 2020-10-30 at 18:42 |
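A side observation on the "fold() does not roundtrip" log, which may help when reading it: the right-hand values in the "X != Y" lines are not random. They match successive powers of -5 reduced modulo 2^32, so they are presumably the expected test pattern, while the left-hand values are what actually came back. This is read off the logged numbers, not taken from the gpuowl source. The Python sketch below checks it, with the list copied from the log.

```python
# Right-hand ("expected", presumably) values from the fold roundtrip log, indices 0..19.
EXPECTED = [
    0xfffffffb, 0x00000019, 0xffffff83, 0x00000271, 0xfffff3cb,
    0x00003d09, 0xfffeced3, 0x0005f5e1, 0xffe2329b, 0x009502f9,
    0xfd16f123, 0x0e8d4a51, 0xb73d8c6b, 0x6bcc41e9, 0xe502b673,
    0x86f26fc1, 0x5d43d13b, 0x2dace9d9, 0x1b9f6ec3, 0x75e2d631,
]


def pattern(index):
    """(-5)^(index+1) as an unsigned 32-bit word.

    ASSUMPTION: this formula is inferred from the logged values only.
    """
    return pow(-5, index + 1, 2 ** 32)


def matches_pattern(values):
    return all(pattern(i) == value for i, value in enumerate(values))


if __name__ == "__main__":
    print("expected column follows (-5)^(i+1) mod 2^32:", matches_pattern(EXPECTED))
```

If that reading is right, every one of the 20 words came back wrong, which points at the data path on that run rather than at a single bad word.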
#154 |
"Mihai Preda"
Apr 2015
5×271 Posts |
Please upgrade to v7.2 which fixes a proof generation bug.
|