#2069 | |
"Mihai Preda"
Apr 2015
3·11·41 Posts |
Quote:
This same change has been committed to gpuowl, so this should be fixed after a re-checkout.
#2070 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,009 Posts |
v6.11-255 on Win7 x64: the RX550 did not like the default FFT (1K:10:256) at all and exited with an EE on load. The `-fft +1` style syntax is apparently gone; if used, gpuowl fails in an interesting way, selecting a 128K FFT and exiting with "FFT size too small". A quick read of the help output set it right, and it is now on its way with the second FFT specification for that length (1K:5:512).
Code:
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>title gpuowl-v6.11-255-g81fa7c3/rx550
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>gpuowl-win
2020-04-10 12:09:43 gpuowl v6.11-255-g81fa7c3
2020-04-10 12:09:43 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM
2020-04-10 12:09:43 device 1, unique id ''
2020-04-10 12:09:43 condorella/rx550 94741139 FFT: 5M 1K:10:256 (18.07 bpw)
2020-04-10 12:09:43 condorella/rx550 Expected maximum carry32: 461E0000
2020-04-10 12:09:46 condorella/rx550 OpenCL args "-DEXP=94741139u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xf.3cd1fc0411148p-3 -DIWEIGHT_STEP=0x8.66790bf53aca8p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-10 12:09:53 condorella/rx550 OpenCL compilation in 6.96 s
2020-04-10 12:10:09 condorella/rx550 94741139 EE 0 loaded: blockSize 400, 0000000000000000 (expected 0000000000000003)
2020-04-10 12:10:09 condorella/rx550 Exiting because "error on load"
2020-04-10 12:10:09 condorella/rx550 Bye
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>g611
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>title gpuowl-v6.11-255-g81fa7c3/rx550
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>gpuowl-win
2020-04-10 12:10:51 gpuowl v6.11-255-g81fa7c3
2020-04-10 12:10:51 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM -fft +1
2020-04-10 12:10:51 device 1, unique id ''
2020-04-10 12:10:51 condorella/rx550 94741139 FFT: 128K 256:1:256 (722.82 bpw)
2020-04-10 12:10:51 condorella/rx550 FFT size too small for exponent (722.82 bits/word).
2020-04-10 12:10:51 condorella/rx550 Exiting because "FFT size too small"
2020-04-10 12:10:51 condorella/rx550 Bye
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>g611
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>title gpuowl-v6.11-255-g81fa7c3/rx550
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>gpuowl-win
2020-04-10 12:12:45 gpuowl v6.11-255-g81fa7c3
2020-04-10 12:12:45 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM -fft 1K:5:512
2020-04-10 12:12:45 device 1, unique id ''
2020-04-10 12:12:45 condorella/rx550 94741139 FFT: 5M 1K:5:512 (18.07 bpw)
2020-04-10 12:12:45 condorella/rx550 Expected maximum carry32: 461E0000
2020-04-10 12:12:47 condorella/rx550 OpenCL args "-DEXP=94741139u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=5u -DWEIGHT_STEP=0xf.3cd1fc0411148p-3 -DIWEIGHT_STEP=0x8.66790bf53aca8p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-10 12:12:55 condorella/rx550 OpenCL compilation in 8.18 s
2020-04-10 12:13:02 condorella/rx550 94741139 OK 0 loaded: blockSize 400, 0000000000000003
2020-04-10 12:13:19 condorella/rx550 94741139 OK 800 0.00%; 14229 us/it; ETA 15d 14:28; 738c4e015132f834 (check 5.86s)
2020-04-10 13:00:54 condorella/rx550 94741139 OK 200000 0.21%; 14317 us/it; ETA 15d 15:59; e0463c77c58b0105 (check 5.87s)
2020-04-10 13:48:40 condorella/rx550 94741139 OK 400000 0.42%; 14319 us/it; ETA 15d 15:14; 5b1fe09cbecb5e40 (check 5.89s)
2020-04-10 14:36:27 condorella/rx550 94741139 OK 600000 0.63%; 14321 us/it; ETA 15d 14:29; 5f62cf32c024e1a2 (check 5.87s)
2020-04-10 15:24:15 condorella/rx550 94741139 OK 800000 0.84%; 14322 us/it; ETA 15d 13:44; 3dd122479d7dde25 (check 5.88s)
2020-04-10 16:12:02 condorella/rx550 94741139 OK 1000000 1.06%; 14319 us/it; ETA 15d 12:52; e44ae2f6c9046662 (check 5.87s)
2020-04-10 16:59:49 condorella/rx550 94741139 OK 1200000 1.27%; 14320 us/it; ETA 15d 12:06; b3a0108ad221f8fd (check 5.88s)
2020-04-10 17:47:36 condorella/rx550 94741139 OK 1400000 1.48%; 14319 us/it; ETA 15d 11:17; 6077a7f20c7ee45c (check 5.88s)
2020-04-10 17:49:53 condorella/rx550 Stopping, please wait..
2020-04-10 17:50:05 condorella/rx550 94741139 OK 1410000 1.49%; 14328 us/it; ETA 15d 11:28; e02e0d0dca18d9f5 (check 5.87s)
2020-04-10 17:50:05 condorella/rx550 Exiting because "stop requested"
2020-04-10 17:50:05 condorella/rx550 Bye
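As an aside on the "bpw" figures in the log above: they appear to be simply the exponent divided by the FFT length in words. The Python sketch below reproduces the logged values under one assumption that is inferred from the logs and not confirmed by the gpuowl source: that a specification W:M:H corresponds to 2·W·M·H words (1K:10:256 and 1K:5:512 both give 5M; 256:1:256 gives 128K). The function names are made up for illustration.

```python
def parse_size(token: str) -> int:
    """Turn '1K' into 1024 and '256' into 256 (the suffix style used in the log)."""
    token = token.strip().upper()
    if token.endswith("K"):
        return int(token[:-1]) * 1024
    return int(token)


def fft_words(spec: str) -> int:
    """FFT length in words for a 'width:middle:height' spec.

    Assumption inferred from the logged sizes: total = 2 * W * M * H.
    """
    width, middle, height = (parse_size(t) for t in spec.split(":"))
    return 2 * width * middle * height


def bits_per_word(exponent: int, spec: str) -> float:
    """Average number of bits of the exponent carried by each FFT word."""
    return exponent / fft_words(spec)


if __name__ == "__main__":
    for spec in ("1K:10:256", "1K:5:512", "256:1:256"):
        bpw = bits_per_word(94741139, spec)
        print(f"{spec:>10}  {fft_words(spec):>8} words  {bpw:.2f} bpw")
```

Under that assumption, `-fft +1` ending up at 256:1:256 puts more than 700 bits in each word, which matches the "FFT size too small" exit shown above.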
#2071 |
Romulan Interpreter
Jun 2011
Thailand
222138 Posts |
Could you (or kracker) please rebuild with the latest change from preda, and repost?
(I am not yet able to build gpuowl; I mean, I haven't tried yet, but I will give it a few tests as long as it can do LL.)
Last fiddled with by LaurV on 2020-04-11 at 14:27
#2072 | |
Einyen
Dec 2003
Denmark
2×3×7×73 Posts |
Quote:
83174053 83180563
Last fiddled with by James Heinrich on 2020-04-11 at 15:01 Reason: fixed broken exponent links
#2073 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,009 Posts |
Latest available commit as of ~12 minutes before this post. Usual shower of warnings in the build log; help output included; no testing performed. Enjoy, and please report any issues here.
Last fiddled with by kriesel on 2020-04-11 at 16:05
#2074 |
"Mr. Meeseeks"
Jan 2012
California, USA
32·241 Posts |
Just now, I made the very stupid mistake of not checking a few DC residues before submitting a batch...
I can redo them, or whatever is best.
Nvidia P100 in colab.
gpuowl v6.11-252-gaf403e2
OUT_SIZEX=16,IN_SIZEX=8,IN_SPACING=8
Code:
51509873
51491101
51491059
51490883
51490843
51491267
51491119
51509257
51490799
51490723
51490343
51490339
51508747
58650941
51488837
51491983
51491773
51491731
#2075 | ||
"Mihai Preda"
Apr 2015
3×11×41 Posts |
It seems the problem is associated with the setup
Quote:
One way to check whether the FFT is broken is to run a few PRP iterations before starting the LL, e.g.
./gpuowl -prp 51509873
Quote:
#2076 |
"Mr. Meeseeks"
Jan 2012
California, USA
32·241 Posts |
With the previous settings I'm getting an immediate EE... it seems to work with no -use arguments.
Code:
/content/drive/My Drive/gpuowl-colab
2020-04-12 02:08:53 gpuowl v6.11-252-gaf403e2
2020-04-12 02:08:53 config: -user kracker -cpu pce
2020-04-12 02:08:53 config: -ll 51509873
2020-04-12 02:08:53 device 0, unique id ''
2020-04-12 02:08:53 pce 51509873 FFT: 2.75M 256:11:512 (17.86 bpw)
2020-04-12 02:08:53 pce Expected maximum carry32: 2B810000
2020-04-12 02:08:54 pce OpenCL args "-DEXP=51509873u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DWEIGHT_STEP=0x1.19794ea80bcb4p+0 -DIWEIGHT_STEP=0x1.d1a9c3958d155p-1 -DWEIGHT_BIGSTEP=0x1.ae89f995ad3adp+0 -DIWEIGHT_BIGSTEP=0x1.306fe0a31b715p-1 -DPM1=0 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-12 02:08:57 pce
2020-04-12 02:08:57 pce OpenCL compilation in 2.80 s
2020-04-12 02:08:57 pce 51509873 LL 0 loaded: 0000000000000004
2020-04-12 02:09:48 pce 51509873 LL 100000 0.19%; 509 us/it; ETA 0d 07:16; d4bf953f17f5dd56
2020-04-12 02:10:15 pce Stopping, please wait..
2020-04-12 02:10:15 pce 51509873 LL 154000 0.30%; 510 us/it; ETA 0d 07:17; be98350bc1fe8687
2020-04-12 02:10:15 pce Exiting because "stop requested"
2020-04-12 02:10:15 pce Bye
Code:
/content/drive/My Drive/gpuowl-colab
2020-04-12 02:12:19 gpuowl v6.11-252-gaf403e2
2020-04-12 02:12:19 config: -user kracker -cpu pce
2020-04-12 02:12:19 config: -use OUT_SIZEX=16,IN_SIZEX=8,IN_SPACING=8 -ll 51509873
2020-04-12 02:12:19 device 0, unique id ''
2020-04-12 02:12:19 pce 51509873 FFT: 2.75M 256:11:512 (17.86 bpw)
2020-04-12 02:12:19 pce Expected maximum carry32: 2B810000
2020-04-12 02:12:19 pce OpenCL args "-DEXP=51509873u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DWEIGHT_STEP=0x1.19794ea80bcb4p+0 -DIWEIGHT_STEP=0x1.d1a9c3958d155p-1 -DWEIGHT_BIGSTEP=0x1.ae89f995ad3adp+0 -DIWEIGHT_BIGSTEP=0x1.306fe0a31b715p-1 -DPM1=0 -DIN_SIZEX=8 -DIN_SPACING=8 -DOUT_SIZEX=16 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-12 02:12:19 pce
2020-04-12 02:12:19 pce OpenCL compilation in 0.01 s
2020-04-12 02:12:19 pce 51509873 LL 0 loaded: 0000000000000004
2020-04-12 02:13:09 pce 51509873 LL 100000 0.19%; 496 us/it; ETA 0d 07:05; a2891146b3ded4b9
2020-04-12 02:13:16 pce Stopping, please wait..
2020-04-12 02:13:17 pce 51509873 LL 115000 0.22%; 502 us/it; ETA 0d 07:10; 42848d9cb649a731
2020-04-12 02:13:17 pce Exiting because "stop requested"
2020-04-12 02:13:17 pce Bye
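The two runs above cover the same exponent but report different res64 values at iteration 100000, which is the quickest sign that at least one configuration is computing garbage. Spotting that by eye gets tedious, so here is a rough Python sketch that pulls (exponent, iteration, res64) out of two logs and lists the disagreements. The regular expression is an assumption based only on the line layout shown in this thread, and a mismatch does not by itself say which of the two runs is the wrong one.

```python
import re

# Matches progress lines such as
#   "... pce 51509873 LL 100000 0.19%; 509 us/it; ETA 0d 07:16; d4bf953f17f5dd56"
# Layout assumed from the logs posted in this thread only.
PROGRESS = re.compile(
    r"\b(\d+) (LL|OK|EE)\s+(\d+)\s+[\d.]+%;.*?;\s*([0-9a-f]{16})\b"
)


def residues(log_text):
    """Map (exponent, iteration) -> res64 for every progress line found."""
    found = {}
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            exponent, _kind, iteration, res64 = match.groups()
            found[(int(exponent), int(iteration))] = res64
    return found


def mismatches(log_a, log_b):
    """Return (key, res64_a, res64_b) where both logs report the same point but differ."""
    a, b = residues(log_a), residues(log_b)
    return sorted(
        (key, a[key], b[key]) for key in a.keys() & b.keys() if a[key] != b[key]
    )


LOG_DEFAULT = """\
2020-04-12 02:08:57 pce 51509873 LL 0 loaded: 0000000000000004
2020-04-12 02:09:48 pce 51509873 LL 100000 0.19%; 509 us/it; ETA 0d 07:16; d4bf953f17f5dd56
2020-04-12 02:10:15 pce 51509873 LL 154000 0.30%; 510 us/it; ETA 0d 07:17; be98350bc1fe8687
"""

LOG_WITH_USE = """\
2020-04-12 02:12:19 pce 51509873 LL 0 loaded: 0000000000000004
2020-04-12 02:13:09 pce 51509873 LL 100000 0.19%; 496 us/it; ETA 0d 07:05; a2891146b3ded4b9
2020-04-12 02:13:17 pce 51509873 LL 115000 0.22%; 502 us/it; ETA 0d 07:10; 42848d9cb649a731
"""

if __name__ == "__main__":
    for (exponent, iteration), res_a, res_b in mismatches(LOG_DEFAULT, LOG_WITH_USE):
        print(f"{exponent} @ {iteration}: {res_a} != {res_b}")
```

Only iteration 100000 is common to both excerpts, so that is the only point that can be compared here; logging both runs at the same interval and stopping them at the same iteration gives more points to check.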
#2077 |
Einyen
Dec 2003
Denmark
2·3·7·73 Posts |
I created a script to test the speed of a bunch of combinations of the OUT_WG, OUT_SIZEX, OUT_SPACING, IN_WG, IN_SIZEX and IN_SPACING variables for the LL test.
It seems that for the LL test nothing blocks combinations that will not work; instead, the residue gets zeroed. For example, these:
Code:
./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=256,OUT_SIZEX=4,OUT_SPACING=128,IN_WG=64,IN_SIZEX=128,IN_SPACING=4
./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=256,OUT_SIZEX=4,OUT_SPACING=128,IN_WG=64,IN_SIZEX=128,IN_SPACING=128
./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=64,OUT_SIZEX=128,OUT_SPACING=8,IN_WG=64,IN_SIZEX=128,IN_SPACING=64
./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=64,OUT_SIZEX=128,OUT_SPACING=128,IN_WG=64,IN_SIZEX=128,IN_SPACING=64
Output:
2020-04-13 22:32:34 Tesla P100-PCIE-16GB-0 OpenCL compilation in 2.22 s
2020-04-13 22:32:34 Tesla P100-PCIE-16GB-0 95000011 LL 0 loaded: 0000000000000004
2020-04-13 22:32:41 Tesla P100-PCIE-16GB-0 95000011 LL 10000 0.01%; 641 us/it; ETA 0d 16:54; fffffffffffffffd
2020-04-13 22:32:43 Tesla P100-PCIE-16GB-0 Stopping, please wait..
2020-04-13 22:32:43 Tesla P100-PCIE-16GB-0 95000011 LL 14000 0.01%; 657 us/it; ETA 0d 17:20; fffffffffffffffd
Last fiddled with by ATH on 2020-04-13 at 22:46
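The script itself was not posted, so the following is only a guess at its general shape: a Python sketch that enumerates combinations of the six tuning variables and builds command lines in the same form as the ones above. The candidate value lists are hypothetical (taken from the values that happen to appear in the four example commands, not from any list of what gpuowl accepts), and the sketch only prints the commands; it does not run them.

```python
from itertools import product

# Hypothetical candidate values, for illustration only. They are just the
# values that appear in the example commands in this post.
CANDIDATES = {
    "OUT_WG": [64, 256],
    "OUT_SIZEX": [4, 128],
    "OUT_SPACING": [8, 128],
    "IN_WG": [64],
    "IN_SIZEX": [128],
    "IN_SPACING": [4, 64, 128],
}


def use_strings(candidates, fixed=("CARRY32", "ORIG_SLOWTRIG")):
    """Yield one comma-separated -use argument per combination of values."""
    names = list(candidates)
    for values in product(*(candidates[name] for name in names)):
        pairs = [f"{name}={value}" for name, value in zip(names, values)]
        yield ",".join(list(fixed) + pairs)


def command(use, exponent=95000011, binary="./gpuowlLL"):
    """Build a command line in the same form as the ones quoted in this post."""
    return f"{binary} -ll {exponent} -iters 30000 -log 10000 -use {use}"


if __name__ == "__main__":
    for use in use_strings(CANDIDATES):
        print(command(use))
```

Since nothing rejects bad combinations on the LL path, any timing gathered this way is only meaningful for combinations whose residues have been checked separately.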
#2078 | |
"Mihai Preda"
Apr 2015
3·11·41 Posts |
LL is "naked", no error check at all. Please try/tune combinations on PRP, which will help detect the invalid ones. Only after validation with PRP use any combination for LL.
Quote:
#2079 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,009 Posts |
Quote:
0x0000000000000000, 0x0000000000000002, 0xffffffff80000000, 0xfffffffffffffffd, and excessive roundoff error. Gpuowl checks bits/word. A memory copy failure could give 0; the ±2 values come from the residue getting zeroed, followed by the -2 and the squaring; the 33-bits-set value 0xffffffff80000000 comes from using far too short an FFT length, as was seen in both cllucas 1.02 and CUDALucas v2.03.
https://mersenneforum.org/showpost.p...&postcount=232
https://mersenneforum.org/showpost.p...&postcount=299
Last fiddled with by kriesel on 2020-04-14 at 03:42
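To make the list of suspect residues concrete, here is a small Python sketch. The first part flags a res64 that matches one of the four values listed above; the set and the function names are illustrative and not taken from any of the programs mentioned. The second part shows where the ±2 values come from: starting the LL recurrence s → s² − 2 (mod 2^p − 1) from a zeroed residue gives −2, whose low 64 bits are fffffffffffffffd, and then 2. The exponent 127 is used only because it keeps the numbers small.

```python
# The four suspect res64 values listed in the post above.
KNOWN_BAD_RES64 = {
    0x0000000000000000,
    0x0000000000000002,
    0xFFFFFFFF80000000,
    0xFFFFFFFFFFFFFFFD,
}


def is_suspicious(res64_hex: str) -> bool:
    """True if a 16-hex-digit res64 matches one of the known-bad values."""
    return int(res64_hex, 16) in KNOWN_BAD_RES64


def ll_res64_from_zero(p: int, steps: int) -> list:
    """Low 64 bits of s after each LL step s -> s*s - 2 (mod 2**p - 1),
    starting from a zeroed residue (s = 0) instead of the usual 4."""
    modulus = (1 << p) - 1
    s = 0
    out = []
    for _ in range(steps):
        s = (s * s - 2) % modulus
        out.append(s & 0xFFFFFFFFFFFFFFFF)
    return out


if __name__ == "__main__":
    print([f"{r:016x}" for r in ll_res64_from_zero(127, 2)])
    for res in ("fffffffffffffffd", "d4bf953f17f5dd56"):
        print(res, "suspicious" if is_suspicious(res) else "not in the bad list")
```

A residue that is absent from the list is not thereby proven correct; the check only catches these particular failure signatures.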