mersenneforum.org  

Old 2020-04-11, 09:18   #2069
preda
 
"Mihai Preda"
Apr 2015
55F₁₆ Posts

Quote:
Originally Posted by kracker View Post
Tried to submit an LL result, got "Did not understand 1 lines."

{"exponent":"54907981", "worktype":"LL", "status":"C", "program":{"name":"gpuowl", "version":"v6.11-252-gaf403e2"}, "timestamp":"2020-04-10 14:05:02 UTC", "user":"kracker", "computer":"core", "aid":"xxxxxxxxxx", "fft-length":3145728, "res64":"xxxxxxxxxxxxx", "offset":0}
Please replace "offset" with "shift-count" and re-submit the result -- it should be accepted after this change.

This same change has been committed to gpuowl, so this should be fixed after a re-checkout.
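
For results already written by an older build, the rename can be scripted. This is a minimal sketch, assuming one JSON object per result line as in the quoted example; the helper name `fix_ll_result` is made up for illustration and is not part of gpuowl.

```python
import json


def fix_ll_result(line: str) -> str:
    """Rename the "offset" key to "shift-count" in one JSON result line."""
    result = json.loads(line)
    if "offset" in result and "shift-count" not in result:
        # Rebuild the dict so the key order stays the same and only the name changes.
        result = {("shift-count" if key == "offset" else key): value
                  for key, value in result.items()}
    # Separators chosen to mimic the spacing of the quoted line (an assumption
    # about what the server tolerates, not something confirmed in this thread).
    return json.dumps(result, separators=(", ", ":"))


sample = ('{"exponent":"54907981", "worktype":"LL", "status":"C", '
          '"res64":"xxxxxxxxxxxxx", "offset":0}')
print(fix_ll_result(sample))
```

Lines that already use "shift-count" pass through unchanged apart from re-serialization.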
Old 2020-04-11, 14:03   #2070
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1723₁₆ Posts

Quote:
Originally Posted by kriesel View Post
Latest commit build, build log, help output, etc.
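v6.11-255 on Win7 x64 with an RX550 did not like the default FFT (1K:10:256) at all; it exited with "error on load". The +1 etc. syntax is apparently gone, and if used, gpuowl fails in an interesting way: it picks a 128K FFT and exits with "FFT size too small". A quick read of the help output set it right, and it is now on its way with the second FFT specification for that FFT length (1K:5:512).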
Code:
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>title gpuowl-v6.11-255-g81fa7c3/rx550

C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>gpuowl-win
2020-04-10 12:09:43 gpuowl v6.11-255-g81fa7c3
2020-04-10 12:09:43 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM
2020-04-10 12:09:43 device 1, unique id ''
2020-04-10 12:09:43 condorella/rx550 94741139 FFT: 5M 1K:10:256 (18.07 bpw)
2020-04-10 12:09:43 condorella/rx550 Expected maximum carry32: 461E0000
2020-04-10 12:09:46 condorella/rx550 OpenCL args "-DEXP=94741139u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xf.3cd1fc0411148p-3 -DIWEIGHT_STEP=0x8.66790bf53aca8p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1  -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-10 12:09:53 condorella/rx550 OpenCL compilation in 6.96 s
2020-04-10 12:10:09 condorella/rx550 94741139 EE        0 loaded: blockSize 400, 0000000000000000 (expected 0000000000000003)
2020-04-10 12:10:09 condorella/rx550 Exiting because "error on load"
2020-04-10 12:10:09 condorella/rx550 Bye
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>g611

C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>title gpuowl-v6.11-255-g81fa7c3/rx550

C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>gpuowl-win
2020-04-10 12:10:51 gpuowl v6.11-255-g81fa7c3
2020-04-10 12:10:51 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM -fft +1
2020-04-10 12:10:51 device 1, unique id ''
2020-04-10 12:10:51 condorella/rx550 94741139 FFT: 128K 256:1:256 (722.82 bpw)
2020-04-10 12:10:51 condorella/rx550 FFT size too small for exponent (722.82 bits/word).
2020-04-10 12:10:51 condorella/rx550 Exiting because "FFT size too small"
2020-04-10 12:10:51 condorella/rx550 Bye
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>g611

C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>title gpuowl-v6.11-255-g81fa7c3/rx550

C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-255-g81fa7c3>gpuowl-win
2020-04-10 12:12:45 gpuowl v6.11-255-g81fa7c3
2020-04-10 12:12:45 config: -device 1 -user kriesel -cpu condorella/rx550 -yield -maxAlloc 3600 -use NO_ASM -fft 1K:5:512
2020-04-10 12:12:45 device 1, unique id ''
2020-04-10 12:12:45 condorella/rx550 94741139 FFT: 5M 1K:5:512 (18.07 bpw)
2020-04-10 12:12:45 condorella/rx550 Expected maximum carry32: 461E0000
2020-04-10 12:12:47 condorella/rx550 OpenCL args "-DEXP=94741139u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=5u -DWEIGHT_STEP=0xf.3cd1fc0411148p-3 -DIWEIGHT_STEP=0x8.66790bf53aca8p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1  -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-10 12:12:55 condorella/rx550 OpenCL compilation in 8.18 s
2020-04-10 12:13:02 condorella/rx550 94741139 OK        0 loaded: blockSize 400, 0000000000000003
2020-04-10 12:13:19 condorella/rx550 94741139 OK      800   0.00%; 14229 us/it; ETA 15d 14:28; 738c4e015132f834 (check 5.86s)
2020-04-10 13:00:54 condorella/rx550 94741139 OK   200000   0.21%; 14317 us/it; ETA 15d 15:59; e0463c77c58b0105 (check 5.87s)
2020-04-10 13:48:40 condorella/rx550 94741139 OK   400000   0.42%; 14319 us/it; ETA 15d 15:14; 5b1fe09cbecb5e40 (check 5.89s)
2020-04-10 14:36:27 condorella/rx550 94741139 OK   600000   0.63%; 14321 us/it; ETA 15d 14:29; 5f62cf32c024e1a2 (check 5.87s)
2020-04-10 15:24:15 condorella/rx550 94741139 OK   800000   0.84%; 14322 us/it; ETA 15d 13:44; 3dd122479d7dde25 (check 5.88s)
2020-04-10 16:12:02 condorella/rx550 94741139 OK  1000000   1.06%; 14319 us/it; ETA 15d 12:52; e44ae2f6c9046662 (check 5.87s)
2020-04-10 16:59:49 condorella/rx550 94741139 OK  1200000   1.27%; 14320 us/it; ETA 15d 12:06; b3a0108ad221f8fd (check 5.88s)
2020-04-10 17:47:36 condorella/rx550 94741139 OK  1400000   1.48%; 14319 us/it; ETA 15d 11:17; 6077a7f20c7ee45c (check 5.88s)
2020-04-10 17:49:53 condorella/rx550 Stopping, please wait..
2020-04-10 17:50:05 condorella/rx550 94741139 OK  1410000   1.49%; 14328 us/it; ETA 15d 11:28; e02e0d0dca18d9f5 (check 5.87s)
2020-04-10 17:50:05 condorella/rx550 Exiting because "stop requested"
2020-04-10 17:50:05 condorella/rx550 Bye
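
The bits-per-word figures in the logs above can be reproduced by hand. The sketch below assumes the FFT length is 2 x width x middle x height; that relation is inferred from the logged sizes (1K:10:256 and 1K:5:512 both report 5M, 256:1:256 reports 128K) and is not taken from the gpuowl source. The function names are illustrative only.

```python
def fft_length(width: int, middle: int, height: int) -> int:
    """FFT length in words for a width:middle:height spec.

    Assumption inferred from the logs: length = 2 * width * middle * height.
    """
    return 2 * width * middle * height


def bits_per_word(exponent: int, length: int) -> float:
    """Average number of bits of the exponent carried by each FFT word."""
    return exponent / length


EXPONENT = 94741139  # exponent from the log above

for spec in [(1024, 10, 256), (1024, 5, 512), (256, 1, 256)]:
    n = fft_length(*spec)
    print(spec, n, f"{bits_per_word(EXPONENT, n):.2f} bpw")
```

Under this assumption the 128K FFT picked after "-fft +1" would have to carry over 700 bits per word, which matches the "FFT size too small" exit.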
Old 2020-04-11, 14:26   #2071
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand
2⁴×613 Posts

Quote:
Originally Posted by kriesel View Post
Latest commit build, build log, help output, etc.
Could you (or kracker) please rebuild with the last change from preda, and repost?

(I am not yet able to build gpuowl; I mean, I didn't try yet, but I will give it a few tests as long as it can LL.)

Last fiddled with by LaurV on 2020-04-11 at 14:27
Old 2020-04-11, 14:43   #2072
ATH
Einyen
 
Dec 2003
Denmark
2·3·13·41 Posts

Quote:
Originally Posted by preda View Post
Please replace "offset" with "shift-count" and re-submit the result -- it should be accepted after this change.

This same change has been committed to gpuowl, so this should be fixed after a re-checkout.
Thanks, that worked. 2 successful double checks from gpuowl:
83174053
83180563

Last fiddled with by James Heinrich on 2020-04-11 at 15:01 Reason: fixed broken exponent links
Old 2020-04-11, 16:04   #2073
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,923 Posts
Win7 x64 build of gpuowl v6.11-257

Latest available commit as of ~12 minutes before this post. Usual shower of warnings in the build log; help output included; no testing performed. Enjoy, and please report any issues here.
Attached Files
File Type: txt build-log.txt (8.3 KB, 101 views)
File Type: 7z gpuowl-v6.11-257-g39fc002.7z (464.4 KB, 107 views)

Last fiddled with by kriesel on 2020-04-11 at 16:05
Old 2020-04-11, 17:07   #2074
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA
3²×241 Posts

Just now, I made the very stupid mistake of not checking a few DC residues before submitting a batch...
I can redo them - or whatever is best.
Nvidia P100 in colab.
gpuowl v6.11-252-gaf403e2
OUT_SIZEX=16,IN_SIZEX=8,IN_SPACING=8
Code:
51509873
51491101
51491059
51490883
51490843
51491267
51491119
51509257
51490799
51490723
51490343
51490339
51508747
58650941
51488837
51491983
51491773
51491731
Attached image: facepalm.png
Old 2020-04-11, 21:15   #2075
preda
 
"Mihai Preda"
Apr 2015
10101011111₂ Posts

It seems the problem is associated with the setup
Quote:
OUT_SIZEX=16,IN_SIZEX=8,IN_SPACING=8
Did this setup work for another exponent?

One way to check whether the FFT is broken is to run a few PRP iterations before starting the LL, e.g.
./gpuowl -prp 51509873

Old 2020-04-12, 02:17   #2076
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA
3²×241 Posts

With the previously set settings I'm getting an immediate EE... It seems to work with no -use arguments.
Code:
/content/drive/My Drive/gpuowl-colab
2020-04-12 02:08:53 gpuowl v6.11-252-gaf403e2
2020-04-12 02:08:53 config: -user kracker -cpu pce
2020-04-12 02:08:53 config: -ll 51509873 
2020-04-12 02:08:53 device 0, unique id ''
2020-04-12 02:08:53 pce 51509873 FFT: 2.75M 256:11:512 (17.86 bpw)
2020-04-12 02:08:53 pce Expected maximum carry32: 2B810000
2020-04-12 02:08:54 pce OpenCL args "-DEXP=51509873u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DWEIGHT_STEP=0x1.19794ea80bcb4p+0 -DIWEIGHT_STEP=0x1.d1a9c3958d155p-1 -DWEIGHT_BIGSTEP=0x1.ae89f995ad3adp+0 -DIWEIGHT_BIGSTEP=0x1.306fe0a31b715p-1 -DPM1=0  -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-12 02:08:57 pce 

2020-04-12 02:08:57 pce OpenCL compilation in 2.80 s
2020-04-12 02:08:57 pce 51509873 LL        0 loaded: 0000000000000004
2020-04-12 02:09:48 pce 51509873 LL   100000   0.19%;  509 us/it; ETA 0d 07:16; d4bf953f17f5dd56
2020-04-12 02:10:15 pce Stopping, please wait..
2020-04-12 02:10:15 pce 51509873 LL   154000   0.30%;  510 us/it; ETA 0d 07:17; be98350bc1fe8687
2020-04-12 02:10:15 pce Exiting because "stop requested"
2020-04-12 02:10:15 pce Bye
Code:
/content/drive/My Drive/gpuowl-colab
2020-04-12 02:12:19 gpuowl v6.11-252-gaf403e2
2020-04-12 02:12:19 config: -user kracker -cpu pce
2020-04-12 02:12:19 config: -use OUT_SIZEX=16,IN_SIZEX=8,IN_SPACING=8 -ll 51509873 
2020-04-12 02:12:19 device 0, unique id ''
2020-04-12 02:12:19 pce 51509873 FFT: 2.75M 256:11:512 (17.86 bpw)
2020-04-12 02:12:19 pce Expected maximum carry32: 2B810000
2020-04-12 02:12:19 pce OpenCL args "-DEXP=51509873u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DWEIGHT_STEP=0x1.19794ea80bcb4p+0 -DIWEIGHT_STEP=0x1.d1a9c3958d155p-1 -DWEIGHT_BIGSTEP=0x1.ae89f995ad3adp+0 -DIWEIGHT_BIGSTEP=0x1.306fe0a31b715p-1 -DPM1=0 -DIN_SIZEX=8 -DIN_SPACING=8 -DOUT_SIZEX=16  -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-04-12 02:12:19 pce 

2020-04-12 02:12:19 pce OpenCL compilation in 0.01 s
2020-04-12 02:12:19 pce 51509873 LL        0 loaded: 0000000000000004
2020-04-12 02:13:09 pce 51509873 LL   100000   0.19%;  496 us/it; ETA 0d 07:05; a2891146b3ded4b9
2020-04-12 02:13:16 pce Stopping, please wait..
2020-04-12 02:13:17 pce 51509873 LL   115000   0.22%;  502 us/it; ETA 0d 07:10; 42848d9cb649a731
2020-04-12 02:13:17 pce Exiting because "stop requested"
2020-04-12 02:13:17 pce Bye
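
The two runs above report different residues at iteration 100000 (d4bf953f17f5dd56 without -use, a2891146b3ded4b9 with it). A rough way to spot that automatically is to compare residues at matching iterations across two logs. This is only a sketch: the regular expression is fitted to the progress lines shown in this thread and may not match other gpuowl versions, and the function names are made up.

```python
import re

# Matches progress lines such as
# "... pce 51509873 LL   100000   0.19%;  509 us/it; ETA 0d 07:16; d4bf953f17f5dd56"
PROGRESS = re.compile(
    r"(?P<exp>\d+) (?:LL|OK|EE)\s+(?P<it>\d+)\s+[\d.]+%;.*;\s(?P<res>[0-9a-f]{16})"
)


def residues(log_text: str) -> dict:
    """Map (exponent, iteration) -> res64 for every progress line in a log."""
    found = {}
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if match:
            found[(int(match["exp"]), int(match["it"]))] = match["res"]
    return found


def mismatches(log_a: str, log_b: str) -> dict:
    """Return the (exponent, iteration) points where two logs disagree."""
    a, b = residues(log_a), residues(log_b)
    return {key: (a[key], b[key]) for key in a.keys() & b.keys() if a[key] != b[key]}


run_default = "2020-04-12 02:09:48 pce 51509873 LL   100000   0.19%;  509 us/it; ETA 0d 07:16; d4bf953f17f5dd56"
run_with_use = "2020-04-12 02:13:09 pce 51509873 LL   100000   0.19%;  496 us/it; ETA 0d 07:05; a2891146b3ded4b9"
print(mismatches(run_default, run_with_use))
```

A mismatch only says that at least one of the two runs is wrong; it does not say which one.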
Old 2020-04-13, 22:44   #2077
ATH
Einyen
 
Dec 2003
Denmark
2·3·13·41 Posts

I created a script to test the speed of a bunch of combinations of the OUT_WG,OUT_SIZEX,OUT_SPACING,IN_WG,IN_SIZEX,IN_SPACING variables for the LL test.

It seems that for the LL test there is no check to stop combinations that will not work. Instead the residue gets zeroed. For example these:

Code:
./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=256,OUT_SIZEX=4,OUT_SPACING=128,IN_WG=64,IN_SIZEX=128,IN_SPACING=4

./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=256,OUT_SIZEX=4,OUT_SPACING=128,IN_WG=64,IN_SIZEX=128,IN_SPACING=128

./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=64,OUT_SIZEX=128,OUT_SPACING=8,IN_WG=64,IN_SIZEX=128,IN_SPACING=64

./gpuowlLL -ll 95000011 -iters 30000 -log 10000 -use CARRY32,ORIG_SLOWTRIG,OUT_WG=64,OUT_SIZEX=128,OUT_SPACING=128,IN_WG=64,IN_SIZEX=128,IN_SPACING=64

Output:

2020-04-13 22:32:34 Tesla P100-PCIE-16GB-0 OpenCL compilation in 2.22 s
2020-04-13 22:32:34 Tesla P100-PCIE-16GB-0 95000011 LL        0 loaded: 0000000000000004
2020-04-13 22:32:41 Tesla P100-PCIE-16GB-0 95000011 LL    10000   0.01%;  641 us/it; ETA 0d 16:54; fffffffffffffffd
2020-04-13 22:32:43 Tesla P100-PCIE-16GB-0 Stopping, please wait..
2020-04-13 22:32:43 Tesla P100-PCIE-16GB-0 95000011 LL    14000   0.01%;  657 us/it; ETA 0d 17:20; fffffffffffffffd

Last fiddled with by ATH on 2020-04-13 at 22:46
Old 2020-04-13, 23:17   #2078
preda
 
"Mihai Preda"
Apr 2015
5³×11 Posts

LL is "naked", no error check at all. Please try/tune combinations on PRP, which will help detect the invalid ones. Only after validation with PRP use any combination for LL.
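
A tuning script could follow this advice by generating PRP command lines first and timing only the combinations that pass. The sketch below only builds command strings and does not run anything. The candidate values and the function name are hypothetical, and combining -prp with -use on one command line is an assumption based on the "-use ... -ll" config line and the "./gpuowl -prp 51509873" example earlier in the thread.

```python
from itertools import product


def prp_commands(exponent, in_sizex=(4, 8), in_spacing=(4, 8), out_sizex=(8, 16),
                 binary="./gpuowl"):
    """Build one PRP command line per -use combination (nothing is executed)."""
    commands = []
    for isx, isp, osx in product(in_sizex, in_spacing, out_sizex):
        use = f"OUT_SIZEX={osx},IN_SIZEX={isx},IN_SPACING={isp}"
        commands.append(f"{binary} -prp {exponent} -use {use}")
    return commands


for command in prp_commands(51509873):
    print(command)
```

Only combinations whose PRP run keeps reporting OK would then be candidates for LL.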

Old 2020-04-14, 03:18   #2079
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1011100100011₂ Posts

Quote:
Originally Posted by preda View Post
LL is "naked", no error check at all. Please try/tune combinations on PRP, which will help detect the invalid ones. Only after validation with PRP use any combination for LL.
Yikes, that means the LL side of gpuowl will be less reliable than CUDALucas v2.06, which checks for known bad residues that have been seen to occur (0x0000000000000000, 0x0000000000000002, 0xffffffff80000000, 0xfffffffffffffffd) and for excessive roundoff error. Gpuowl checks bits/word.

A failed memory copy could give 0. The ±2 values come from the residue getting zeroed, followed by the subtraction of 2 and the squaring: 0 squared minus 2 is -2, whose low 64 bits modulo 2^p - 1 are 0xfffffffffffffffd, and -2 squared minus 2 is 2. The 33-bits-set value 0xffffffff80000000 comes from using far too short an fft length, as was seen in both cllucas 1.02 and CUDALucas v2.03.
https://mersenneforum.org/showpost.p...&postcount=232
https://mersenneforum.org/showpost.p...&postcount=299
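
The zeroed-residue case can be reproduced with plain integer arithmetic. The sketch below is not how gpuowl or CUDALucas compute anything (they use FFT multiplication). It runs the textbook LL step s -> s^2 - 2 mod 2^p - 1 on a small exponent to show which res64 values appear once the residue is zeroed, and checks them against the four values listed above. The names are illustrative.

```python
MASK64 = (1 << 64) - 1

# The four suspicious res64 values listed in this post.
KNOWN_BAD_RES64 = {
    0x0000000000000000,
    0x0000000000000002,
    0xFFFFFFFF80000000,
    0xFFFFFFFFFFFFFFFD,
}


def ll_step(s: int, p: int) -> int:
    """One Lucas-Lehmer iteration modulo the Mersenne number 2^p - 1."""
    return (s * s - 2) % ((1 << p) - 1)


def res64(s: int) -> int:
    """Low 64 bits of the residue, as printed in the logs."""
    return s & MASK64


def is_suspect(residue64: int) -> bool:
    """True if the res64 is one of the known-bad values."""
    return residue64 in KNOWN_BAD_RES64


p = 127                   # small exponent chosen only to keep the demo fast
zeroed = 0                # residue wiped by a fault
after_one = ll_step(zeroed, p)      # -2 mod (2^p - 1)
after_two = ll_step(after_one, p)   # (-2)^2 - 2 = 2, which then repeats forever
for value in (zeroed, after_one, after_two):
    print(f"{res64(value):016x}", is_suspect(res64(value)))
```

If every squaring returns zero, the residue stays at -2, which is consistent with fffffffffffffffd repeating at iterations 10000 and 14000 in ATH's output.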

Last fiddled with by kriesel on 2020-04-14 at 03:42

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.