mersenneforum.org  

2020-06-05, 17:10   #2267
preda
"Mihai Preda"
Apr 2015
2^4·5·13 Posts

Quote:
Originally Posted by Xebecer
-pool C:\Users\xebec\Desktop\GPUOwl Shared in the config.txt file. I get:

Can't open 'C' (mode 'ab')
Exception NSt10filesystem7__cxx1116filesystem_errorE" filesystem error" can't open file" No error [C"\Users\xebec\Desktop\GPUOwl_Shared/]
Bye
That output is pretty messed up: something replaced the ':' character in the error message with the " (quote) character, and the expected "results.txt" is missing from the end of the path. For comparison, this is how I'd expect that error to look:
Quote:
2020-06-06 08:06:16 Can't open 'C:\Foo\bar/results.txt' (mode 'ab')
2020-06-06 08:06:16 Exception NSt10filesystem7__cxx1116filesystem_errorE: filesystem error: can't open file: Success [C:\Foo\bar/results.txt]
Maybe you could attach the full log of gpuowl start-up, which should contain information about the full config options that it sees.
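
The directory can also be checked independently of gpuowl by attempting the same kind of open that the error message reports. A minimal sketch in Python, assuming (from the expected message above) that the file is "results.txt" inside the -pool directory and that it is opened in append-binary mode; the helper name can_append is made up for illustration and is not part of gpuowl:

```python
import os
import tempfile


def can_append(pool_dir, name="results.txt"):
    """Try to open <pool_dir>/<name> in append-binary mode.

    Returns (ok, path). Mode 'ab' mirrors the "(mode 'ab')" in the error
    message; the file name comes from the expected message, not from
    gpuowl's source. Note: a successful open creates the file if missing.
    """
    path = pool_dir + "/" + name  # same "dir/results.txt" shape as in the log
    try:
        with open(path, "ab"):
            pass
        return True, path
    except OSError:
        return False, path


if __name__ == "__main__":
    # Demonstration on a throwaway directory whose name contains a space.
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "GPUOwl Shared")
        os.mkdir(shared)
        print(can_append(shared))
        print(can_append(os.path.join(tmp, "missing dir")))
```

If this succeeds on the real pool directory while gpuowl still fails, the problem is more likely in how the path reaches gpuowl (config.txt parsing, quoting) than in the directory itself.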

2020-06-05, 17:15   #2268
preda
"Mihai Preda"
Apr 2015
2^4×5×13 Posts

Quote:
Originally Posted by kriesel
Is that so regardless of gpu model? I note Radeon VII gpus have a serial number built in. (Cpuid hwinfo produced this on Windows. RX480 and RX550 do not have such serial numbers.)
No, I would expect the availability of unique_id to depend on the GPU model. The Radeon VII has it; others may not. If the file /sys/class/drm/cardN/device/unique_id exists, it is likely to contain the id; otherwise no id is available.
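
That check is easy to script. A minimal sketch in Python, assuming only the sysfs path given above; the function name read_unique_id and the sysfs_root parameter are illustrative, and the exact content of the file on any given GPU is not something this post confirms:

```python
from pathlib import Path


def read_unique_id(card, sysfs_root="/sys/class/drm"):
    """Return the unique_id of cardN as a string, or None if unavailable.

    The path layout follows the post above; `sysfs_root` is a parameter
    only so the function can be tried against a test directory.
    """
    path = Path(sysfs_root) / f"card{card}" / "device" / "unique_id"
    try:
        text = path.read_text().strip()
    except OSError:  # file missing or unreadable: treat as "no id"
        return None
    return text or None  # an empty file also counts as "no id"


if __name__ == "__main__":
    for n in range(4):  # probe card0..card3; adjust to the system at hand
        print(f"card{n}:", read_unique_id(n))
```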

2020-06-05, 19:01   #2269
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3·1,283 Posts
gpuowl-win v6.11-312-gc69350e failed to build

Same ambiguous overload error as -310 and -311.

I would try an earlier commit but don't know the proper magic git incantation.
Attached Files
build-log.txt (10.6 KB)

2020-06-05, 20:15   #2270
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
111100001001_2 Posts
Draft for gpuowl Radeon VII tuning

This may become a gpuowl reference thread post later.

First, get the latest gpuowl version. There have been lots of performance improvements lately, and some added features (LLDC, Jacobi check; PRP proof if you can build it).

It's unclear how much performance depends on PCIe width, version, extender type if any, etc.

Before starting tuning, document baseline performance and configuration.

Do parameter testing while running PRP with GEC, to reliably detect unreliability.

Run a gpu monitoring application such as GPU-Z, nvidia-smi or rocm-smi if possible.
Initially, changes can be made rather quickly, and a quick GEC error is prompt feedback that you've been too aggressive with the settings. To ensure the gpu is reliable, though, final selections should be watched for hours or days of error-free operation. Only after days of error-free operation should any LL or P-1 runs be attempted.

Increase the memory clock from default (some are able to run as high as 1200 MHz, 20% above nominal).
On my setup, I see about a 1% performance gain for a 5% memory clock increase.
(This may be limited because it's in a warm area.)

Undervolting. Which voltage(s) to adjust, and what are people getting away with relative to the original settings?
https://mersenneforum.org/showpost.p...postcount=1630 is unclear to me.
Presumably it's what GPU-Z calls vddc, the only voltage displayed, and what Radeon Software Adrenalin 2020 calls gpu voltage, the only one offered for modification.

What's the benefit of undervolting:
Allowing higher clocks during thermal limiting or power limiting?
Saving on power cost?

Linux: to manage power requirements, use a lower sclk. Windows equivalent: directly adjust the gpu clock.
sclk 5 (highest): 1684 MHz
sclk 4: ~1547 MHz per philf https://mersenneforum.org/showpost.p...postcount=1698
sclk 3: 1373 MHz
sclk 2: ?
sclk 1: ?
sclk 0: ?
Apparently these vary a bit; preda gave 1520 MHz for sclk 4.
In https://mersenneforum.org/showpost.p...postcount=1630 preda gives an example bash script and describes parameters. See also https://mersenneforum.org/showpost.p...postcount=1632

Fiddle with the fan curve? The default on Windows is only 75% fan at 105C hot spot temp (corresponding to ~80C nominal gpu temp). I see references by Linux users, such as in https://mersenneforum.org/showpost.p...postcount=1642, to setting the fan well above 100. What are the units of setfan in Linux?

AMD Radeon software on Windows also allows setting a power limit up to +20% or down to -20% relative to nominal.

Name and save a profile with the resulting gpu-specific tuning settings in the Windows Adrenalin software, so it can easily be reloaded after a system start.

After other tuning is done, if you have enough similar work, run two instances per gpu, for a bit more throughput at the cost of about double latency. Is that still worthwhile doing with current commits?

Running the same computation type (PRP & PRP, or LLDC & LLDC) at the same fft length is recommended. Ideally the instances will be a bit out of phase, so that when one is writing to disk or communicating between gpu ram and system/cpu ram, the other is utilizing the gpu computing resources.
If work is too dissimilar, two instances will have lower combined throughput than one. Try, measure, adjust. See https://mersenneforum.org/showpost.p...postcount=1507
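
The "try, measure, adjust" step comes down to simple arithmetic on the us/it figures each instance logs. A minimal sketch in Python, assuming two identical instances sharing one gpu; the function names are illustrative and the timings in the example are invented, not measurements:

```python
def throughput(us_per_it_list):
    """Combined iterations per second for instances running concurrently."""
    return sum(1e6 / us for us in us_per_it_list)


def compare(single_us, dual_us):
    """Return (throughput gain, latency ratio) of two instances vs one.

    single_us: us/it with one instance alone on the gpu.
    dual_us:   us/it of each of two identical instances sharing the gpu.
    """
    gain = throughput([dual_us, dual_us]) / throughput([single_us]) - 1.0
    latency_ratio = dual_us / single_us
    return gain, latency_ratio


if __name__ == "__main__":
    # Hypothetical numbers: 1000 us/it alone, 1950 us/it each when doubled up.
    gain, latency = compare(1000.0, 1950.0)
    print(f"throughput {gain:+.1%}, latency x{latency:.2f}")
```

A negative gain means the two instances together are slower than one, which is the "too dissimilar" case described above.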

Linux is supposedly faster than Windows, perhaps due to lower driver overhead. Does anyone have numbers for that on the same hardware?

In bitcoin mining multi-gpu setup howtos, they advise turning off various things in the BIOS, as part of the process of preparing the system to support a large number of gpus. Is any of that known to be relevant or irrelevant to gpuowl performance?

What else?

Last fiddled with by kriesel on 2020-06-05 at 20:20

2020-06-05, 22:46   #2271
Xyzzy
"Mike"
Aug 2002
2·3·7·179 Posts

We have gpuowl working on our (single-PCI slot) W5500. (We are also using it as our main display card.)

We had to change the makefile's "LIBPATH" to a different place: LIBPATH = -L/opt/amdgpu-pro/lib64 -L.

We have a sample test running. We don't yet know how to decipher the information presented, but at least it works!

We are very surprised that gpuowl runs as our normal user.

As far as performance, how does this look?
Code:
2020-06-05 17:13:13 gpuowl v6.11-312-gc69350e-dirty
2020-06-05 17:13:13 Note: not found 'config.txt'
2020-06-05 17:13:13 device 0, unique id ''
2020-06-05 17:13:13 gfx1012-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-06-05 17:13:13 gfx1012-0 Expected maximum carry32: 583B0000
2020-06-05 17:13:13 gfx1012-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DWEIGHT_STEP=0x1.5621686be7602p+0 -DIWEIGHT_STEP=0x1.7f1af377e822p-1 -DWEIGHT_BIGSTEP=0x1.306fe0a31b715p+0 -DIWEIGHT_BIGSTEP=0x1.ae89f995ad3adp-1 -DPM1=0 -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=1u  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-06-05 17:13:16 gfx1012-0 OpenCL compilation in 3.10 s
2020-06-05 17:13:17 gfx1012-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-06-05 17:13:21 gfx1012-0 77936867 OK      800   0.00%; 2982 us/it; ETA 2d 16:34; 1579c241dc63eca6 (check 1.27s)
2020-06-05 17:23:18 gfx1012-0 77936867 OK   200000   0.26%; 2991 us/it; ETA 2d 16:35; f0b04b45b0855bd2 (check 1.28s)
2020-06-05 17:33:15 gfx1012-0 77936867 OK   400000   0.51%; 2979 us/it; ETA 2d 16:10; c03f94396a5aa29e (check 1.27s)
2020-06-05 17:43:17 gfx1012-0 77936867 OK   600000   0.77%; 3004 us/it; ETA 2d 16:32; b9decd65ca71b629 (check 1.28s)
PS - Linux xii 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
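
For deciphering the log, a few of the figures can be cross-checked by hand. A minimal sketch in Python, assuming a PRP test needs roughly as many iterations as the exponent, and reading "4M 1K:8:256" as 1024 x 8 x 256 x 2 words (which matches the WIDTH, MIDDLE and SMALL_HEIGHT values in the OpenCL args); both readings are inferences from the log above, not statements from gpuowl's documentation:

```python
import re

# Matches gpuowl progress lines of the shape shown in the log above.
PROGRESS = re.compile(
    r"(?P<exp>\d+) OK\s+(?P<it>\d+)\s+[\d.]+%; (?P<us>\d+) us/it; ETA"
)


def parse_progress(line):
    """Return (exponent, iteration, us_per_it), or None for other lines."""
    m = PROGRESS.search(line)
    if m is None:
        return None
    return int(m["exp"]), int(m["it"]), int(m["us"])


def eta_seconds(exponent, iteration, us_per_it):
    """Rough time left, assuming about `exponent` iterations in total."""
    return (exponent - iteration) * us_per_it / 1e6


def bits_per_word(exponent, fft_words):
    """Average bits stored per FFT word, the 'bpw' figure in the log."""
    return exponent / fft_words


if __name__ == "__main__":
    line = ("2020-06-05 17:13:21 gfx1012-0 77936867 OK      800   0.00%; "
            "2982 us/it; ETA 2d 16:34; 1579c241dc63eca6 (check 1.27s)")
    exp, it, us = parse_progress(line)
    secs = eta_seconds(exp, it, us)
    days, rem = divmod(int(secs), 86400)
    print(f"about {days}d {rem // 3600:02d}:{rem % 3600 // 60:02d} left")
    # "4M 1K:8:256" read as width 1024 x middle 8 x height 256 x 2 words.
    print(f"{bits_per_word(exp, 1024 * 8 * 256 * 2):.2f} bpw")
```

The estimate lands close to the logged ETA, so us/it is the number to compare between cards: lower is faster at a given FFT size.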

Attached Files
clinfo.txt (6.0 KB)

2020-06-05, 22:51   #2272
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3·1,283 Posts
Some gpuowl Radeon VII tuning data from the trenches

These are not final tunes, just what gives tolerable error rates for now. The rolloff in GHzD/day at higher fft lengths was surprisingly high, at nearly 2:1. The peak I've achieved here at 5M is noticeably lower than what George posted (510 GHzD/day back in March, with an earlier, slower version on Linux).
Attached Files
gpuowl throughput.pdf (11.6 KB)

2020-06-05, 22:58   #2273
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
111100001001_2 Posts

Quote:
Originally Posted by Xyzzy
As far as performance, how does this look?
That's a little slower than my RX480, on which gpuowl-win v6.11-292 runs a 4.5M fft at 2808 us/it on Windows 7 in a cramped, warm HP Z600 workstation tower. Unfortunately https://www.mersenne.ca/cudalucas.php doesn't know anything about a W5500. Maybe you could run and send James a benchmark.

2020-06-05, 23:53   #2274
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3849_10 Posts
gpuowl-win v6.11-295 build (may be the last for a while)

Quote:
Originally Posted by kriesel
Same ambiguous overload error as -310 and -311.
I would try an earlier commit but don't know the proper magic git incantation.
Using git bisect iteratively, per https://git-scm.com/docs/git-bisect, it appears the last gpuowl commit without the "ambiguous overload" fatal build issue on MSYS2 for Windows is v6.11-295-gaecf041; v6.11-296-g33e2d8e is the first bad one.

Code:
$ git bisect good v6.11-295-gaecf041
33e2d8ef73d81c581fc0d0aa161445ddefb03c18 is the first bad commit
commit 33e2d8ef73d81c581fc0d0aa161445ddefb03c18
Author: Mihai Preda <mhpreda@gmail.com>
Date:   Mon May 25 23:39:37 2020 +1000

    In work: proof construction blueprint

 Args.cpp    |  6 +----
 GmpUtil.cpp |  2 +-
 GmpUtil.h   |  2 +-
 Gpu.cpp     | 79 +++++++++++++++++++++++++++++++++++----------------------
 Gpu.h       |  6 +++--
 ProofSet.h  | 84 ++++++++++++++++++++++++++++++++++++++++++++++---------------
 Task.cpp    |  8 +++++-
 main.cpp    |  1 +
 8 files changed, 128 insertions(+), 60 deletions(-)
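
For anyone unfamiliar with it, git bisect is a binary search over the commit history: mark one commit good and one bad (git bisect start, git bisect good <rev>, git bisect bad <rev>), build whatever commit git checks out, report the result, and repeat until one commit remains. A minimal sketch of that search in Python, assuming a linear history; the commit list and the simulated build check are stand-ins based on the versions mentioned in this thread, not real build results:

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    commits: oldest-to-newest list; commits[0] is known good and
             commits[-1] is known bad.
    is_bad:  callable that builds/tests one commit and returns True on failure.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


if __name__ == "__main__":
    # Stand-in history: v6.11-292 .. v6.11-312 by commit count; the build
    # check is simulated from the result reported above (296 is first bad).
    history = list(range(292, 313))
    tested = []

    def build_fails(n):
        tested.append(n)
        return n >= 296

    print(first_bad(history, build_fails), "after", len(tested), "builds")
```

The number of builds grows only with the logarithm of the range, which is why a handful of builds is enough to cover about twenty commits.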
Attached Files
build-log.txt (7.7 KB)
gpuowl-v6.11-295-gaecf041.7z (541.2 KB)

2020-06-06, 00:00   #2275
Xyzzy
"Mike"
Aug 2002
7518_10 Posts

FWIW, here is how we got gpuowl working on a clean install of CentOS 8.

As root:
Code:
yum update
yum install gmp-devel.x86_64
cd ~
wget https://drivers.amd.com/drivers/linux/amdgpu-pro-19.50-1011208-rhel-8.1.tar.xz
tar Jxvf amdgpu-pro-19.50-1011208-rhel-8.1.tar.xz
cd amdgpu-pro-19.50-1011208-rhel-8.1/
./amdgpu-pro-install -y --opencl=pal,legacy
reboot
As a normal user:
Code:
cd ~
git clone https://github.com/preda/gpuowl
cd gpuowl
# edit the makefile here: set LIBPATH = -L/opt/amdgpu-pro/lib64 -L.
make
We will test CentOS 7 later tonight.
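
The makefile fix can be scripted rather than done by hand. A minimal sketch in Python, assuming the makefile contains a single-line assignment that starts with LIBPATH; the replacement value is the one reported earlier in the thread for the amdgpu-pro install, while the function name set_libpath and the sample makefile text are made up for illustration:

```python
import re

# Value reported in this thread for the amdgpu-pro OpenCL libraries.
NEW_LIBPATH = "LIBPATH = -L/opt/amdgpu-pro/lib64 -L."


def set_libpath(makefile_text, new_line=NEW_LIBPATH):
    """Replace the makefile's LIBPATH assignment with `new_line`.

    Raises ValueError if no line starts with "LIBPATH", so a changed
    makefile layout is noticed instead of silently ignored.
    """
    patched, count = re.subn(r"(?m)^LIBPATH\b.*$", new_line, makefile_text)
    if count == 0:
        raise ValueError("no LIBPATH line found")
    return patched


if __name__ == "__main__":
    # Invented sample content, only to show the substitution.
    sample = "CXX = g++\nLIBPATH = -L/some/other/path -L.\nall:\n"
    print(set_libpath(sample))
```

To apply it for real, read the makefile's text, pass it through set_libpath, and write the result back before running make.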


2020-06-06, 00:16   #2276
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
3×1,283 Posts
gpuowl-win v6.11-313 try

No joy there either. Thanks for trying.
Attached Files
build-log.txt (9.4 KB)

Last fiddled with by kriesel on 2020-06-06 at 00:16
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.