mersenneforum.org  

Old 2020-02-06, 20:36   #89
ewmayer
2ω=0
 
Sep 2002
República de California

2²×2,939 Posts

Quote:
Originally Posted by preda View Post
I don't have much experience with setting the mem frequency with rocm-smi, and I was not aware of --setmlevel. In particular, when overclocking the mem, I was setting only the frequency but not the voltage. (I don't know if the mem voltage is different from the "sclk" voltage, and if so how to read the current mem voltage.)

Anyway, maybe you could try something like:
--setmlevel 1 1150
and see if that has an effect on performance (expected: increase in perf) and on power (expected: small increase in power).
Thanks - I tried a couple different things:

[1] --setmclk: there is no flag to show the valid range, so I just started with a deliberately outrageous value of 100, and the resulting error message said "Max clock level is 2". Current (default) mclk level is unknown, but it yields a frequency of 1001MHz. --setmclk 2 leaves that unchanged, so that appears to be the default. --setmclk 1 knocks that down to 801MHz - the opposite direction of where we want to go - cutting ~10W from the power usage, and raising iteration times @5632K from 790us to ~840us. I passed on trying level 0 and instead reverted to level 2. :)

[2] --setmlevel: your suggestion of args '1 1150' failed with "expected 3 argument(s)". So we need the third MVOLT arg to be supplied. --showvoltagerange indicates a valid voltage range of 738-1218mV, so next I tried --setmlevel 1 1150 1000. After answering 'y' to the resulting scary SMI warning about operating outside of official AMD specs, got this:
Code:
Unable to write to sysfs file /sys/class/drm/card1/device/pp_od_clk_voltage
That probably just means I need to sudo the command, but the filename sounded familiar: it appears in Matt's setup-to-run-2-instances script. So let's see what the current values in the file are:
Code:
OD_SCLK:
0:        808Mhz
1:       1801Mhz
OD_MCLK:
1:       1150Mhz
OD_VDDC_CURVE:
0:        808Mhz        711mV
1:       1304Mhz        803mV
2:       1801Mhz       1096mV
OD_RANGE:
SCLK:     808Mhz       2200Mhz
MCLK:     801Mhz       1200Mhz
VDDC_CURVE_SCLK[0]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[0]:     738mV        1218mV
VDDC_CURVE_SCLK[1]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[1]:     738mV        1218mV
VDDC_CURVE_SCLK[2]:     808Mhz       2200Mhz
VDDC_CURVE_VOLT[2]:     738mV        1218mV
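The OD_RANGE section of the table above gives the limits that any new clock or voltage value has to respect. A minimal sketch for pulling one limit out of that text, assuming the layout shown above; od_range_max is a hypothetical helper name, not part of rocm-smi:

```shell
# Print the upper limit (digits only) of one OD_RANGE entry, e.g. "MCLK" or
# "VDDC_CURVE_VOLT[2]". The table text is read from stdin.
# Assumes the layout shown in this thread: "<name>: <min><unit> <max><unit>".
od_range_max() {
    awk -v key="$1:" '
        $1 == "OD_RANGE:" { in_range = 1; next }   # limits section starts here
        in_range && $1 == key {
            v = $3                                 # third field is the maximum
            sub(/[A-Za-z]+$/, "", v)               # strip the "Mhz" / "mV" suffix
            print v
            exit
        }
    '
}
```

Fed the table above on stdin, od_range_max MCLK prints 1200 and od_range_max 'VDDC_CURVE_VOLT[2]' prints 1218. On a live system the input would be the pp_od_clk_voltage file of whichever card node belongs to the Radeon VII.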
Matt's script:
Code:
#!/bin/bash

if [ "$EUID" -ne 0 ]; then echo "Radeon VII init script needs to be executed as root" >&2; exit 1; fi

#Allow manual control
echo "manual" >/sys/class/drm/card0/device/power_dpm_force_performance_level
#Undervolt by setting max voltage
#               V Set this to 50mV less than the max stock voltage of your card (which varies from card to card), then optionally tune it down
echo "vc 2 1801 1010" >/sys/class/drm/card0/device/pp_od_clk_voltage
#Overclock mclk to 1200
echo "m 1 1200" >/sys/class/drm/card0/device/pp_od_clk_voltage
#Push a dummy sclk change for the undervolt to stick
echo "s 1 1801" >/sys/class/drm/card0/device/pp_od_clk_voltage
#Push everything to the card
echo "c" >/sys/class/drm/card0/device/pp_od_clk_voltage
#Put card into desired performance level
/opt/rocm/bin/rocm-smi --setsclk 4 --setfan 110
So that 'vc 2 1801 1010' line appears to correspond to the level-2 entry in the above file:
Code:
OD_VDDC_CURVE:
0:        808Mhz        711mV
1:       1304Mhz        803mV
2:       1801Mhz       1096mV
I'm guessing that Matt's "Set this to 50mV less than the max stock voltage of your card (which varies from card to card)" with arrow pointing down at the 1010 entry means his card has a max stock voltage of 1060mV, whereas mine has 1096mV. But better safe than sorry for starters, I kept his script as-is and used value 1010. But even executing the script as root I get permission errors:
Code:
root@ewmayer-haswell:/home/ewmayer# ./radeon_setup.sh
./radeon_setup.sh: line 6: /sys/class/drm/card0/device/power_dpm_force_performance_level: Permission denied
./radeon_setup.sh: line 9: /sys/class/drm/card0/device/pp_od_clk_voltage: Permission denied
./radeon_setup.sh: line 11: /sys/class/drm/card0/device/pp_od_clk_voltage: Permission denied
./radeon_setup.sh: line 13: /sys/class/drm/card0/device/pp_od_clk_voltage: Permission denied
./radeon_setup.sh: line 15: /sys/class/drm/card0/device/pp_od_clk_voltage: Permission denied

========================ROCm System Management Interface========================
GPU[1] 		: Successfully set sclk frequency mask to Level 4
GPU[1] 		: Successfully set fan speed to Level 110
==============================End of ROCm SMI Log ==============================
Per-iteration times of my GpuOwl run are unchanged; the only change I can see is that the 'Perf' entry in the rocm-smi output now reads 'manual'.
Old 2020-02-06, 20:59   #90
ewmayer
2ω=0
 
Sep 2002
República de California

26754₈ Posts

p.s.: The above write errors in the script-exec appear to be another manifestation of the "Unable to write to sysfs file /sys/class/drm/card1/device/pp_od_clk_voltage" error I got previously, because when I next try '--setmlevel 2 1801 1150' I get the above file-write error followed by an "Unable to set mclk clock to Level m 2 1801 1150".

The permissions on said file are '-rw-r--r-- 1 root root' ... so why can't root write it?
Old 2020-02-06, 23:16   #91
M344587487
 
"Composite as Heck"
Oct 2017

2×5²×19 Posts

That's the gist of it: reading pp_od_clk_voltage gives a human-readable table of the current state, and passing it different command strings can alter that state. Passing "m 1 1200" to pp_od_clk_voltage says to set the memory clock of state 1, the state we're normally in, to 1200 MHz. Similarly, pushing "vc 2 1801 1010" sets the max voltage on the voltage curve to 1010 mV, which you're right was the voltage I was using to undervolt. The way I set voltage in that script was a hack: we're setting the max voltage for running at --setsclk 7 (corresponding to 1801 MHz) but actually running at --setsclk 3/4, so we're not setting the voltage we're using directly and are instead letting the voltage-curve black box pick a voltage somewhere between states 1 and 2. It was done this way because at the time (and perhaps still) it was finicky as hell, so I did the minimum alterations required to get it working.
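The command strings described above can be sanity-checked against the OD_RANGE limits before anything is sent to the card. A minimal sketch, assuming the caller supplies the limits by hand (for instance the 808-2200 MHz and 738-1218 mV ranges from the table posted earlier in the thread); vc_cmd is a hypothetical helper and nothing here touches sysfs:

```shell
# Build a "vc <point> <MHz> <mV>" command string, but only if the clock and
# voltage fall inside the limits passed as arguments 4-7.
# Prints the string on success; prints a message to stderr and fails otherwise.
vc_cmd() {
    local point="$1" mhz="$2" mv="$3"
    local mhz_min="$4" mhz_max="$5" mv_min="$6" mv_max="$7"
    if [ "$mhz" -lt "$mhz_min" ] || [ "$mhz" -gt "$mhz_max" ]; then
        echo "clock ${mhz} MHz is outside ${mhz_min}-${mhz_max} MHz" >&2
        return 1
    fi
    if [ "$mv" -lt "$mv_min" ] || [ "$mv" -gt "$mv_max" ]; then
        echo "voltage ${mv} mV is outside ${mv_min}-${mv_max} mV" >&2
        return 1
    fi
    echo "vc $point $mhz $mv"
}
```

With those limits, vc_cmd 2 1801 1010 808 2200 738 1218 prints the same "vc 2 1801 1010" string the script uses, while asking for 1300 mV fails because it exceeds the 1218 mV ceiling.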

--setmlevel wasn't a thing when the script was made, so I never tried it; it does look like it's just an interface for setting mclk and mem voltage. I've never touched memory voltage and never want to; unless you're going way out of spec I don't think it's an issue. Memory voltage may not even apply to Vega20: IIRC with Vega10 the memory voltage is shared with one of the clock voltages, so changing memory voltage directly shouldn't work, and Vega20 may have inherited that trait.

As for being unable to write to pp_od_clk_voltage, let's flail about wildly. The first thing I'd try is the scorched-earth approach to file permissions:
Code:
sudo chmod 777 /sys/class/drm/card0/device/pp_od_clk_voltage
It probably won't work because pp_od_clk_voltage is not a normal file, but try it anyway. Next, try running with sudo instead of as actual root; it shouldn't make a difference but it's an easy thing to rule out, as there may be some group funkiness going on. After that, try putting root in the video group if it isn't already (if you even can). It probably won't work, as I think root should by default act like it's in all groups, but it's worth a shot, and since we are flailing about it's appropriate to try all angles.
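Since the rocm-smi error quoted earlier names card1 while the script writes to card0, it may also be worth checking which card node actually exposes the file. A minimal sketch; list_od_cards is a hypothetical helper name, and this only lists candidates, it does not by itself explain the write failure:

```shell
# List every DRM card node under a sysfs root that exposes pp_od_clk_voltage.
# The root defaults to /sys/class/drm; passing another directory is only
# meant for dry runs against a scratch tree.
list_od_cards() {
    local root="${1:-/sys/class/drm}"
    local f
    for f in "$root"/card*/device/pp_od_clk_voltage; do
        # An unmatched glob stays literal, so test before printing.
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

Running list_od_cards with no argument prints one path per card that has the file; whichever index shows up is the one the echo lines in the script would need to target.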
Old 2020-02-06, 23:31   #92
ewmayer
2ω=0
 
Sep 2002
República de California

2²·2,939 Posts

@Matt:

o sudo allowed the 777-chmod, but --setmlevel still fails the same way, whether run with sudo or as root.

o Re-doing the chmod as root, again same --setmlevel failure.

o Contents of /etc/udev/rules.d/70-kfd.rules are the stuff from your setup guide:

SUBSYSTEM=="kfd", KERNEL=="kfd", TAG+="uaccess", GROUP="video"

I don't see any kind of 'user' field in there - that file is owned by root, so presumably root is in the video group?

The error message specifically mentions the file as being a 'sysfs file', perhaps its status as a system file contains a clue?

Last fiddled with by ewmayer on 2020-02-06 at 23:38
Old 2020-02-07, 22:15   #93
ewmayer
2ω=0
 
Sep 2002
República de California

2²·2,939 Posts

Quote:
Originally Posted by ewmayer View Post
The error message specifically mentions the file as being a 'sysfs file', perhaps its status as a system file contains a clue?
OK, found some info about this issue here. The gksu package no longer exists in Ubuntu, so I tried the 'sudo -H gedit' method to open the file. In the '2' entry under OD_VDDC_CURVE, manually changed 1096mV to 1150mV, then attempted to save, got this error message:

Could not save the file "/sys/class/drm/card0/device/pp_od_clk_voltage".
Unexpected error: Error writing to file: Invalid argument

Clicking the X-checkbox to close the dialog, got a further error dialog:

The file "/sys/class/drm/card0/device/pp_od_clk_voltage" changed on disk.

After clicking "drop changes and reload", I notice the file timestamp immediately updates to the current time. Could it be that the software which runs the R7 has grabbed the file and is continually accessing it in some way which precludes root from editing it?
Old 2020-02-08, 01:39   #94
M344587487
 
"Composite as Heck"
Oct 2017

2×5²×19 Posts

  • gksu and gksudo are like su and sudo but with a graphical interface, and might do something special when running GUI programs. I don't know the specifics, but lack of gksu shouldn't be the issue. You can use nano to edit simple config files from the terminal if gedit is problematic.
  • You can't edit sysfs files like that. Some can be read normally, and some are written by echoing command strings into them, as in the script. The kernel maintains the state of the virtual file, and in all likelihood it's working as intended.
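To illustrate the second point: a sysfs control file takes a whole command string per write, not an in-place edit. A minimal sketch of such a write, assuming nothing about the card itself; write_ctl and the SUDO variable are hypothetical names. It goes through tee because in "sudo echo ... > file" the redirection is performed by the calling shell, not by sudo:

```shell
# Send one command string to a sysfs-style control file.
# The write is done by tee so that, when sudo is used, the file is opened
# with elevated privileges. Set SUDO="" to skip sudo (e.g. when already root).
write_ctl() {
    local file="$1" cmd="$2"
    printf '%s\n' "$cmd" | ${SUDO-sudo} tee "$file" >/dev/null
}
```

For example, write_ctl /sys/class/drm/card0/device/pp_od_clk_voltage "m 1 1200" would stand in for the corresponding echo line of the script; the function returns tee's exit status, so a rejected write shows up as a failure.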


Unfortunately I don't really know where to go from here to debug your issue. Did you try the last point from my previous post of adding root to the video group?
Code:
usermod -a -G video root
Old 2020-02-08, 03:49   #95
ewmayer
2ω=0
 
Sep 2002
República de California

2DEC₁₆ Posts

Quote:
Originally Posted by M344587487 View Post
Did you try the last point from my previous post of adding root to the video group?
Code:
usermod  -a -G video root
I had a followup about that in the post below it, so no, I did not yet do the add-to-group. Just tried it, first as sudo, then as root; both failed with the same unable-to-write-file error as before. So it looks like I'm stuck with the default mem-voltage. Ah, well - thanks for all the suggestions!

=======================

Next up: Now that the R7 is up and blasting, I'm just gonna let my 2 ongoing CPU/Mlucas jobs finish and then let the CPU idle, crunching-wise - no point having the CPU burning similar watts as the GPU, for less than 1/10th the throughput.

That will free up ~100 watts, so the obvious question is, what is the best bang-for-$ GPU which I could plug into the PCI2 slot? Note I'm all out of 8-pin connectors on my current PSU's wire bundle, though there may be a couple 6-pin ones remaining which I could gang together to create a single 8-pin one. Are there any decently fast GPUs which can get all their needed power through the PCI bus? And if there are some nVidia ones which qualify, can one mix the 2 card types in a single system?
Old 2020-02-08, 07:36   #96
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2⁴·3·163 Posts

Quote:
Originally Posted by ewmayer View Post
Now that the R7 is up and blasting, I'm just gonna let my 2 ongoing CPU/Mlucas jobs finish and then let the CPU idle, crunching-wise - no point having the CPU burning similar watts as the GPU, for less than 1/10th the throughput.

That will free up ~100 watts, so the obvious question is, what is the best bang-for-$ GPU which I could plug into the PCI2 slot? Note I'm all out of 8-pin connectors on my current PSU's wire bundle, though there may be a couple 6-pin ones remaining which I could gang together to create a single 8-pin one. Are there any decently fast GPUs which can get all their needed power through the PCI bus? And if there are some nVidia ones which qualify, can one mix the 2 card types in a single system?
The GTX 1650 has a 75 watt rating, so it can be driven by the PCIe slot alone. It also has an excellent TF rating of just over 12 GHzD/day per watt. To get above that costs $500 to $9,000. https://www.mersenne.ca/mfaktc.php?sort=gpw
A Radeon RX470 can be fed by a SATA-power-to-6-pin adapter; it's nominally 110 W. It's also decent for PRP GHzD/day per watt for the price range. https://www.mersenne.ca/cudalucas.php?sort=gpw It has half the TF speed of the GTX 1650 but 150% of the PRP speed, and should get along well with the Radeon VII.
It's possible to run a mixed system. In fact it's required in order to use all the slots of the bigger mining-rig boards, due to limits on how many GPUs a driver can support. I had 3 NVIDIA cards coexisting with an RX550 for a while on Win7. Installing a new driver for an RTX resulted in no OpenCL for NVIDIA, so no gpuowl for NVIDIA there, and no function for the lower compute capability 2.x models either. The Intel HD4600 IGP never did hold indications of a functioning OpenCL capability long enough to be useful for mfakto on that install. https://www.anandtech.com/show/11739...pansions-slots
[Attached image: mixed gpu system.png]
Old 2020-02-08, 19:54   #97
ewmayer
2ω=0
 
Sep 2002
República de California

26754₈ Posts

@Ken: Thanks, I will look into those!

Edit: OK, a couple of notes. First off, note that I have little interest in TF work, though I recognize how important it is for the project to have plenty of folks who enjoy that kind of crunching.

o The R7 numbers at the top of the mersenne.ca GPU-for-LL/PRP page appear to be out of date: they have R7 costing $700, using 300W and getting ~280 GHz-days per calendar day. (The table only goes up to p ~95M, but the numbers are pretty flat). My R7 cost $550, and with a slight downclocking tweak to bring its FLOP/W number into the sweet spot for the card, burns ~220W. It completes one assignment with p ~ 103M, worth 431GHz-day (based on a just submitted one, 15 GHz-day for the p-1 step with no factor found, 416 GHz-day for the PRP) every 23 hours, giving a daily output of 450 GHz-day, which is 60% higher than that listed at the above page, and a per-watt output of ~2.05, more than 2x the 0.972 figure at the above page.

o For the 4-core Haswell CPU on the same system, running Mlucas (suboptimal, ~2/3 the throughput of Prime95, but my need for continual QA testing of my own code trumps the quest for optimality), currently finishing some p ~96M assignments (credit of ~360 GHz-day), 1 every 16 days, at almost exactly 100W, thus a modest ~23 GHd/d and 0.23 GHd/d/W. Again, if all I cared about was total throughput I'd be running mprime and getting closer to 0.4 GHd/d/W. The sub-$500 cards I see on the above page top out at about the same ~0.4 GHd/d/W. OTOH Mike/Xyzzy suggest a used GTX-1050 might set me back a mere $100 or so, with that I'm looking at similar GHd/d and GHd/d/W as running my own code on the CPU. (Unless the tabulated throughput numbers for the 1050 are understatements like those for the R7).
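The throughput and efficiency figures in the two items above follow from simple arithmetic, sketched here with awk. The function names are made up for illustration; the inputs are the numbers quoted above:

```shell
# GHz-days per calendar day, given the credit per assignment (GHz-days)
# and the wall-clock hours one assignment takes.
ghzd_per_day() {
    awk -v credit="$1" -v hours="$2" 'BEGIN { printf "%.1f\n", credit * 24 / hours }'
}

# GHz-days per day per watt, given a daily rate and the power draw in watts.
ghzd_per_day_per_watt() {
    awk -v rate="$1" -v watts="$2" 'BEGIN { printf "%.2f\n", rate / watts }'
}
```

For the R7, ghzd_per_day 431 23 prints 449.7 (the ~450 quoted) and ghzd_per_day_per_watt 450 220 prints 2.05. For the CPU, ghzd_per_day 360 384 (16 days is 384 hours) prints 22.5, and ghzd_per_day_per_watt 23 100 prints 0.23.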

Last fiddled with by ewmayer on 2020-02-08 at 23:14
Old 2020-02-09, 02:25   #98
axn
 
Jun 2003

2³·683 Posts

Quote:
Originally Posted by ewmayer View Post
(Unless the tabulated throughput numbers for the 1050 are understatements like those for the R7).
The tabulated numbers are probably understatements for most cards. The power usage is based on the rated values and not actuals; for example, my 1660 Ti uses about 85 W when running cudaLucas, but the calculation uses the rated 120 W TDP. Also, the performance figures are a slight underestimate as well.
And these are the problems with just the out-of-the-box figures. If you downclock the cards to be more power efficient, forget about it.
Old 2020-02-09, 02:32   #99
axn
 
Jun 2003

1558₁₆ Posts

Quote:
Originally Posted by ewmayer View Post
OTOH Mike/Xyzzy suggest a used GTX-1050 might set me back a mere $100 or so
A brand new 1650 can be had for $150, and will have higher thruput and better power efficiency than a 1050. And should you ever want to do TF, it will have 4x the thruput!
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
