#45 | |
|
"Composite as Heck"
Oct 2017
1666₈ Posts |
I tried the washer mod with minimal success, YMMV: https://www.mersenneforum.org/showpo...9&postcount=13
edit: Quote:
Last fiddled with by M344587487 on 2020-01-27 at 22:27 |
|
|
|
|
|
|
#46 | |
|
∂²ω=0
Sep 2002
República de California
2²·2,939 Posts |
Quote:
|
|
|
|
|
|
|
#47 |
|
∂²ω=0
Sep 2002
República de California
2²·2,939 Posts |
After a fresh Ubuntu 19.10 install on my ~6-year-old Haswell system and several afternoons' work, including some awkward Dremel hackery of both the R7 mounting bracket and the back of my ATX case in order to resolve a geometric mismatch there, the R7 is in and recognized by the OS. lspci shows 2 R7 entries:
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 [Radeon VII] (rev c1)
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Vega 20 HDMI Audio [Radeon VII]
In terms of needed drivers, Matt (a.k.a. M344587487) noted this: "Ubuntu 19.10 uses kernel 5.3 which means the open source AMD driver that's built into the kernel can handle the Vega 20. If you were on an earlier kernel you'd need to install the amdgpu-pro driver from AMD's site but you should be good. Something you might need is the Vega 20 firmware, there was a strange period where the kernel had the right drivers but some distros hadn't caught up to providing Vega 20 firmware. To check if you have the firmware open a terminal and run 'ls /lib/firmware/amdgpu/vega20*'."
That latter list command shows 13 vega20_*.bin files, so that seems set to go.
But - and I was clued in to the problem by my usual Mlucas 4-thread job on the Haswell CPU running 3x slower than usual - there is some kind of misconfiguration/driver problem remaining. 'top' shows multiple cycle-eating 'systemd-udevd' and 'modprobe' processes. Invoking 'dmesg' shows what appears to be the problem - endless repeats of this message:
NVRM: No NVIDIA graphics adapter found!
nvidia-nvlink: Unregistered the Nvlink Core, major device number 238
nvidia-nvlink: Nvlink Core is being initialized, major device number 238
It's not clear to me which of the following 3 possible causes is the likely culprit:
1. Preparing to install the R7, I first removed an old nvidia gtx430 card from the PCI 2.0 slot (seems unlikely, because I quickly found the issue with the R7 mounting bracket after that, at which point I rebooted sans any gfx card, and had been running happily for several days like that);
2. The R7 needs some nVidia drivers and is not finding them;
3. The system is detecting *a* new video card - brand not important - and doing something nVidia-ish as a result.
Last fiddled with by ewmayer on 2020-01-31 at 21:13 |
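The two checks described in this post (is the Vega 20 firmware present, and is dmesg full of NVIDIA noise) can be wrapped in a couple of small shell helpers. This is only a sketch: the function names `count_vega20_fw` and `count_nvidia_msgs` are made up for illustration, and the default firmware directory is the one quoted in Matt's note.

```shell
# Hypothetical helper: count Vega 20 firmware blobs in a directory.
# Defaults to the path from the post; pass another directory to override.
count_vega20_fw() {
    fw_dir="${1:-/lib/firmware/amdgpu}"
    # ls fails when nothing matches; discard the error and let wc report 0
    ls "$fw_dir"/vega20*.bin 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: count NVIDIA-related kernel messages read from stdin.
# grep -c still prints 0 when nothing matches, but exits non-zero, hence "|| true".
count_nvidia_msgs() {
    grep -c -i -E 'NVRM|nvidia-nvlink' || true
}

# Typical use on the affected machine (dmesg may need sudo on some setups):
#   count_vega20_fw
#   dmesg | count_nvidia_msgs
#   lspci -k -s 03:00.0     # shows which kernel driver is bound to the card
```

A firmware count of zero, or an NVIDIA message count that keeps climbing between runs, would match the two failure modes discussed here.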
|
|
|
|
|
#48 |
|
Sep 2002
Database er0rr
5×937 Posts |
Maybe it is just easier to back up, reinstall afresh and, like most of us do, use ROCm drivers.
I will be interested to see how the R7 performs on a PCIE-2 rather than a PCIE-3... Last fiddled with by paulunderwood on 2020-01-31 at 23:08 |
|
|
|
|
|
#49 |
|
∂²ω=0
Sep 2002
República de California
2²×2,939 Posts |
My old gtx430 was on the PCIE-2 slot ... the R7 is on the PCIE-3, and uses both of the 8-pin power connectors on the PSU in this system. It also needed me to use my Dremel with a small cutting wheel to chop out the metal bridge between the 2 back-of-case PCI cutouts used by the R7. Here is the gory post-surgery picture of the patient's innards:
Last fiddled with by ewmayer on 2020-01-31 at 23:04 |
|
|
|
|
|
#50 |
|
∂²ω=0
Sep 2002
República de California
26754₈ Posts |
Re. the nVidia-related dmesg errors in post #47, one additional possibility occurs to me ... the only nVidia drivers I ever explicitly installed were under the old headless Debian setup, which I blew away.
I removed the nVidia card a week ago, in prep. for trying to install the R7. However, the nVidia card was still installed when I upgraded to Ubuntu 19.10 ... might the Ubuntu installer have auto-detected the nVidia card and installed/defaulted-to-use the appropriate drivers at that point, and now the kernel is throwing errors due to the mismatch between those initial-OS-install drivers and the new gfx card? Last fiddled with by ewmayer on 2020-02-01 at 02:33 |
|
|
|
|
|
#51 | ||
|
"Composite as Heck"
Oct 2017
2×5²×19 Posts |
Quote:
Quote:
The easiest/safest fix is probably to wipe and restart (after burning the nvidia card and burying it in a deep pit preferably, YMMV), but it can't hurt to try purging nvidia from the system if you feel like it (it's not critical but highly recommended that you change your wallpaper to Linus flipping off nvidia at this point, for luck).
This is from an old guide but it seems reasonable:
This command should list all nvidia packages, there should be a few dozen of them:
Code:
dpkg -l | grep -i nvidia
Code:
sudo apt-get remove --purge '^nvidia-.*'
Code:
sudo apt-get install ubuntu-desktop |
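Before running the purge for real, it may be worth previewing what would go. The sketch below is not from the old guide: `installed_nvidia_pkgs` is a hypothetical helper that filters `dpkg -l` output down to fully installed ("ii") packages whose name mentions nvidia.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# installed packages (status "ii") whose name contains "nvidia".
installed_nvidia_pkgs() {
    awk '$1 == "ii" && tolower($2) ~ /nvidia/ { print $2 }'
}

# Typical use (nothing is removed by either command):
#   dpkg -l | installed_nvidia_pkgs
#   sudo apt-get --simulate remove --purge '^nvidia-.*'
```

`--simulate` makes apt-get print the actions it would take without changing the system, so the list can be checked against the helper's output before committing to the real purge.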
||
|
|
|
|
|
#52 | |
|
∂²ω=0
Sep 2002
República de California
2²×2,939 Posts |
Quote:
[ 2.517924] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/vega20_ta.bin failed with error -2
I see 13 files among the /lib/firmware/amdgpu/vega20*.bin set which Ubuntu 19.10 auto-installed, but no vega20_ta.bin among those; I probably just need to grab that one separately. Most importantly, 'top' no longer shows any out-of-control system processes, and my Mlucas runs on the CPU are once again back at normal throughput. So, progress!
Last fiddled with by ewmayer on 2020-02-01 at 21:28 |
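Error -2 in that dmesg line is ENOENT, i.e. the kernel asked for a file that does not exist. A quick way to list every firmware file the kernel failed to find is sketched below; `missing_firmware` is a made-up helper name, and the pattern assumes the message wording shown in the line above.

```shell
# Hypothetical helper: read kernel log text on stdin and print each firmware
# path whose direct load failed with error -2 (ENOENT), once per path.
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed with error -2.*/\1/p' | sort -u
}

# Typical use:
#   dmesg | missing_firmware
```

Each path printed is relative to /lib/firmware, so the output can be compared directly against the directory listing.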
|
|
|
|
|
|
#53 |
|
∂²ω=0
Sep 2002
República de California
10110111101100₂ Posts |
Now that Super Bowl Sunday (quasi-holiday in the US revolving around the National Football League championship game) is behind us, an update - the card seems to be functioning properly. I've been following Matt's "quick and dirty setup guide" here, and am currently at the "Take the above [bash] init script [to set up for 2-gpuowl-instance running] and tweak it to suit your card" step. First I'd like to play with some basic single-instance running, but something is borked. The readme says "Self-test: simply start gpuowl with any valid exponent..." but does not say how to specify that expo via cmd-line flags. I tried just sticking a prime expo in there, then running without any arguments whatever; both gave the following kind of error:
Code:
ewmayer@ewmayer-haswell:~/gpuowl$ ./gpuowl 90110269
2020-02-01 18:43:36 gpuowl v6.11-142-gf54af2e
2020-02-01 18:43:36 Note: not found 'config.txt'
2020-02-01 18:43:36 config: 90110269
2020-02-01 18:43:36 device 0, unique id ''
2020-02-01 18:43:36 Exception gpu_error: DEVICE_NOT_FOUND clGetDeviceIDs(platforms[i], kind, 64, devices, &n) at clwrap.cpp:77 getDeviceIDs
2020-02-01 18:43:36 Bye
ewmayer@ewmayer-haswell:~/gpuowl$ ./gpuowl
2020-02-01 18:44:02 gpuowl v6.11-142-gf54af2e
2020-02-01 18:44:02 Note: not found 'config.txt'
2020-02-01 18:44:02 device 0, unique id ''
2020-02-01 18:44:02 Exception gpu_error: DEVICE_NOT_FOUND clGetDeviceIDs(platforms[i], kind, 64, devices, &n) at clwrap.cpp:77 getDeviceIDs
2020-02-01 18:44:02 Bye
Looking ahead, the first 2 steps of the setup-for-2-instances script are these:
Code:
#Allow manual control
echo "manual" >/sys/class/drm/card0/device/power_dpm_force_performance_level
#Undervolt by setting max voltage
#               V Set this to 50mV less than the max stock voltage of your card (which varies from card to card), then optionally tune it down
echo "vc 2 1801 1010" >/sys/class/drm/card0/device/pp_od_clk_voltage
Code:
GPU  Temp   AvgPwr  SCLK    MCLK    Fan     Perf  PwrCap  VRAM%  GPU%
1    31.0c  21.0W   809Mhz  351Mhz  21.96%  auto  250.0W  0%     0%
Thanks for any help from current gpuowl users.
Last fiddled with by ewmayer on 2020-02-03 at 20:16 |
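The DEVICE_NOT_FOUND exception comes from clGetDeviceIDs, so the OpenCL layer is not seeing any device even though the kernel driver is. One thing that may be worth checking is whether any OpenCL runtime is registered at all. The sketch below assumes the usual Linux ICD location, /etc/OpenCL/vendors; `count_opencl_icds` is a made-up helper name, and none of this is taken from the gpuowl readme.

```shell
# Hypothetical helper: count OpenCL vendor ICD files in a directory.
# /etc/OpenCL/vendors is assumed as the default; pass another path to override.
count_opencl_icds() {
    icd_dir="${1:-/etc/OpenCL/vendors}"
    # ls fails when nothing matches; discard the error and let wc report 0
    ls "$icd_dir"/*.icd 2>/dev/null | wc -l | tr -d ' '
}

# Typical use:
#   count_opencl_icds     # 0 suggests no OpenCL runtime is installed
```

If the count is zero, the in-kernel amdgpu driver alone would explain the output above: the card shows up in lspci and sysfs, but there is nothing for an OpenCL program to enumerate.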
|
|
|
|
|
#54 | |
|
Sep 2002
Database er0rr
11115₈ Posts |
Quote:
Start with fans at 170; monitor the temperatures and, depending on your overclock, undervolt and ambient temperature, you might be able to reduce the fan speed. Last fiddled with by paulunderwood on 2020-02-03 at 22:57 |
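The "170" here is presumably a raw fan PWM value on a 0-255 scale rather than a percentage; that reading is an assumption, but it fits the 21.96% fan figure in the listing in post #53, which is what a PWM value of 56 works out to. A tiny helper for converting between the two (the name `pwm_to_percent` is made up for illustration):

```shell
# Hypothetical helper: convert a raw PWM value (0-255) to a percentage
# with two decimal places.
pwm_to_percent() {
    awk -v pwm="$1" 'BEGIN { printf "%.2f\n", pwm * 100 / 255 }'
}

# Typical use:
#   pwm_to_percent 170
```

On that scale, 170 corresponds to roughly two thirds of full fan speed.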
|
|
|
|
|
|
#55 | |
|
∂²ω=0
Sep 2002
República de California
2²×2,939 Posts |
Quote:
Per the readme, single minus sign ... from within a subdir 'run0' where I have created a worktodo.txt file containing a pair of PRP assignments, I tried 'sudo ../gpuowl -user ewmayer' ... after entering my sudo password the run echoed the same output as the 2nd #fail above, just with an added 'config: -user ewmayer' line. Trying instead to log in as root and run that way [this is the Ubuntu 19.10 setup I created last week] using the same pwd gives 'Authentication failure'. I don't recall entering any other pwd during the set-pwd phase of Ubuntu 19.10 setup. Not needed yet since I can't run at all, but how do I determine the max stock voltage of my R7? |
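On the stock-voltage question, one possible approach is to read the same pp_od_clk_voltage file that the init script writes to, before any overrides have been applied. The sketch below assumes the file lists the voltage curve under an OD_VDDC_CURVE heading, with lines of the form "2: 1801Mhz <N>mV"; that layout is an assumption here, and `stock_max_mv` and `suggested_mv` are made-up helper names.

```shell
# Hypothetical helper: read pp_od_clk_voltage text on stdin and print the
# highest voltage (in mV) listed in the OD_VDDC_CURVE section.
stock_max_mv() {
    awk '
        # A new section header: note whether it is the voltage curve
        /^OD_/ { in_curve = ($1 == "OD_VDDC_CURVE:"); next }
        # Curve lines look like "<point>: <clock>Mhz <voltage>mV" (assumed)
        in_curve && NF >= 3 {
            v = $3
            sub(/[mM][vV]$/, "", v)
            if (v + 0 > max) max = v + 0
        }
        END { if (max) print max }
    '
}

# Hypothetical helper: 50 mV below the stock maximum, following the comment
# in the init script quoted in post #53.
suggested_mv() {
    mv=$(stock_max_mv)
    [ -n "$mv" ] && echo $((mv - 50))
}

# Typical use, before writing any "vc" overrides:
#   stock_max_mv < /sys/class/drm/card0/device/pp_od_clk_voltage
```

The number printed by `suggested_mv` would take the place of the 1010 in the script's "vc 2 1801 1010" line, as a starting point to tune down from.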
|
|
|
|