[QUOTE=ewmayer;545399]I don't need to do that on my Haswell+R7 system ... but on this new build, despite the gpuowl executable showing regular-user x permission, that seems to do the trick, back up and running. Thanks!
Still not understanding why clinfo would show 0 devices post-boot, find none when attempting to run the program as a regular user, then suddenly find the card when run via sudo ... is gpuowl initing the card now, when run via sudo?[/QUOTE] It's ugly, but [URL="https://mersenneforum.org/showthread.php?t=25065&highlight=sudo&page=2"]see here[/URL] on how to run without sudo.
Did you add yourself to the video group? This solves the run-as-root problem.
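(For anyone hitting the same wall later: a minimal sketch of the check/add sequence being suggested here, assuming an Ubuntu-style install where ROCm's device nodes are group-owned by [c]video[/c]; newer stacks may also want the [c]render[/c] group, and [c]$USER[/c] is just whatever login you run gpuowl under.)

[code]
# which groups does the current login already have?
id

# add the current user to the video group (and render, if that group exists)
sudo usermod -aG video $USER
getent group render >/dev/null && sudo usermod -aG render $USER

# group changes only apply to *new* logins -- log out and back in, then confirm:
id
[/code]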
[QUOTE=paulunderwood;545401]It's ugly, but [URL="https://mersenneforum.org/showthread.php?t=25065&highlight=sudo&page=2"]see here[/URL] on how to run without sudo.[/QUOTE]
Well, I run everything and do all my work logged in as [I]root[/I]. :loco: I know, go ahead and roast me if you want, but I've been doing it that way for 40 years now and can't remember a time when I wished I hadn't been logged in as root. :weirdo:
[QUOTE=PhilF;545403]Well, I run everything and do all my work logged in as [I]root[/I]. :loco:
I know, go ahead and roast me if you want, but I've been doing it that way for 40 years now and can't remember a time when I wished I hadn't been logged in as root. :weirdo:[/QUOTE] :davar55: I run gpuowl as root on Buster; everything else as a user, apart from admin stuff :wink:
[QUOTE=Prime95;545402]Did you add yourself to the video group? This solves the run-as-root problem.[/QUOTE]
I did the same setup (as best I can recall) as on my Haswell system, including this step: [i] echo 'SUBSYSTEM=="kfd", KERNEL=="kfd", TAG+="uaccess", GROUP="video"' | sudo tee /etc/udev/rules.d/70-kfd.rules [/i] Has that changed under ROCm 3.3, or was there always a separate step which I somehow omitted capturing in the setup recipe I crafted from my first such go-round? Not a biggie in any event - no objection to prefixing sudo to the run command if that's what this setup needs.
[QUOTE=ewmayer;545405]I did the same setup (as best I can recall) as on my Haswell system, including this step:
[i] echo 'SUBSYSTEM=="kfd", KERNEL=="kfd", TAG+="uaccess", GROUP="video"' | sudo tee /etc/udev/rules.d/70-kfd.rules [/i] Has that changed under ROCm 3.3, or was there always a separate step which I somehow omitted capturing in the setup recipe I crafted from my first such go-round? Not a biggie in any event - no objection to prefixing sudo to the run command if that's what this setup needs.[/QUOTE] That does not add you to the group. I think usermod is the command.
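To separate the two halves of the problem (device-node permissions vs. group membership), one can check the nodes directly - a rough sketch, with paths as on a stock ROCm install, adjust as needed:

[code]
# is the rule in place, and what group owns the compute device nodes?
cat /etc/udev/rules.d/70-kfd.rules
ls -l /dev/kfd /dev/dri/renderD*

# after editing a udev rule, reload and re-trigger rather than rebooting
sudo udevadm control --reload-rules
sudo udevadm trigger
[/code]

The udev rule only sets the group/mode on [c]/dev/kfd[/c]; it does nothing to put your login into that group.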
[QUOTE=Prime95;545406]That does not add you to the group. I think usermod is the command[/QUOTE]
Does [c]id[/c] say you are in the video group? If not, do [C]sudo usermod -aG video ewmayer[/C] (if that is your username) [U]and[/U] log out and back in, then run [c]id[/c] again to see.
[QUOTE=paulunderwood;545411]Does [c]id[/c] say you are in the video group? If not, do [C]sudo usermod -aG video ewmayer[/C] (if that is your username) [U]and[/U] log out and back in, then run [c]id[/c] again to see.[/QUOTE]
id shows [code]uid=1000(ewmayer) gid=1000(ewmayer) groups=1000(ewmayer),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),119(lpadmin),130(lxd),131(sambashare)[/code] Doing 'sudo usermod -aG video ewmayer' produces no change, so either the above includes the video group under some other rubric, or something else is going on.

[b]Edit:[/b] paul PMed me to remind me to log out/in after doing the above, and now we indeed see the desired group: [code]uid=1000(ewmayer) gid=1000(ewmayer) groups=1000(ewmayer),4(adm),24(cdrom),27(sudo),30(dip),[b]44(video)[/b],46(plugdev),119(lpadmin),130(lxd),131(sambashare)[/code] ...but I still can't run w/o 'sudo' - ah well, not a big deal, just an odd difference from my Haswell system's R7 setup, which is still under ROCm 2.10.

Other stuff ... my file ownership in the gpuowl dir shows this awkward mix of root and ewmayer: [code]ewmayer@ewmayer-gimp:~/gpuowl/run0$ ls -ltc
total 752
-rw------- 1 ewmayer ewmayer 654272 May 15 11:56 nohup.out
drwxr-xr-x 2 root    root      4096 May 15 11:55 104954389
-rw-r--r-- 1 ewmayer ewmayer  82349 May 15 11:55 gpuowl.log
-rw-r--r-- 1 ewmayer ewmayer    356 May 15 07:57 results.txt
-rw-r--r-- 1 root    root       531 May 15 07:57 worktodo.txt
-rw-r--r-- 1 ewmayer ewmayer    590 May 15 07:57 worktodo.txt-bak
drwxrwxr-x 2 ewmayer ewmayer   4096 May 15 07:53 104954387[/code] [Edit: used chown -R ewmayer:ewmayer * within each of my run subdirs to uniformize ownership.]

Wrestled R7 #2 into place yesterday afternoon; the design of the open-frame chassis I'm using makes this really, really awkward, with card #1 less than an inch away and blocking physical access from that side. More awkwardness attaching the bracket supports. Gonna shut down and plug in power now, see if we have life on reboot ... and, we're live!

Baseline system load is 40W. With both R7s set at sclk=4, each card running 2 gpuowl instances adds 240W ... thus with 2 cards I am at 520W. Once I add card #3 I will need to fiddle the expected 760W at the current settings down to the 600-650W range, so as not to overload my 850W power supply.

EDIT: As noted above, the spacing between cards 1 and 2 is very tight, a mere 0.8" (20mm) gap ... at my normal fan=120 setting that causes card #1 to throttle, since that card's fan array is airflow-restricted by the narrow spacing: [code]GPU  Temp   AvgPwr  SCLK     MCLK     Fan     Perf    PwrCap  VRAM%  GPU%
0    87.0c  198.0W  1547Mhz  1001Mhz  57.65%  manual  250.0W    6%   100%
1    70.0c  195.0W  1547Mhz  1001Mhz  57.65%  manual  250.0W    2%   100%[/code] Even upping #1's fan to 150, temp is 83C and run timings show big-time throttling ... so that card needs sclk=3, which cuts 40W and gives temp ~75C, same as card #2 with its unrestricted airflow running at sclk=4. With those sclk settings, each card #1 job is getting ~1430 us/iter @5.5M FFT, compared to 1350 us/iter for each of the pair of card #2 jobs.

Adding card #3 should cause no such airflow issues, since that is the one that will go into a right-angle PCI-slot adapter and thus sit at a 90-degree offset relative to its neighbor #2 ... but test-fitting card #3 into the intended place shows that I will need a further regular [url=https://www.amazon.com/GODSHARK-PCI-Express-Extension-Protector-Adapter/dp/B07RWRK2L6/]straight PCI slot extender[/url] in order to avoid interference with the other components (including the power/reset switch bundle); just ordered that. (It's like Legos! :) ) Further, over the weekend I need to construct some kind of test-frame extender to physically support that card.
Hopefully I'll have that all done, and some cute family photos for y'all, come Monday or Tuesday.
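For reference, the per-card clock and fan juggling described above is presumably done via rocm-smi; a rough sketch of the kind of commands involved (device indices, levels and fan values as discussed above; the exact flag spellings are from the ROCm 2.x/3.x-era rocm-smi and may differ on other releases):

[code]
# cap the airflow-starved card 0 at sclk level 3 with a higher fan value,
# leave card 1 at level 4
sudo rocm-smi -d 0 --setsclk 3 --setfan 150
sudo rocm-smi -d 1 --setsclk 4 --setfan 120

# show the temp/power/clock table quoted above
rocm-smi
[/code]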
Extender bracket for GPU #3 ...almost there!
2 Attachment(s)
Spent several hours yesterday hacking (as in hacksawing and drilling and metal-bending) an aluminum bracket which I'll need to mount on the end of my new compute rig to physically support the last GPU it will host. Long-time forumites will enjoy this - the Alu. rod I used is one of the stiffeners from the soft full-leg casts I had to wear for 3 mos. after leaving Stanford Med Ctr back in late 2005, after getting creamed in a crosswalk by a red-light-running SUV. The casts got tossed, but I figured the 2 side stiffeners from each might come in handy for the old "home construction materials" cache. And so they have.

The cross-section is a flat rectangle with rounded corners, with a semicircular-cross-section vane running down the center of one of the 2 long sides of the rectangle for added stiffening on the T-bar principle. That vane fits neatly into the side channels of the extruded Alu. frame elements of my testbench frame - metallic red in the pics. Those channels are for little square screwhole plates used for affixing various stuff to the frame, and they threw in extra plates and screws, so I used some of those to attach the hack-a-bracket.

That first required me to solve a problem of interference with the bracket which holds the vertical metal rods (mostly hidden to the left of the 2 Radeon GPUs in pic1) that attach to the right-angle side of the GPU mounting brackets - had to unscrew all that and slide the rod-holding bracket an inch further away from the corner of the frame, a major several-hour PITA, as this part of the test frame is very poorly thought out: way too many moving parts and tiny screws, which are nonmagnetic to boot, meaning you can't simply stick 'em on the magnetic tip of your favorite Phillips-head screwdriver and have them stay in place as you delicately maneuver them into the tight spaces. I am seriously considering my own testframe design as a project for next year; it would be sturdier, 10x simpler to work with, and probably quite a bit cheaper to manufacture, under the hypothetical-mass-production assumption of labor and material costs being equivalent.
Once all that was done, I plugged in the 2nd PCI-slot adapter - which arrived just a few hours earlier - and test-maneuvered GPU #3 into place. (The first adapter rises a further 3/4" up off the board and then has a right-angle bend; this one is just a straight 3/4" riser, needed to keep the GPU from interfering with the power-and-many-other-pinouts side of the mobo, visible between the bracket and the closer of the 2 GPUs in pic1, just above the edge of the cardboard box I use to make the build easy to slide around on my workbench.) That showed that one problem remains: I need to add some plastic vertical spacers to the extender-support bracket to raise that side of the GPU to a proper height. That - fingers crossed - should be relatively straightforward.

Thus ends today's exciting episode of "Ernst's dream build" -- tune in again tomorrow for the thrilling (hopefully) conclusion!
Update: And we are up and running on all 3 GPUs ... had to downclock all of them to sclk = 3 to keep the total wattage reasonable; with 6 gpuOwl instances I am drawing ~675W, which seems reasonable, if at the upper end of reasonable, for 24/7 running on this 850W Corsair 80+ Gold power supply.
Each of the 6 jobs @5.5M FFT is getting ~1430 us/iter, which means ~40 hrs per instance to do a 105M exponent. Including the first Radeon VII I bought for my Haswell ATX-case system, I will be doing ~5 such PRPs per day on average, though the temps on the new build are such that that will have to be cut to sclk = 2 on warm days.

Pretty pics will have to wait until I tidy up the spaghetti mess of cabling a bit. Fan noise is a non-annoyingly-loud (at least to my not-unbiased ears :) steady white-noise whoosh ... with GPU #3 adding around 5in to the length of the system footprint it's not exactly compact - around 3" longer than the full-length keyboard sitting next to it and a smidge less than a foot (30cm) wide - but it might actually fit on my desk if I invest in a suitable-size bridge to put it under (in proper troll-like fashion), with my monitor on top.

Thanks to all for the advice, support and patiently sitting through the past month's long build soap opera! I hope this thread will prove useful for others contemplating such a multi-GPU build.
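As a back-of-envelope check on those throughput numbers (assuming one PRP iteration per bit of the exponent, and that the Haswell's R7 also runs two jobs at roughly the same per-iteration time):

[code]
# 1430 us/iter at a ~105M-bit exponent:
echo "scale=1; 1430 * 105000000 / 10^6 / 3600" | bc   # ~41.7 hours per exponent

# 8 concurrent instances (3 new GPUs + the Haswell R7, 2 jobs each):
echo "scale=1; 8 * 24 / 41.7" | bc                    # ~4.6 exponents per day
[/code]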
Congrats on getting your 3-GPU rig running. With the one on your Haswell, that is equivalent to ~40 4-core CPUs, but far more efficient in power usage and overall size. Now you should soon be propelled into the top 10 testers and have a better chance of striking gold and a place in Mersenne prime history. Surely one will be found this year!