2021-02-12, 21:55 #23 ewmayer ∂²ω=0 Sep 2002 República de California 3²·1,303 Posts
I run 2 instances per card on each of my Radeon VIIs for 2 reasons: 1. Gives a total throughput boost in the 7-10% range; 2. If one job hangs or crashes - infrequent, but it does happen - one minimizes the total throughput hit. Even if one has a GPU model where 2-instances is slightly slower in total-throughput terms - say no more than 5% - [2] makes it worth doing, IMO. On the R7 I found negative benefit from > 2 instances.
OK, gotcha. Guess I'll test it out and if the throughput is the same or slightly lower, I'll go with 2.

On a GeForce GTX 970 I'm getting worse results: 12.3 ms/iter when running one instance, but an average of 28.6 ms/iter per instance when running 2. That's about 16% more time per iteration, i.e. roughly 14% lower throughput.
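
For anyone wanting to repeat this comparison on their own card, here is a minimal sketch of the arithmetic. It assumes the multi-instance figure is the per-instance average with all instances running; the function name is made up for illustration.

```shell
# Combined throughput change (in percent) when going from one instance
# to N instances.
# Arguments: single-instance ms/iter, per-instance ms/iter with N instances
# running, and N (defaults to 2).
throughput_change_pct() {
    awk -v one="$1" -v multi="$2" -v n="${3:-2}" 'BEGIN {
        single = 1000 / one          # iterations/s with one instance
        total  = n * 1000 / multi    # combined iterations/s with n instances
        printf "%.1f\n", (total / single - 1) * 100
    }'
}

throughput_change_pct 12.3 28.6    # prints -14.0
```

With the GTX 970 timings, two instances at 28.6 ms/iter are equivalent to 14.3 ms per iteration overall, versus 12.3 ms for a single instance. That is about 16% more time per iteration, or about 14% fewer iterations per second.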

2021-02-20, 09:16 #26 ZFR Feb 2008 Meath, Ireland 3·61 Posts
Actually, never mind. That was when I tested by putting the assigned exponent into worktodo, which does a P-1 on it simultaneously? When I simply test with the -prp arguments, the throughput on running two instances is only 2% worse. I'll just use 2 instances for the in-case-it-hangs-or-crashes reason.
Success report

Hi! I did a clean install of Ubuntu 20.04.2 LTS on a Gigabyte Aorus X570 I Pro Wifi with an AMD Ryzen 5950X CPU and a Radeon VII card. The instructions worked fine, except for the diff below, and it is now churning through its first PRPs (both gpuowl and mprime).

Please add 'make' here -- as it is not installed by default or anything else.
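
A quick way to confirm which build tools are missing on a fresh install is sketched below. The helper name and the tool list are only illustrative, not part of the original instructions. On Ubuntu, the `build-essential` meta package provides gcc, g++ and make.

```shell
# Print the names of any requested tools that are not found on PATH.
# The tool list is whatever the caller passes in.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools gcc g++ make)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
    echo "On Ubuntu, 'sudo apt install build-essential' should provide them."
else
    echo "gcc, g++ and make are all available."
fi
```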

Thanks, Simon
Last fiddled with by jas on 2021-05-25 at 20:54

2021-05-25, 21:46 #28 ewmayer ∂²ω=0 Sep 2002 República de California 3²×1,303 Posts
@Simon - done, thanks - I put the clinfo-issue fix in the Troubleshooting section of the OP. 16GB should be OK with just 1 instance running, but suggest watching progress through at least one p-1 stage 2 ('P2' in the progress-line just right of the exponent): If that takes more than a few hours, maybe ratchet things back to 14-15GB. (I always run 2 instances per R7, but perhaps other 1-instance runners can comment on whether maxAlloc of 16GB works for their p-1 stage 2s).
Last fiddled with by ewmayer on 2021-05-25 at 21:47

2021-05-26, 09:25 #29 M344587487 "Composite as Heck" Oct 2017 19×47 Posts
Quote: Originally Posted by jas
... Please add 'make' here -- as it is not installed by default or anything else.
Might be best to replace gcc with build-essential instead, it's a meta package that installs gcc/g++/make and probably a few other parts commonly used in a standard libc toolchain. BTW make not installed by default is criminal, should be punishable by a week of having to exclusively use Hannah Montana Linux :P

2021-05-26, 21:11 #30 ewmayer ∂²ω=0 Sep 2002 República de California 11727₁₀ Posts
Quote: Originally Posted by M344587487
Might be best to replace gcc with build-essential instead, it's a meta package that installs gcc/g++/make and probably a few other parts commonly used in a standard libc toolchain.
Good suggestion - done.

2021-05-27, 03:57 #31 LaurV Romulan Interpreter "name field" Jun 2011 Thailand 2×17×293 Posts
[offtopic] Or, not really off topic, more like a praise to your tutorial, but from a different angle... I have a couple of Radeon VII cards that I wanted to use for mining, but they got a ridiculously low hash rate under win7, i.e. half of what I read on the dedicated forums they should achieve.
Struggled with windoze drivers for a while, went as far back as installing drivers from 2015, and wasted a week on it (in the evenings, after working time, come home, eat, take a shower, 4-5 hours in front of the computer every evening!, even one or two sleepless nights), but to no result. With this or that (old) driver I got more or less hash rate, but no cigar. As I didn't like the "dedicated for mining" operating systems (like HiveOS or so, you never know where your profit goes...) I decided to run Ubuntu from a stick. I disconnected all my disks (afraid of bugs in the mining software, I won't install that on a computer where I do the banking and internet shopping stuff), ran the 'buntu, from a stick, got some courage after dealing with colab and putting my nose into other people's scripts (Teal, Dan, Chris, DanaJ, etc, thank you all!), then I put Ubuntu on a 64G external SSD (thing costs $8 on Lazada and it is bloody fast!), then wasted a few more days (!) - no joke - dealing with installing AMD drivers and trying to convince OpenCL to run on my mini-buntu. That is f'king laborious stuff, did you see the tutorials on the web? (rhetorical question, no need to answer). Grrr... I learned more Linux with this task than I learned in the last 35 years. I mean, we had a 2 or 3 semesters Linux course in the last years of the uni, but that was in the very incipient phase of it, we (collective we) were thinking more how to convince the female colleagues to go on dates with us than to write bash files, that was 35 years ago, and I never used it seriously ever since, except for small stuff now and then. In conclusion, I suck at Linux. However. I was almost ready to give up on making OpenCL run on it, I followed different tutorials, etc., I even got to the stage that I was running the clinfo from a directory and all cards were properly identified, but when I was running the same clinfo from another folder there was no graphic card installed. Which pissed me off terribly.
But then I said, ok, if I am in this stage already, let's see if I can get gpuOwl running, to see at least if the Linux version is faster than the Windows version. Hint: it is not. They are the same. More or less. But to get to that stage, I had to follow the tutorial from this thread. Which is mostly similar to other tutorials I already followed, from other forums or youtube. After some more adventures owing to my stupidity, it got me running OpenCL and gpuOwl eventually. After seeing that the Linux version is not faster at doing PRP and P-1, I said, ok, back to windows, and I was ready to unplug the SSD and the stick, and put back the windoze HDDs. But, you know the drill, if we are here, let's see if the miner runs, before demolishing the house.... IT DID! I don't know what went wrong, and then, what went right, but we are mining ETH with ~90 Mhashes per second per card after we followed your tutorial on installing OpenCL. No "speed patch" applied (there is a collection of them on the web, but they don't "smell good", we still need the cards in the future!, so we don't want the magic smoke out yet, therefore no undervoltage and no overclocking for now! - albeit people say such unorthodox tricks can bring you about 20-25% more hash rate). Edit: the million dollars question is if such unorthodox tricks can bring you 25% more GHzdays when crunching primes too... We may study that later... :huh_don_t_we_have_a_smiley_turning_pages_?: [/offtopic] Last fiddled with by LaurV on 2021-05-27 at 05:01
I think the benefit will depend on the GPU model and the work. IIRC for disparate work between the two instances, the performance gain may be smaller, or there may be a loss. (Running very different FFT lengths may result in less throughput than a single instance of either length.)

Re the sometimes-two-instances-performance-penalty: in that case, why not use a shell script so the two instances ALTERNATE, one starting when the other crashes or runs out of queued work? I suggest a short delay between runs and perhaps a maximum loop count. Without either, an A-B loop will fill the logs with garbage when both instances are out of work, or a driver, lib or symlink has gone bonkers, or the GPU has got into a must-crash-app state. (Cue the anything Windows can do, Linux can do better chorus...;)
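
A minimal sketch of such an A-B loop follows, with a delay between runs and a cap on the number of loops. The function name, directory layout, delay and loop count are all placeholders. It also assumes each run directory holds its own worktodo file and that the job command exits when it crashes or runs out of work.

```shell
#!/bin/sh
# Alternate between two work directories: when the job in one exits
# (crash or no queued work), pause, then start the job in the other.
# Arguments: dir A, dir B, maximum loop count, delay in seconds,
# followed by the command to run inside each directory.
alternate_jobs() {
    dir_a=$1 dir_b=$2 max_loops=$3 delay=$4
    shift 4                      # remaining arguments: the command to run
    i=0
    while [ "$i" -lt "$max_loops" ]; do
        for dir in "$dir_a" "$dir_b"; do
            echo "loop $i: starting job in $dir"
            status=0
            ( cd "$dir" && "$@" ) || status=$?
            echo "loop $i: job in $dir exited with status $status"
            sleep "$delay"       # pause so a crash loop cannot flood the logs
        done
        i=$((i + 1))
    done
    echo "reached $max_loops loops, stopping"
}

# Hypothetical usage (paths are assumptions, adapt to your own setup):
# alternate_jobs "$HOME/run0" "$HOME/run1" 100 30 "$HOME/gpuowl/gpuowl"
```

Only one job runs at a time here, so this trades the possible two-instance throughput gain for automatic failover. The loop cap bounds how much log output a persistent failure can generate.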

Should the original post be updated from

This was the first stumbling block as I try to get an AMD device running on Ubuntu.

