#133

1BCB₁₆ Posts
Quote:
Just had a power outage. When power came back, one machine rebooted, and running gpuowl as root hit the same problem ("clGetDeviceIDs ..." etc.). I rebooted the machine again, and after that things were back to pristine and I could run gpuowl as root. So first, maybe this has to do with some corrupted inode that got repaired on the second reboot? I don't know. Second, it was a transient error, so diagnosing it may be complicated.
Last fiddled with by SELROC on 2019-06-27 at 19:09

#134

P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts
I might should post in the unhappy me thread instead. I bought a second Radeon VII (this one is XFX brand, the first was an ASRock). I think I lost the good chip lottery. Before I RMA it, I'll ask here if there are other settings I should try. BTW, I believe it is unethical to return a product that works as advertised at stock settings.
I bought a refurbed Corsair Platinum 1000W power supply to ensure solid power delivery to the card. Stock voltage is 1121mV. Running two instances at perf level 4, fan at 180, no memory overclocking, I get gpuowl errors a couple of times a day. Upping the voltage to 1200mV still gets gpuowl errors. Note that perf level 4 uses 918mV at stock settings and 937mV when overvolting. Temps are 95C, power used is 217 watts. I briefly tried perf level 3 without success. I've not tried underclocking memory.
Last fiddled with by Prime95 on 2019-06-28 at 18:55

#135

2²·13·67 Posts
Quote:
My Radeon VII is Gigabyte brand, with a CoolerMaster 1200W PSU. They guarantee that the board works as advertised at stock settings; when you overclock you run outside stock parameters, and they say "this may void the guarantee". IMHO the goal is to compute with the fewest errors possible, so running at stock settings is not bad, even with slightly lower performance. About the all-zeroes residue error: I have not found how to reproduce it; probably my Radeon VII just deserves a better mainboard.

#136

"Mihai Preda"
Apr 2015
2²×3×11² Posts
Quote:

#137

10001001010010₂ Posts
Quote:
I also suspect my Radeon VII is buggy. In addition to the all-zero residue error, it has now started doing bad computations, with gpuowl signaling EE and reloading, but without the all-zero residue.

#138

"Mihai Preda"
Apr 2015
2²·3·11² Posts

#139

2×17×151 Posts

#140

5724₈ Posts
Quote:
My Radeon VII is extremely sensitive to ambient temperature.
With case fan: 89M exponent, 909 us/sq
Without case fan: same exponent, 915 us/sq

#141

Bemusing Prompter
"Danny"
Dec 2002
California
100111001000₂ Posts
Do newer versions of GCN offer any advantages over the 3rd generation that would benefit GIMPS?
Last fiddled with by ixfd64 on 2019-07-02 at 22:08

#142

P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts
Quote:
@SELROC: My buggy GPU has both non-zero and all-zero residue errors.

#143

7652₈ Posts
Quote:
Yes, I understand perfectly, and such errors are starting to be a common thing. My guess is that the R7 is really sensitive to temperature. I keep the HVAC on and still get errors, but not every day; some days pass without any. Another recurring error is an amdgpu PowerPlay bug. This one is hard: it needs a machine power cycle.