mersenneforum.org  

Old 2016-07-11, 12:21   #45
bgbeuning
 
Dec 2014

3·5·17 Posts
update

I moved the RX 480 card to a newer, stable machine (it has DDR4, so it can't be too old)
but got similarly unhappy results. mfakto 0.14 does well on the long self test but
almost always fails the short self test. Compiling the source with SDK 3.0 gives
70 failures out of 30,000 in the long self test.

This time I also tried compiling with SDK 2.9 but the resulting mfakto ran 30 times
slower.

I assume this is a software problem and the hardware is OK.
Old 2016-07-11, 13:43   #46
airsquirrels
"David"
Jul 2015
Ohio

517₁₀ Posts

Quote:
Originally Posted by bgbeuning View Post
I moved the RX 480 card to a newer, stable machine (it has DDR4, so it can't be too old)
but got similarly unhappy results. mfakto 0.14 does well on the long self test but
almost always fails the short self test. Compiling the source with SDK 3.0 gives
70 failures out of 30,000 in the long self test.

This time I also tried compiling with SDK 2.9 but the resulting mfakto ran 30 times
slower.

I assume this is a software problem and the hardware is OK.
I have my RX480 in hand now, so I will take a look at resolving this. I'm not convinced that VectorSize 2 is the right answer. If I recall correctly, mfakto uses a hardcoded list of cards to determine the vector size rather than an actual compatibility test, so you may wish to force VectorSize 4 and the GCN2 or GCN3 setting from mfakto 0.15pre5.
Old 2016-07-11, 18:33   #47
bgbeuning
Dec 2014

3·5·17 Posts

I will try that version.

I forgot to mention the first machine had Windows 7
and the second machine Ubuntu LTS 16 with new AMD driver.
Old 2016-07-13, 13:13   #48
VictordeHolland
"Victor de Hollander"
Aug 2011
the Netherlands

49B₁₆ Posts

AMD has released new drivers that 'fix' power consumption:
http://www.anandtech.com/show/10477/...umption-issues

Driver version number is 16.7.1
Quotes from the Anandtech article:
Quote:
  1. Shift some of the power load off of the PCIe Graphics (PEG) slot connector in order to bring PEG slot power consumption within the PCIe spec. This doesn’t reduce total power consumption and performance is unaffected; power delivery is merely shifted. Based on earlier data this will put the 6-pin connector further over spec, but the vast majority of PSUs are very tolerant of this going out of spec.
  2. Because total power consumption of RX 480 can still exceed 150W – and as a result also exceed the limits for the 6-pin connector – AMD has also implemented an optional “compatibility” toggle that reduces the total power consumption of the card. This is to better ensure that both the PEG slot and 6-pin power connector stay below their respective limits. Since the RX 480 is already throttling at times due to power limits, this does hurt performance, but it's also the most standards-compliant solution.
They also tested power consumption with the "compatibility mode" on: it lowered power draw by 18W under Crysis 3 and by 13W under FurMark. Average GPU clockspeed in both cases was reduced by about 50MHz in that mode.
Old 2016-07-13, 20:36   #49
Xyzzy
Aug 2002

21D2₁₆ Posts

We are running the 16.7.2 driver.

Non-WHQL-64Bit-Radeon-Software-Crimson-16.7.2-Win10-Win8.1-Win7-July9.exe

Code:
Radeon Settings Version - 2016.0708.1511.25486
Driver Packaging Version - 16.20.1035.1001-160708a-304447E
Provider - Advanced Micro Devices, Inc.
2D Driver Version - 8.1.1.1558
Direct3D® Version - 9.14.10.1197
OpenGL® Version - 6.14.10.13441
OpenCL™ Version - 2.0.6.0
AMD Mantle Version - 9.1.10.123
AMD Mantle API Version - 98309
AMD Audio Driver Version - 10.0.0.3
Vulkan Driver Version - 1.2.0
Vulkan API Version - 1.0.17
Old 2016-07-20, 18:29   #50
bgbeuning
Dec 2014

3·5·17 Posts

Any update "Rocky"?

I went looking for mfakto 0.15pre5 but only found it for windows
and my card is currently in a Linux box.
Old 2016-07-20, 21:57   #51
airsquirrels
"David"
Jul 2015
Ohio

11×47 Posts

Quote:
Originally Posted by bgbeuning View Post
Any update "Rocky"?

I went looking for mfakto 0.15pre5 but only found it for windows
and my card is currently in a Linux box.
Real-world work has been distracting, but I should be able to take a look this weekend. Somewhere on the forums there is a Linux pre5 build posted that has my memory access fix. It should have those GCN versions.

Even with 0.14 you should be able to force the vector size and set it to GCN.
Old 2016-07-31, 02:24   #52
airsquirrels
"David"
Jul 2015
Ohio

517₁₀ Posts

I finally got my Linux system up and running with the RX480.

Notes:
1. For testing this I am running latest Debian Sid with the 4.7.0 Kernel release configured with the built-in AMDGPU driver and latest linux-firmware polaris firmware, which is a bit ahead of the DKMS driver you'll get from the AMDGPU-PRO package. I was hoping this would mean more stability but alas, that does not seem to be the case.

2. I'm using the mfakto-pre5 branch with my local-memory GPU patch. This is the same build that works flawlessly on my fglrx systems.


The OpenCL device info interestingly seems to be off; note that clinfo shows the same for the Fury Nano.
name Ellesmere (Advanced Micro Devices, Inc.)
device (driver) version OpenCL 1.2 AMD-APP (2117.7) (2117.7 (VM))
maximum threads per block 256
maximum threads per grid 16777216
number of multiprocessors 14 (896 compute elements)
clock rate 555MHz

mfakto self tests are failing 64-68 tests on both the RX480 and the Fury Nano in the system, so I suspect a problem in the AMDGPU driver.

That said, for a 74-bit factor I'm seeing 476.34 GHz-days/day vs ~600 for the Fury Nano (~20% slower).

clLucas tests show the card is pretty decent, performing just 15% slower than a Fury Nano and 30% slower than a Fury X, and kicking out a 4096K FFT result in less than 5 days.

RX480
Iteration 10000 M( 74207281 )C, 0xaa08c91f2f626775, n = 4096K, clLucas v1.04 err = 0.1416 (0:56 real, 5.6221 ms/iter, ETA 115:51:42)
Iteration 20000 M( 74207281 )C, 0xa216434787875d0f, n = 4096K, clLucas v1.04 err = 0.1416 (0:57 real, 5.6945 ms/iter, ETA 117:20:16)
Iteration 30000 M( 74207281 )C, 0x35b1ad9d5eba82cb, n = 4096K, clLucas v1.04 err = 0.1416 (0:57 real, 5.6536 ms/iter, ETA 116:28:49)

Fury Nano (Same Drivers)
Iteration 10000 M( 74207281 )C, 0xaa08c91f2f626775, n = 4096K, clLucas v1.04 err = 0.1416 (0:48 real, 4.8029 ms/iter, ETA 98:58:48)
Iteration 20000 M( 74207281 )C, 0xa216434787875d0f, n = 4096K, clLucas v1.04 err = 0.1416 (0:49 real, 4.8482 ms/iter, ETA 99:53:59)
Iteration 30000 M( 74207281 )C, 0x35b1ad9d5eba82cb, n = 4096K, clLucas v1.04 err = 0.1416 (0:48 real, 4.8154 ms/iter, ETA 99:12:38)

Fury X (old fglrx drivers)
Iteration 10000 M( 74207281 )C, 0xaa08c91f2f626775, n = 4096K, clLucas v1.04 err = 0.1416 (0:40 real, 3.9611 ms/iter, ETA 81:37:57)
Iteration 20000 M( 74207281 )C, 0xa216434787875d0f, n = 4096K, clLucas v1.04 err = 0.1416 (0:39 real, 3.9297 ms/iter, ETA 80:58:24)
Iteration 30000 M( 74207281 )C, 0x35b1ad9d5eba82cb, n = 4096K, clLucas v1.04 err = 0.1416 (0:40 real, 3.9166 ms/iter, ETA 80:41:37)


clLucas seems to work without any mistakes (residues matched for all of the first DC test I did.)
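
For anyone who wants to check the percentages, here is a minimal sketch that recomputes them from the ms/iter figures in the log lines above. The helper names (`slowdown`, `days_for_test`) are made up for illustration, and it assumes an LL test takes roughly as many iterations as the exponent.

```python
# ms/iter readings copied from the clLucas log lines in this post.
MS_PER_ITER = {
    "RX480": [5.6221, 5.6945, 5.6536],
    "Fury Nano": [4.8029, 4.8482, 4.8154],
    "Fury X": [3.9611, 3.9297, 3.9166],
}

# Assumption: an LL test of M(p) needs roughly p iterations.
EXPONENT = 74207281


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def slowdown(card, reference):
    """Fraction of throughput `card` loses relative to `reference`."""
    return 1.0 - mean(MS_PER_ITER[reference]) / mean(MS_PER_ITER[card])


def days_for_test(card):
    """Estimated wall-clock days for one full test at the mean ms/iter."""
    seconds = mean(MS_PER_ITER[card]) * EXPONENT / 1000.0
    return seconds / 86400.0


if __name__ == "__main__":
    for ref in ("Fury Nano", "Fury X"):
        print(f"RX480 vs {ref}: {slowdown('RX480', ref):.1%} slower")
    print(f"RX480 full test: {days_for_test('RX480'):.2f} days")
```

The slowdown is measured in throughput (iterations per second), which is why it divides the reference card's time by the RX480's time rather than the other way round.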
Old 2016-07-31, 03:37   #53
airsquirrels
"David"
Jul 2015
Ohio

1005₈ Posts

Sorry to reply to myself, but in case bdot stumbles in here...

So far I've established that there is some kind of corner case in the AMDGPU driver, which is required for the RX480 but also used for other cards. This causes some of the self test cases to fail for any card, even cards that work fine using the fglrx driver.

The tricky part is that a given self test does not fail every time, yet not every test seems eligible to fail. For example, M53017183 seems prone to failing but does not fail every time.

I am currently running 100 self tests overnight so I can compare the results for which tests tend to fail, and then I should be able to run some kernel traces to figure out what magical thread-scheduling or timing glitch is causing the failure and likely file a bug with AMD.
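
A rough sketch of the kind of tally this comparison needs, assuming each self-test run has already been reduced to the set of exponents that failed in it. The function names and the toy input are hypothetical; the labels are placeholders, not real results.

```python
from collections import Counter


def failure_rates(runs):
    """Map each exponent to the fraction of runs in which it failed.

    `runs` is a list with one set of failed exponents per self-test run.
    Exponents that never failed do not appear in the result.
    """
    counts = Counter(exp for failed in runs for exp in failed)
    return {exp: n / len(runs) for exp, n in counts.items()}


def always_failing(runs):
    """Exponents that failed in every single run, sorted for stable output."""
    return sorted(exp for exp, rate in failure_rates(runs).items() if rate == 1.0)


# Made-up toy input: four runs with placeholder exponent labels.
toy_runs = [{"M_a", "M_b"}, {"M_a"}, {"M_a", "M_c"}, {"M_a", "M_b"}]

if __name__ == "__main__":
    for exp, rate in sorted(failure_rates(toy_runs).items()):
        print(f"{exp}: failed in {rate:.0%} of runs")
    print("always failing:", always_failing(toy_runs))
```

Splitting out the always-failing exponents is useful because those are the easiest ones to trace: they reproduce on every attempt.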
Old 2016-07-31, 22:26   #54
airsquirrels
"David"
Jul 2015
Ohio

11×47 Posts

More information: out of 300 iterations of the 32k tests there are only 152 exponents that ever fail, and there are a handful of exponents that fail 100% of the time.

M300050761
M30568231
M45448679
M45588523
M49346867
M52031087
M599501681
M67094119
M71065531
M71115521
M72067427
M74697017

The rest fail some percentage of the time between 1 and 100, with an even distribution. From the trace of M300050761 I was able to isolate that the failure appears to be with the OpenCL barriers in the GPU sieve: under the failing AMDGPU driver, sieving is still happening while the TF is going on, and the bit count of sieved candidates varies per execution instead of showing the correct value you get when the sieve completes properly before the TF step. Enabling level 5 trace logging on the sieve kernel introduces enough of a delay to make all the self tests pass. I'm looking into the sieve barriers to see if there is an easy workaround.
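
To illustrate the ordering issue described above, here is a toy sketch in Python. It is not mfakto's kernel code, and it sieves for ordinary primes rather than factor candidates. CPU threads and `threading.Barrier` stand in for GPU work-items and an OpenCL barrier, and the function name is made up. The point is only that the count must not start until every worker has finished sieving.

```python
import threading


def sieve_then_count(limit, n_workers=4):
    """Toy two-phase job: strike composites, then count the survivors.

    Each worker strikes multiples for its share of the small primes and
    then waits at a barrier. Only after all workers arrive does one of
    them count the remaining candidates. Without that wait, the count
    could run while others are still striking and vary from run to run.
    """
    small_primes = [p for p in range(2, int(limit ** 0.5) + 1)
                    if all(p % q for q in range(2, int(p ** 0.5) + 1))]
    is_candidate = bytearray([1]) * limit   # 1 = not struck yet
    is_candidate[0:2] = b"\x00\x00"         # 0 and 1 are not prime
    barrier = threading.Barrier(n_workers)
    result = {}

    def worker(wid):
        # Phase 1: sieve with this worker's slice of the primes.
        for p in small_primes[wid::n_workers]:
            for multiple in range(p * p, limit, p):
                is_candidate[multiple] = 0
        # All workers must finish phase 1 before phase 2 starts.
        barrier.wait()
        # Phase 2: one worker reads the now-complete sieve.
        if wid == 0:
            result["count"] = sum(is_candidate)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["count"]


if __name__ == "__main__":
    print(sieve_then_count(1000))
```

If the `barrier.wait()` line is removed, worker 0 may count before the others are done, which is the same shape of symptom as the varying bit count seen in the trace.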

Last fiddled with by airsquirrels on 2016-07-31 at 22:27

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
