mersenneforum.org  

2019-01-12, 17:02   #12
tServo
"Marv"
May 2009
near the Tannhäuser Gate

3·269 Posts

Quote:
Originally Posted by M344587487
Fingers crossed. If it has the full 1:2 ratio, does that mean we can potentially saturate the memory at lower core clocks, or even do TF with the extra headroom at higher clocks? I wonder if it's possible to assign some CUs to gpuowl and others to mfakto; is SR-IOV or an equivalent needed for that? I doubt SR-IOV would make it to the consumer version.
From some guy on TWITTER !?!?!?!?!?!?

From Anandtech:

"on paper the new card only has a 9% compute throughput advantage. So it’s not on compute throughput where Radeon VII’s real winning charm lies"

Last fiddled with by tServo on 2019-01-12 at 17:04
2019-01-12, 17:46   #13
nomead
"Sam Laur"
Dec 2018
Turku, Finland

317 Posts

Quote:
Originally Posted by tServo
From some guy on TWITTER !?!?!?!?!?!?
"Some guy on Twitter" = Editor in Chief for Anandtech...

And the quote refers to FP32 performance. Later on in the same article though,
"The Vega 20 GPU does bring new compute features – particularly much higher FP64 compute throughput and new low-precision modes well-suited for neural network inferencing – but these features aren’t something consumers are likely to use."

Last fiddled with by nomead on 2019-01-12 at 17:54 Reason: added quote
2019-01-14, 09:49   #14
M344587487
"Composite as Heck"
Oct 2017

1110110110₂ Posts

https://techgage.com/news/radeon-vii...4-performance/


https://www.hardocp.com/article/2019...tt_herkelman/2
2019-01-14, 12:14   #15
nomead
"Sam Laur"
Dec 2018
Turku, Finland

317 Posts

Awww... That's it then, unfortunately my interest stopped right there.
2019-01-14, 19:51   #16
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/

C70₁₆ Posts

We'll have to see what it turns out to be. Ryan Smith specifically asked about it.

https://www.reddit.com/r/Amd/comment...apped/ee1jr5k/
2019-01-17, 00:19   #17
mackerel
Feb 2016
UK

2⁶·7 Posts

https://twitter.com/RyanSmithAT/stat...80805802733568

He's got the answer back, it's 1:8 rate.
2019-01-17, 02:23   #18
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/

2⁴×199 Posts

Quote:
Originally Posted by mackerel
https://twitter.com/RyanSmithAT/stat...80805802733568

He's got the answer back, it's 1:8 rate.
Still a shame it's so crippled.
2019-01-17, 09:39   #19
preda
"Mihai Preda"
Apr 2015

2²×3×11² Posts

Quote:
Originally Posted by mackerel
https://twitter.com/RyanSmithAT/stat...80805802733568

He's got the answer back, it's 1:8 rate.
Interesting, that's double the DP rate of "classic Vega" (Vega64, Vega56). While a bit disappointing compared to 1:2 DP, it may still be a good improvement in PRP, especially matched with the higher-bandwidth RAM.
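
As a rough illustration of the "double the DP rate" remark, the sketch below estimates peak FP64 throughput from shader count, clock, and the FP64:FP32 ratio. The shader counts and boost clocks are assumptions taken from public spec sheets rather than from this thread, and the 1:16 ratio for Vega 64 is inferred from the "double" comparison with the reported 1:8.

```python
def peak_fp64_tflops(stream_processors, clock_mhz, dp_ratio):
    """Peak FP64 TFLOPS, counting one FMA (2 FLOPs) per lane per clock."""
    fp32_tflops = 2 * stream_processors * clock_mhz * 1e6 / 1e12
    return fp32_tflops * dp_ratio


# Assumed specs (public spec sheets, not stated in the thread).
vega64 = peak_fp64_tflops(4096, 1546, 1 / 16)
radeon_vii = peak_fp64_tflops(3840, 1800, 1 / 8)

print(f"Vega 64:    {vega64:.2f} TFLOPS FP64")
print(f"Radeon VII: {radeon_vii:.2f} TFLOPS FP64")
print(f"ratio:      {radeon_vii / vega64:.2f}x")
```

These are paper numbers at boost clock only; sustained clocks under a PRP load will differ, so the ratio is an upper-level estimate rather than a prediction.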
2019-01-17, 09:57   #20
M344587487
"Composite as Heck"
Oct 2017

2×5²×19 Posts

Quote:
Originally Posted by preda
Interesting, that's double the DP rate of "classic Vega" (Vega64, Vega56). While a bit disappointing compared to 1:2 DP, it may still be a good improvement in PRP, especially matched with the higher-bandwidth RAM.
Am I right in thinking that DP rate is the bottleneck for Vega 64 but that memory bandwidth comes a close second? Is it as simple as saying that for R7 to roughly match 2x Vega 64 throughput at the same clocks, it needed both double the DP rate and double the bandwidth (ignoring the 4 CU difference)? Are there any potential bottlenecks other than those two? Beyond "higher is better", I don't know how the specs translate into performance.
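
One simplified way to frame this question is a roofline-style model, where an iteration takes as long as its slower resource (compute or memory). The sketch below is only that model with made-up workload numbers (compute slightly slower than memory, i.e. memory as a "close second"); it assumes perfect overlap of compute and memory and ignores every other bottleneck.

```python
def iteration_time(flops, bytes_moved, flops_per_s, bytes_per_s):
    """Time is set by whichever resource takes longer (compute or memory)."""
    return max(flops / flops_per_s, bytes_moved / bytes_per_s)


# Hypothetical workload: compute needs 1.0 time units, memory 0.9.
FLOPS, BYTES = 1.0, 0.9
base = iteration_time(FLOPS, BYTES, 1.0, 1.0)

speedup_dp_only = base / iteration_time(FLOPS, BYTES, 2.0, 1.0)
speedup_bw_only = base / iteration_time(FLOPS, BYTES, 1.0, 2.0)
speedup_both = base / iteration_time(FLOPS, BYTES, 2.0, 2.0)

print(f"double DP rate only:   {speedup_dp_only:.2f}x")
print(f"double bandwidth only: {speedup_bw_only:.2f}x")
print(f"double both:           {speedup_both:.2f}x")
```

Under these assumed numbers, doubling only one of the two barely helps, because the other resource immediately becomes the limit; only doubling both gives the full 2x.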
2019-01-17, 10:48   #21
preda
"Mihai Preda"
Apr 2015

2²·3·11² Posts

Quote:
Originally Posted by M344587487
Am I right in thinking that DP rate is the bottleneck for Vega 64 but that memory bandwidth comes a close second? Is it as simple as saying that for R7 to roughly match 2x Vega 64 throughput at the same clocks, it needed both double the DP rate and double the bandwidth (ignoring the 4 CU difference)? Are there any potential bottlenecks other than those two? Beyond "higher is better", I don't know how the specs translate into performance.
2x would be amazing. In practice I would be very happy if I see a 50% speedup.

About memory, it is my impression that the latency did not improve much, but the bandwidth doubled. But to take advantage of this, better occupancy would be required (double the number of memory operations in flight), and this is not easily achievable because of other limiting resources: LDS memory and the number of registers (VGPRs), which I guess remain unchanged.
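
The "double the number of memory operations in flight" point follows from Little's law: data in flight equals throughput times latency. The sketch below works that out; the bandwidth figures are assumed from public spec sheets and the 500 ns latency is purely hypothetical (the only thing that matters here is that it is the same on both cards).

```python
def bytes_in_flight(bandwidth_gb_s, latency_ns):
    """Little's law: data in flight = throughput x latency.

    GB/s times ns gives bytes directly, since 1e9 * 1e-9 = 1.
    """
    return bandwidth_gb_s * latency_ns


LATENCY_NS = 500  # hypothetical, assumed equal on both cards

vega64_bytes = bytes_in_flight(484, LATENCY_NS)       # assumed ~484 GB/s
radeon_vii_bytes = bytes_in_flight(1024, LATENCY_NS)  # assumed ~1 TB/s

print(f"Vega 64:    {vega64_bytes} bytes in flight to saturate memory")
print(f"Radeon VII: {radeon_vii_bytes} bytes in flight to saturate memory")
```

With latency unchanged, roughly twice as many bytes must be outstanding to use the doubled bandwidth, which is why occupancy limits such as LDS and VGPRs matter.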

About compute, the parts that aren't DP (e.g. pointer arithmetic, other integer e.g. carry, logic) remain unchanged, and this will reduce the observed speedup.
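
This is Amdahl's law applied to the DP portion of the work. A minimal sketch, where the fraction of time spent in DP arithmetic is a hypothetical input (the thread does not give a measured value for gpuowl):

```python
def amdahl_speedup(dp_fraction, dp_speedup):
    """Overall speedup when only the DP fraction of the runtime gets faster."""
    if not 0.0 <= dp_fraction <= 1.0:
        raise ValueError("dp_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - dp_fraction) + dp_fraction / dp_speedup)


# Hypothetical DP fractions; the DP speedup of 2 matches the 1:16 -> 1:8 change.
for frac in (0.5, 0.7, 0.9):
    print(f"DP fraction {frac:.0%}: {amdahl_speedup(frac, 2.0):.2f}x overall")
```

The integer and pointer work that does not speed up caps the gain well below 2x unless nearly all of the runtime is DP arithmetic.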

IMO another limiting factor for GCN performance is still the compiler, after so many years: the compiler does a rather poor job at generating highly efficient code (not an easy task I agree).

OTOH the better cooling will help and allow the card to be clocked higher without thermal throttling (which is a problem with the Vega 64 blower cooler).

Last fiddled with by preda on 2019-01-17 at 10:51
2019-01-17, 11:37   #22
SELROC

2⁵·7·29 Posts

Quote:
Originally Posted by preda
2x would be amazing. In practice I would be very happy if I see a 50% speedup.

About memory, it is my impression that the latency did not improve much, but the bandwidth doubled. But to take advantage of this, better occupancy would be required (double the number of memory operations in flight), and this is not easily achievable because of other limiting resources: LDS memory and the number of registers (VGPRs), which I guess remain unchanged.

About compute, the parts that aren't DP (e.g. pointer arithmetic, other integer e.g. carry, logic) remain unchanged, and this will reduce the observed speedup.

IMO another limiting factor for GCN performance is still the compiler, after so many years: the compiler does a rather poor job at generating highly efficient code (not an easy task I agree).

OTOH the better cooling will help and allow the card to be clocked higher without thermal throttling (which is a problem with the Vega 64 blower cooler).

I am putting off buying a new, more powerful GPU. Do you have any plans to optimize gpuowl for large numbers?
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
