mersenneforum.org  

Old 2019-01-09, 19:00   #1
M344587487
 
"Composite as Heck"
Oct 2017

2·349 Posts
Default Radeon VII (2nd gen consumer Vega GPU)

They just announced this in the CES keynote. Pertinent bullet points:
  • 7nm process
  • 1TB/s memory bandwidth
  • 16GB HBM2
  • Slide shows +62% OpenCL performance over Vega64, whatever that means
  • RRP of $699
  • ETA February 7th
  • 60 CUs, so it looks to be a cut-down MI50
That's over twice the memory bandwidth of Vega64. If the bandwidth can be saturated that's twice the performance of a Vega64 for roughly the same (current) price as 2xVega64, at what is hopefully a much lower power consumption than 2xVega 64. Does that analysis sound about right?
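
As a rough sanity check on the arithmetic above, here is a minimal Python sketch. The 1 TB/s and $699 figures come from the announcement, and roughly 484 GB/s is the published Vega 64 bandwidth. The $350 Vega 64 street price is purely an assumed placeholder, and so is the premise that throughput scales linearly with memory bandwidth.

```python
# Back-of-the-envelope comparison for a purely bandwidth-bound workload.
RADEON_VII_BW_GBS = 1000.0   # 1 TB/s, from the CES announcement
VEGA64_BW_GBS = 484.0        # published Vega 64 spec, approximate
RADEON_VII_PRICE = 699.0     # RRP from the announcement
VEGA64_PRICE = 350.0         # ASSUMPTION: placeholder street price


def bandwidth_ratio(new_bw, old_bw):
    """How many times more memory bandwidth the new card has."""
    return new_bw / old_bw


def bandwidth_per_dollar(bw_gbs, price):
    """GB/s of memory bandwidth bought per dollar spent."""
    return bw_gbs / price


ratio = bandwidth_ratio(RADEON_VII_BW_GBS, VEGA64_BW_GBS)
print(f"Bandwidth ratio vs one Vega 64: {ratio:.2f}x")
print(f"Radeon VII: {bandwidth_per_dollar(RADEON_VII_BW_GBS, RADEON_VII_PRICE):.2f} GB/s per $")
print(f"Vega 64:    {bandwidth_per_dollar(VEGA64_BW_GBS, VEGA64_PRICE):.2f} GB/s per $")
```

Swap in a real street price for VEGA64_PRICE to redo the comparison. Power draw is not modelled here.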

GW: See post #76 and #195 for quick-start on setting up gpuowl under Linux.

Last fiddled with by Prime95 on 2020-01-14 at 03:18
Old 2019-01-09, 21:47   #2
mackerel
 
Feb 2016
UK

3·131 Posts
Default

The big display behind Lisa said 25% more performance at same power - in what? gaming? While it loses some CUs vs Vega, it gains in clock more than offsetting it. I guess the question then is, how is a particular workload affected by bandwidth?

My guess, whatever efficiency benefits they got from process, they spent on clock. So maybe they'll stick to similar board power for the overall higher absolute performance.
Old 2019-01-09, 22:35   #3
xx005fs
 
"Eric"
Jan 2018
USA

211 Posts
Default

Quote:
Originally Posted by mackerel View Post
The big display behind Lisa said 25% more performance at same power - in what? gaming? While it loses some CUs vs Vega, it gains in clock more than offsetting it. I guess the question then is, how is a particular workload affected by bandwidth?

My guess, whatever efficiency benefits they got from process, they spent on clock. So maybe they'll stick to similar board power for the overall higher absolute performance.
I would assume 25-40% better in gaming, depending on the game. As long as they kept the 1/2-rate DP of the MI50, it should be the best card on the market for PRP/LL. It's also gonna be the best value indefinitely, because even if V100s beat it, they cost so much more that they aren't worth it.

Last fiddled with by xx005fs on 2019-01-09 at 22:36
Old 2019-01-09, 23:32   #4
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1001001100101₂ Posts
Default

Quote:
Originally Posted by M344587487 View Post
They just announced this in the CES keynote. Pertinent bullet points:
  • 7nm process
  • 1TB/s memory bandwidth
  • 16GB HBM2
  • Slide shows +62% OpenCL performance over Vega64, whatever that means
  • RRP of $699
  • ETA February 7th
  • 60 CUs, so it looks to be a cut-down MI50
That's over twice the memory bandwidth of Vega64. If the bandwidth can be saturated that's twice the performance of a Vega64 for roughly the same (current) price as 2xVega64, at what is hopefully a much lower power consumption than 2xVega 64. Does that analysis sound about right?
Interesting!
Any power dissipation numbers?
A dual-slot-width card?
Does it require pcie 3.0?
Old 2019-01-09, 23:46   #5
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·31·47 Posts
Default

Quote:
Originally Posted by xx005fs View Post
I would assume 25-40% better in gaming, depending on the game. As long as they kept the 1/2-rate DP of the MI50, it should be the best card on the market for PRP/LL. It's also gonna be the best value indefinitely, because even if V100s beat it, they cost so much more that they aren't worth it.
At Ars Technica they say the Vega 20 GPU is a die shrink of the Vega 10 GPU found in the Vega 64, so it's probably 1:16.
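
To put numbers on the two DP rates being debated (1:2 versus 1:16), here is a minimal Python sketch of peak FP64 throughput. The 60 CU count is from the announcement and 64 lanes per CU is standard for GCN. The 1.8 GHz clock is an assumption, since no clock has been quoted in this thread, so treat the results as ballpark figures only.

```python
def peak_fp64_tflops(cus, clock_ghz, fp64_ratio, lanes_per_cu=64):
    """Theoretical peak FP64 TFLOPS for a GCN-style GPU.

    Each lane is assumed to retire one fused multiply-add (2 FLOPs)
    per clock in FP32; FP64 runs at fp64_ratio of that rate.
    """
    fp32_gflops = cus * lanes_per_cu * 2 * clock_ghz
    return fp32_gflops * fp64_ratio / 1000.0


CUS = 60          # from the CES announcement
CLOCK_GHZ = 1.8   # ASSUMPTION: clock not stated in this thread

for label, ratio in (("1:2", 1 / 2), ("1:16", 1 / 16)):
    tflops = peak_fp64_tflops(CUS, CLOCK_GHZ, ratio)
    print(f"FP64 at {label}: {tflops:.2f} TFLOPS peak")
```

The gap between the two cases is a factor of 8 at any clock, which is why the ratio matters so much more than the exact clock for LL/PRP.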
Old 2019-01-10, 01:37   #6
tServo
 
"Marv"
May 2009
near the Tannhäuser Gate

3×181 Posts
Default

Quote:
Originally Posted by kriesel View Post
Interesting!
Any power dissipation numbers?
A dual-slot-width card?
Does it require pcie 3.0?
AnandTech estimates 300W power.
Dual width, with 3 fans that exhaust heat inside the case.
It looks like it is taller than the I/O bracket.
I've never seen any card that requires PCIe 3.0.

http://www.anandtech.com/show/13832/...ry-7th-for-699
Old 2019-01-10, 01:48   #7
xx005fs
 
"Eric"
Jan 2018
USA

323₈ Posts
Default

Quote:
Originally Posted by Mark Rose View Post
At Ars Technica they say the Vega 20 GPU is a die shrink of the Vega 10 GPU found in the Vega 64, so it's probably 1:16.
It's a die shrink of Vega indeed. However, it is the same GPU as the MI50, which has 1/2-rate DP, and unless AMD cut that feature on the consumer variant, it should be the king of LL/PRP.
Old 2019-01-10, 06:13   #8
M344587487
 
"Composite as Heck"
Oct 2017

698₁₀ Posts
Default

Quote:
Originally Posted by kriesel View Post
...
Does it require pcie 3.0?
It's a GFX9 card so no: https://github.com/RadeonOpenCompute/ROCm
Quote:
Originally Posted by ROCm git readme
As described above, GFX8 GPUs require PCIe 3.0 with PCIe atomics in order to run ROCm. In particular, the CPU and every active PCIe point between the CPU and GPU require support for PCIe 3.0 and PCIe atomics. The CPU root must indicate PCIe AtomicOp Completion capabilities and any intermediate switch must indicate PCIe AtomicOp Routing capabilities.

...

Beginning with ROCm 1.8, GFX9 GPUs (such as Vega 10) no longer require PCIe atomics. We have similarly opened up more options for number of PCIe lanes. GFX9 GPUs can now be run on CPUs without PCIe atomics and on older PCIe generations, such as PCIe 2.0. This is not supported on GPUs below GFX9, e.g. GFX8 cards in the Fiji and Polaris families.
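
For anyone wanting to see which PCIe generation their slots are actually negotiating, here is a minimal Python sketch for Linux. It reads the `current_link_speed` attribute that the kernel exposes under `/sys/bus/pci/devices`. The helper names are made up for illustration, the exact string format varies between kernel versions, and this only reports link speed: it does not check the AtomicOp capabilities mentioned in the readme.

```python
import re
from pathlib import Path

# Per-lane transfer rate (GT/s) for each PCIe generation.
GT_PER_S_TO_GEN = {2.5: 1, 5.0: 2, 8.0: 3, 16.0: 4}


def pcie_generation(link_speed):
    """Map a string such as '8.0 GT/s PCIe' or '5 GT/s' to a generation.

    Returns None when the string cannot be parsed or the rate is unknown.
    """
    match = re.match(r"\s*([0-9]+(?:\.[0-9]+)?)\s*GT/s", link_speed)
    if match is None:
        return None
    return GT_PER_S_TO_GEN.get(float(match.group(1)))


def report_link_speeds(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: generation} for devices exposing a link speed."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}  # not Linux, or sysfs not mounted
    result = {}
    for dev in sorted(root.iterdir()):
        try:
            text = (dev / "current_link_speed").read_text()
        except OSError:
            continue  # device has no PCIe link attribute
        result[dev.name] = pcie_generation(text)
    return result


if __name__ == "__main__":
    for address, gen in report_link_speeds().items():
        print(address, "PCIe gen", gen)
```

Note that a link can train below the slot's maximum when the GPU is idle, so check it under load before drawing conclusions.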
Old 2019-01-10, 06:51   #9
SELROC
 

2×3×5×97 Posts
Default

Quote:
Originally Posted by M344587487 View Post

PCIe 1.0 slots are limited in speed, and this affects GEC speed. It is faster with PCIe 3.0.
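
To give a sense of scale for the difference between generations, here is a minimal Python sketch of theoretical PCIe bandwidth per direction. The transfer rates and line encodings are the standard ones for each generation; the function name is just for illustration, and real-world throughput will be lower once protocol overhead is included.

```python
# generation: (transfer rate in GT/s, payload bits, total bits per symbol)
PCIE_GENERATIONS = {
    1: (2.5, 8, 10),     # 8b/10b encoding
    2: (5.0, 8, 10),     # 8b/10b encoding
    3: (8.0, 128, 130),  # 128b/130b encoding
}


def pcie_bandwidth_gbs(generation, lanes):
    """Theoretical one-direction bandwidth in GB/s, ignoring packet overhead."""
    rate, payload, total = PCIE_GENERATIONS[generation]
    # GT/s times encoding efficiency gives usable Gbit/s per lane;
    # divide by 8 for GB/s, then scale by lane count.
    return rate * payload / total / 8 * lanes


for gen in sorted(PCIE_GENERATIONS):
    for lanes in (1, 16):
        print(f"PCIe {gen}.0 x{lanes}: {pcie_bandwidth_gbs(gen, lanes):.2f} GB/s")
```

How much of this a given workload actually needs is a separate question; the sketch only shows the ceiling each slot type imposes.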
Old 2019-01-12, 12:48   #10
nomead
 
"Sam Laur"
Dec 2018
Turku, Finland

23×41 Posts
Default

https://twitter.com/RyanSmithAT/stat...59608371175424

"FP64 is not among the couple of features they dialed back for the consumer card."
So if this is indeed true, that gets me a bit excited.
Old 2019-01-12, 13:17   #11
M344587487
 
"Composite as Heck"
Oct 2017

2·349 Posts
Default

Quote:
Originally Posted by nomead View Post
https://twitter.com/RyanSmithAT/stat...59608371175424

"FP64 is not among the couple of features they dialed back for the consumer card."
So if this is indeed true, that gets me a bit excited.
Fingers crossed. If it has the full 1:2 ratio, does that mean we could saturate the memory at lower core clocks, or even do TF with the extra headroom at higher clocks? I wonder if it's possible to assign some CUs to gpuowl and others to mfakto; is SR-IOV or something equivalent needed for that? I doubt SR-IOV will make it to the consumer version.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.