2018-11-13, 22:51  #12
"Composite as Heck"
Oct 2017
317_{16} Posts 
Quote:
Large L4 cache is the rumour, 256MiB if I remember rightly, but there's no confirmation. It is a massive die, which people are speculating could be designed modularly enough that the IO die for Ryzen will be essentially a quarter corner of Epyc's. The cache, if it exists, could be in 64MiB chunks for this reason. L4 cache probably exists in some capacity, but I have no idea what realistic power, capacity or bandwidth estimates would look like. It's said that GPU-GPU IF bandwidth is 100GB/s ( https://www.anandtech.com/show/13578...rkpapermaster ). If it's the same deal here (a big if; common sense says it's probably lower), that gives us an upper bound of 100GB/s per 8-core chiplet (so an upper bound of 800GB/s for an 8-chiplet Epyc). That's probably way off, so take it with a pinch of salt. They'll probably confirm the details in January; that's the next details milestone AFAIK, and cache may be the big reveal.

2018-11-14, 00:08  #13
Feb 2016
UK
3×139 Posts 
https://www.anandtech.com/show/13598...awkat235ghz
This deployment lists their Rome CPUs running at 2.35GHz. The numbers seem to work out for peak FLOPS too: 640000 (cores) * 2.35 (GHz) * 16 (two 256-bit AVX units doing FMA) = 24.064 PF. Does it work that way? I can't get the RAM to add up exactly; it is near enough 64GB/CPU, but the difference doesn't seem to be down to GiB/GB conversion. Have they declared how many RAM channels it supports? If the rumours about a fat L4 come true, that should help a LOT with memory bandwidth. I loved it in the desktop Broadwell CPUs, and would love to see it revisited with a current CPU.
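For anyone who wants to check the arithmetic, here is a rough sketch of the peak FLOPS estimate above. It assumes double precision and counts an FMA as two floating-point operations, which is where the factor of 16 per cycle would come from. The core count and clock are the figures quoted in the post; the variable names are just for illustration.

```python
# Peak double-precision FLOPS estimate from the figures quoted in the post.
CORES = 640_000        # total cores quoted for the deployment
CLOCK_HZ = 2.35e9      # 2.35 GHz

# Assumptions: two 256-bit FMA units per core, 64-bit doubles,
# and one FMA counted as 2 operations (multiply + add).
FMA_UNITS = 2
SIMD_BITS = 256
DOUBLE_BITS = 64
FLOPS_PER_FMA = 2

lanes = SIMD_BITS // DOUBLE_BITS                      # doubles per vector
flops_per_cycle = FMA_UNITS * lanes * FLOPS_PER_FMA   # per core, per cycle

peak_flops = CORES * CLOCK_HZ * flops_per_cycle
peak_pflops = peak_flops / 1e15

print(f"{flops_per_cycle} FLOP/cycle/core, {peak_pflops:.3f} PFLOPS peak")
```

This is a theoretical ceiling only; sustained figures depend on the workload and on memory bandwidth.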
2018-11-14, 09:44  #14
"Composite as Heck"
Oct 2017
7×113 Posts 
Epyc supports 8 channels, Threadripper 4 and Ryzen 2, as zen2 and zen3 use the same sockets as zen. They indicate that they'll even use the same sockets for zen4 and only change socket for zen5, to support DDR5 and whatever else is current, but I'm skeptical as to how far they can push CPU designs without upping the memory channel count. Maybe DDR4 4000 CL16 or thereabouts will be a common thing when zen3 is around, but even that is not a massive upgrade over what we have now.
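To put rough numbers on the channel counts, here is a minimal sketch of theoretical peak DRAM bandwidth. It assumes the standard 64-bit (8-byte) DDR4 channel width; the DDR4-3200 baseline is a hypothetical stand-in for "what we have now" and is not a figure from the thread.

```python
# Theoretical peak DRAM bandwidth = channels * transfers per second * bytes per transfer.
BYTES_PER_TRANSFER = 8  # one DDR4 channel is 64 bits wide


def peak_bandwidth_gbs(channels: int, mega_transfers: int) -> float:
    """Peak bandwidth in GB/s for a given channel count and MT/s rating."""
    return channels * mega_transfers * BYTES_PER_TRANSFER / 1000


# Channel counts as stated in the post.
CHANNELS = {"Ryzen": 2, "Threadripper": 4, "Epyc": 8}

# 3200 is an assumed baseline for illustration; 4000 is the speed mentioned above.
for name, channels in CHANNELS.items():
    base = peak_bandwidth_gbs(channels, 3200)
    fast = peak_bandwidth_gbs(channels, 4000)
    print(f"{name}: {base:.1f} GB/s at DDR4-3200, {fast:.1f} GB/s at DDR4-4000")
```

Under these assumptions the step from 3200 to 4000 is a 25% gain per socket, whereas doubling the channel count would be a 100% gain, which seems to be the point about channel count mattering more.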

2018-11-19, 02:56  #15
"/X\(‘‘)/X\"
Jan 2013
101101110011_{2} Posts 

2018-11-19, 03:20  #16
Romulan Interpreter
Jun 2011
Thailand
22252_{8} Posts 
That is amazing news... First the FPU enhancement "to catch up with Intel", then the GPU and FPGA "on die" options. If what they say is what we will get, this almost makes us willing to try to switch back to AMD, after... (how long?) 21 years of Intel... hehe...
However, "half power for the same performance or 1.25 performance for the same power" does not seem quite efficient with heat dissipation... If you can achieve "half power for the same performance", then you put two of them in the box and (allowing for some thermal inconvenience) you would be able to get at least 1.85 performance for the same power... Otherwise something is fishy with the design... Last fiddled with by LaurV on 2018-11-19 at 03:26
2018-11-19, 09:04  #17
Undefined
"The unspeakable one"
Jun 2006
My evil lair
17EF_{16} Posts 
I'm fairly sure those figures are for the 7nm node, not for the final dice (dies?). Since power is proportional to f^{2+}v, and v goes up in proportion to f, then those figures seem within the right range.

2018-11-19, 10:00  #18
Feb 2016
UK
417_{10} Posts 
Quote:
I've always wondered how power usage scales. To my thinking, it should be proportional to frequency, and proportional to voltage squared, if the semiconductors behave like a resistive load. It is the last part I am less sure about. I suspect there is additional nonlinearity with voltage because the load is a semiconductor.
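As a rough sanity check on the "half power for the same performance or 1.25 performance for the same power" figures, here is a sketch using the usual first-order dynamic power model (power proportional to voltage squared times frequency). It assumes voltage has to rise linearly with frequency, as suggested earlier in the thread, and it ignores leakage and any other nonlinearity. The function name is hypothetical.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float) -> float:
    """Dynamic power relative to a baseline, using P proportional to V^2 * f."""
    return voltage_ratio ** 2 * freq_ratio


# Claimed node gain: half the power at the same performance.
NODE_POWER_AT_SAME_PERF = 0.5

# Assumption: voltage scales linearly with frequency, so power goes as f^3.
boost = 1.25  # the claimed performance gain at the same power
power_at_boost = NODE_POWER_AT_SAME_PERF * relative_dynamic_power(boost, boost)

print(f"Power at {boost}x frequency, relative to the old part: {power_at_boost:.3f}")
```

Under these assumptions the two marketing figures come out roughly consistent with each other: pushing the clock 25% higher costs about twice the power, which eats the factor-of-two saving.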

2018-11-19, 16:20  #19
Undefined
"The unspeakable one"
Jun 2006
My evil lair
11×557 Posts 
Quote:


2018-11-20, 09:17  #20
Romulan Interpreter
Jun 2011
Thailand
10010010101010_{2} Posts 
The statement has nothing to do with voltage or current. Yes, power is voltage times current, and from Ohm's law, current is voltage over resistance, which makes the power the voltage squared divided by the resistance.
But all that is a non sequitur. You have a box that takes a watt to do some work. If you get a new box able to do the same work for half a watt, then you can take two of the new boxes and have double the amount of work done for that watt. This assumes the boxes are totally independent and also that the work (tasks) they do is independent. In reality they are not, and using one of the boxes influences the environment of the other (heat, vibration, smog, whatever; here we talk mainly about heat). If you put two boxes together and they only produce 1.25 of the work, then the remaining 0.75 is lost, which translates into low efficiency resulting from combining the two boxes together. Last fiddled with by LaurV on 2018-11-20 at 09:21
2018-11-20, 09:31  #21
Undefined
"The unspeakable one"
Jun 2006
My evil lair
11×557 Posts 
Quote:
That isn't the same as simply doubling the number of transistors for twice the power and twice the throughput (but not twice the frequency). That is a whole different problem because now you have to split your workload into two pieces and not all workloads can be parallelised. Last fiddled with by retina on 2018-11-20 at 09:31
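The point about workloads that cannot be fully split can be illustrated with Amdahl's law, which nobody in the thread names but which is the usual way to put a number on it. This is only a sketch: the parallel fractions below are made-up examples, and it ignores the thermal coupling between the two "boxes" discussed above.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Speedup from spreading the parallel part of a workload over `units` units."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


# Hypothetical parallel fractions, with two "boxes" as in the discussion.
for fraction in (1.0, 0.9, 0.5):
    speedup = amdahl_speedup(fraction, 2)
    print(f"parallel fraction {fraction:.1f}: {speedup:.3f}x with 2 units")
```

Only a fully parallel workload gets the full doubling from two units; anything with a serial portion falls short of it.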

2018-11-20, 11:36  #22
Jun 2003
11×449 Posts 
Only if your algorithm is parallelizable. You've been doing too much GIMPS - everything now looks like distributed computing to you.
