mersenneforum.org  

2018-11-13, 22:51   #12
M344587487
"Composite as Heck"
Oct 2017
317₁₆ Posts

Quote:
Originally Posted by xx005fs
I saw the image of the delidded Epyc 7nm 64-core part, in which the central I/O die seems massive. Could there possibly be a really fast L4 cache that's a decent size and with very high bandwidth (i.e. much higher than the roughly 150GB/s from 8-channel DDR4-2666 memory)?

A large L4 cache is the rumour, 256MiB if I remember rightly, but there's no confirmation. It is a massive die, which people are speculating could be designed modularly enough that the I/O die for Ryzen will essentially be one quarter of Epyc's. The cache, if it exists, could be in 64MiB chunks for this reason. L4 cache probably exists in some capacity, but I have no idea what realistic power, capacity or bandwidth estimates would look like. It's said that GPU-GPU IF bandwidth is 100GB/s ( https://www.anandtech.com/show/13578...rk-papermaster ). If it's the same deal here (a big if, common sense says it's probably lower), that gives us an upper bound of 100GB/s per 8-core chiplet (so an upper bound of 800GB/s for an 8-chiplet Epyc). That's probably way off, so take it with a pinch of salt. They'll probably confirm the details in January; that's the next details milestone AFAIK, and cache may be the big reveal.
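For anyone who wants to play with the numbers, here is a rough sketch of the bandwidth arithmetic in this post and the one it quotes. It assumes a standard 64-bit (8-byte) DDR4 channel and computes the theoretical DRAM peak, which is somewhat higher than the "around 150GB/s" figure quoted above; the 100GB/s-per-chiplet fabric figure is the speculative upper bound from the post, not a confirmed spec, and the function names are made up for illustration.

```python
# Back-of-envelope bandwidth figures. All names here are illustrative only.

def ddr_peak_gbs(mega_transfers_per_s: float, channels: int,
                 bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second).

    Assumes each channel is 64 bits wide, i.e. 8 bytes per transfer.
    """
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


def fabric_upper_bound_gbs(per_chiplet_gbs: float, chiplets: int) -> float:
    """Aggregate upper bound if every chiplet had its own link of the given speed."""
    return per_chiplet_gbs * chiplets


# 8-channel DDR4-2666, as in the quoted question.
dram_peak = ddr_peak_gbs(2666, channels=8)

# Speculative: 100 GB/s per chiplet (the GPU-GPU IF figure), 8 chiplets.
fabric_bound = fabric_upper_bound_gbs(100, chiplets=8)

print(f"DDR4-2666 x8 theoretical peak: {dram_peak:.1f} GB/s")
print(f"Speculative fabric upper bound: {fabric_bound:.0f} GB/s")
print(f"Ratio: {fabric_bound / dram_peak:.1f}x")
```

Measured DRAM bandwidth always lands below the theoretical peak, so the gap to the speculative fabric figure would be a bit larger in practice.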
2018-11-14, 00:08   #13
mackerel
Feb 2016
UK
3×139 Posts

https://www.anandtech.com/show/13598...awk-at-235-ghz

This deployment lists their Rome CPUs running at 2.35GHz. The numbers seem to work out for peak FLOPS too. 640000 (cores) * 2.35 (GHz) * 16 (two 256-bit AVX units doing FMA) = 24.064 PF - does it work that way?
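A quick sketch to check that arithmetic. It assumes double precision, two 256-bit FMA units per core as stated above, and counts a fused multiply-add as two floating-point operations; the constant and function names are just for illustration.

```python
# Peak double-precision FLOPS estimate for the figures quoted above.

FMA_UNITS_PER_CORE = 2   # two 256-bit units, per the post
VECTOR_BITS = 256
DOUBLE_BITS = 64
FLOPS_PER_FMA = 2        # one multiply plus one add


def flops_per_cycle_per_core() -> int:
    """Double-precision FLOPs one core can retire per clock at peak."""
    lanes = VECTOR_BITS // DOUBLE_BITS          # 4 doubles per 256-bit vector
    return FMA_UNITS_PER_CORE * lanes * FLOPS_PER_FMA


def peak_flops(cores: int, clock_hz: float) -> float:
    """Theoretical peak FLOPS for the whole machine."""
    return cores * clock_hz * flops_per_cycle_per_core()


total = peak_flops(cores=640_000, clock_hz=2.35e9)
print(f"{flops_per_cycle_per_core()} FLOPs/cycle/core")
print(f"{total / 1e15:.3f} PFLOPS peak")
```

This is a theoretical ceiling only; it says nothing about sustained clocks under AVX load or what a real workload achieves.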

I can't get the RAM to add up exactly: it is near enough 64GB/CPU, but the difference doesn't seem to be a GiB-GB conversion. Have they declared how many RAM channels it supports?

If the rumours about a fat L4 come true, that should help a LOT with memory bandwidth. I loved it in the desktop Broadwell CPUs, and would love it revisited with a current CPU.
2018-11-14, 09:44   #14
M344587487
"Composite as Heck"
Oct 2017
7×113 Posts

Epyc supports 8 channels, Threadripper 4 and Ryzen 2, as Zen 2 and Zen 3 are using the same sockets as Zen. They indicate that they'll even use the same sockets for Zen 4 and only change socket for Zen 5, to support DDR5 and whatever else is current, but I'm skeptical as to how far they can push CPU designs without upping the memory channel count. Maybe DDR4-4000 CL16 or thereabouts will be common when Zen 3 is around, but even that is not a massive upgrade over what we have now.
2018-11-19, 02:56   #15
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013
101101110011₂ Posts

https://fuse.wikichip.org/news/1815/...zen-2-details/
2018-11-19, 03:20   #16
LaurV
Romulan Interpreter
Jun 2011
Thailand
22252₈ Posts

That is amazing news... First the FPU enhancement "to catch up with Intel", then the GPU and FPGA "on die" options. If what they say is what we will get, this almost makes us willing to try to switch back to AMD, after... (how long?) 21 years of Intel... hehe...

However, "half power for the same performance or 1.25 performance for the same power" does not seem quite efficient with heat dissipation... If you can achieve "half power for the same performance", then you put two of them in the box and (allowing for some thermal inconvenience) you would be able to get at least 1.85 performance for the same power... Otherwise something is fishy with the design...

Last fiddled with by LaurV on 2018-11-19 at 03:26
2018-11-19, 09:04   #17
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
17EF₁₆ Posts

Quote:
Originally Posted by LaurV
However, "half power for the same performance or 1.25 performance for the same power" does not seem quite efficient with heat dissipation...
I'm fairly sure those figures are for the 7nm node, not for the final dice (dies?). Since power is proportional to f²+v, and v goes up in proportion to f, those figures seem within the right range.
2018-11-19, 10:00   #18
mackerel
Feb 2016
UK
417₁₀ Posts

Quote:
Originally Posted by LaurV
However, "half power for the same performance or 1.25 performance for the same power" does not seem quite efficient with heat dissipation... If you can achieve "half power for the same performance", then you put two of them in the box and (allowing for some thermal inconvenience) you would be able to get at least 1.85 performance for the same power... Otherwise something is fishy with the design...
The statement had me confused when I first saw it; then, in a different slide, one of the "powers" was replaced by "frequency". That kinda makes more sense to me: if the old process was running up to its limit, and that limit was raised on the new one, then significant gains could be seen like that. But it would probably only apply at a specific frequency.

Quote:
Originally Posted by retina
Since power is proportional to f²+v, and v goes up in proportion to f, those figures seem within the right range.
I've always wondered how power usage scales. To my thinking, it should be proportional to frequency, and proportional to voltage squared, if the semiconductors behave like a resistive load. It is the last part I was less sure about. I suspect there is additional non-linearity with voltage due to it being a semiconductor.
2018-11-19, 16:20   #19
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
11×557 Posts

Quote:
Originally Posted by mackerel
I've always wondered how power usage scales. To my thinking, it should be proportional to frequency, and proportional to voltage squared, if the semiconductors behave like a resistive load. It is the last part I was less sure about. I suspect there is additional non-linearity with voltage due to it being a semiconductor.
You might be right about that. v²+f.
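To put rough numbers on this exchange: the usual first-order model for CMOS switching power is P ≈ C·V²·f (a product of the terms rather than a sum), which matches the "proportional to frequency and to voltage squared" guess. The sketch below additionally assumes that voltage has to rise in direct proportion to frequency, as suggested earlier in the thread, and it ignores leakage and any change in capacitance. Those are simplifying assumptions, not process data, and the function names are made up for illustration.

```python
# First-order dynamic power model: P ~ C * V^2 * f (switching power only).

def relative_switching_power(freq_ratio: float, volt_ratio: float) -> float:
    """P_new / P_old for given frequency and voltage ratios, C unchanged."""
    return volt_ratio ** 2 * freq_ratio


def relative_power_if_voltage_tracks_frequency(freq_ratio: float) -> float:
    """Same, under the assumption V scales linearly with f (so P ~ f^3)."""
    return relative_switching_power(freq_ratio, volt_ratio=freq_ratio)


# Claim A from the slide: same performance at half the power.
node_power_ratio = 0.5

# Claim B: 1.25x performance at the same power. Under the cube-law
# assumption, pushing the clock up 1.25x costs about 1.95x the power...
boost_cost = relative_power_if_voltage_tracks_frequency(1.25)

# ...so starting from half power, the result is close to the old power.
power_vs_old_node = node_power_ratio * boost_cost

print(f"Power cost of 1.25x clock: {boost_cost:.3f}x")
print(f"New node at 1.25x clock vs old node: {power_vs_old_node:.3f}x power")
```

Under these assumptions the two marketing figures are roughly the same claim read at two different operating points, which fits the "within the right range" remark.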
2018-11-20, 09:17   #20
LaurV
Romulan Interpreter
Jun 2011
Thailand
10010010101010₂ Posts

The statement has nothing to do with voltage or current. Yes, the power is voltage times current, and from Ohm's law, current is voltage over resistance, which makes the power the voltage squared divided by the resistance.

But all that is a non sequitur.

You have a box that takes a watt to do some work. If you get a new box able to do the same work for half a watt, then you can take two of the new boxes and have double the amount of work done for that watt. This assumes the boxes are totally independent and also that the work (tasks) they do are independent. In reality they are not, and using one of the boxes influences the environment of the other (heat, vibration, smog, whatever). Here we talk mainly about heat: if you put two boxes together and they only produce 1.25 of the work, then the remaining 0.75 is lost, which translates into low efficiency resulting from combining the two boxes.

Last fiddled with by LaurV on 2018-11-20 at 09:21
2018-11-20, 09:31   #21
retina
Undefined
"The unspeakable one"
Jun 2006
My evil lair
11×557 Posts

Quote:
Originally Posted by LaurV
You have a box that takes a watt to do some work. If you get a new box able to do the same work for half a watt, then you can take two of the new boxes and have double the amount of work done for that watt. This assumes the boxes are totally independent and also that the work (tasks) they do are independent. In reality they are not, and using one of the boxes influences the environment of the other (heat, vibration, smog, whatever). Here we talk mainly about heat: if you put two boxes together and they only produce 1.25 of the work, then the remaining 0.75 is lost, which translates into low efficiency resulting from combining the two boxes.
But it isn't about the box. It is about the transistor. You can't make a single transistor do twice the work for only double the power, because the voltage has to be scaled up to cope with the higher frequency.

That isn't the same as simply doubling the number of transistors for twice the power and twice the throughput (but not twice the frequency). That is a whole different problem because now you have to split your workload into two pieces and not all workloads can be parallelised.

Last fiddled with by retina on 2018-11-20 at 09:31
2018-11-20, 11:36   #22
axn
Jun 2003
11×449 Posts

Quote:
Originally Posted by LaurV
You have a box that takes a watt to do some work. If you get a new box able to do the same work for half a watt, then you can take two of the new boxes and have double the amount of work done for that watt.
Only if your algorithm is parallelizable. You've been doing too much GIMPS - everything now looks like distributed computing to you.
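The parallelizability point can be illustrated with Amdahl's law, a standard model that nobody in this thread named explicitly: if a fraction p of the work can be spread over n units, the best-case speedup is 1 / ((1 - p) + p / n). The sketch below uses made-up parallel fractions purely for illustration; it does not model heat or any real chip.

```python
# Amdahl's law: best-case speedup when only part of the work parallelises.

def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Speedup over one unit when `parallel_fraction` of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


# Two "boxes" (or two halves of a chip) at various illustrative fractions.
for p in (1.0, 0.9, 0.4, 0.0):
    print(f"parallel fraction {p:.1f}: {amdahl_speedup(p, units=2):.3f}x with 2 units")
```

Only the fully parallel case (p = 1.0) gets the full doubling; independent GIMPS-style work units sit at that end, which is the joke in the reply above.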

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.