Joules per iteration (power efficiency)
I'm curious how my rig compares to others in terms of power efficiency.
- Haswell i5-4690k stock clock, -100mV undervolted
- 2x8GB DDR3-1600 CL 9-9-9-24 at 1.35V
- 80+Gold PSU (BeQuiet E9 400W)

mprime 29.3:
FFT 4096k: 85W/145.15it/s = 586mJ/it
FFT 4480k: 83W/130.50it/s = 636mJ/it |
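For anyone who wants to reproduce the metric: a watt is one joule per second, so dividing power by iterations per second gives joules per iteration. A minimal Python sketch follows; the function name is made up for illustration and only the figures come from the post above.

```python
def millijoules_per_iteration(watts: float, iterations_per_second: float) -> float:
    """Energy per iteration in mJ. W / (it/s) = J/it, times 1000 for mJ."""
    if iterations_per_second <= 0:
        raise ValueError("iterations_per_second must be positive")
    return 1000.0 * watts / iterations_per_second


# Figures quoted in the post above (i5-4690k, mprime 29.3).
measurements = [("4096k", 85.0, 145.15), ("4480k", 83.0, 130.50)]

for fft, watts, rate in measurements:
    print(f"FFT {fft}: {millijoules_per_iteration(watts, rate):.0f} mJ/it")
```

Lower is better: the number is the energy cost of one iteration at that FFT size.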
You'll get a lot more efficiency if you upgrade your RAM. Your CPU is memory bandwidth starved. I'd suggest DDR3-2400 minimum, but faster would be better.
You could also underclock to less than 3 GHz and undervolt significantly. Try it and see how your power efficiency improves. |
I've tried setting the RAM to 1866 (+16.6%) which required 1.5V instead of 1.35V. The performance increased by 6.8% but power consumption went up by 18%.
Those 6.8% aren't worth the risk of memory corruption from running out of spec, I guess. And getting new old DDR3 is just too pricey. |
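In joules-per-iteration terms, the RAM overclock described above can be checked with a little arithmetic: energy per iteration scales with power divided by throughput. A rough Python sketch, using only the two percentages from the post (the function name is hypothetical):

```python
def energy_per_iteration_change(perf_gain: float, power_gain: float) -> float:
    """Fractional change in energy per iteration.

    perf_gain and power_gain are fractional increases (0.068 means +6.8%).
    Energy per iteration is power / throughput, so the ratio of new to old
    is (1 + power_gain) / (1 + perf_gain).
    """
    return (1.0 + power_gain) / (1.0 + perf_gain) - 1.0


# +6.8% throughput for +18% power, as reported for DDR3-1866 at 1.5V.
change = energy_per_iteration_change(perf_gain=0.068, power_gain=0.18)
print(f"Energy per iteration changes by {change:+.1%}")
```

A positive result means each iteration costs more energy than before, even though iterations finish sooner.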
Mark has done a lot of work on under-clocking and under-volting. Power efficiency goes up, you get a better match between CPU and RAM speeds, and the CPU consequently stalls less waiting for RAM.
|
[QUOTE=heliosh;470662]I've tried setting the RAM to 1866 (+16.6%) which required 1.5V instead of 1.35V. The performance increased by 6.8% but power consumption went up by 18%.
Those 6.8% aren't worth the risk of memory corruption from running out of spec, I guess. And getting new old DDR3 is just too pricey.[/QUOTE]
I wouldn't suggest doing anything that requires an overvolt. But you could try underclocking your CPU and dropping your vcore 0.3 or more. I have a cluster of machines where I did that and saved 25 or 30 watts each. |
I'm not that much interested to improve power efficiency. I just wanted to know how it compares to other systems, especially newer ones (Kaby Lake, Coffee Lake, Ryzen)
|
[QUOTE=heliosh;470681]I'm not that much interested to improve power efficiency. I just wanted to know how it compares to other systems, especially newer ones (Kaby Lake, Coffee Lake, Ryzen)[/QUOTE]
The problem, I think, is that power efficiency will be variable. I don't run the software, but as you bring up, the price of things gets in the way. My electricity bill is about $22 in usage, plus a base charge of nearly that amount and, at times, taxes. Running different setups will give different results even on the same chip style. |
[QUOTE=heliosh;470681]I'm not that much interested to improve power efficiency. I just wanted to know how it compares to other systems, especially newer ones (Kaby Lake, Coffee Lake, Ryzen)[/QUOTE]
Yeah. You just stumbled into a particularly hot topic in these parts. :smile:

EDIT: Are the 83 W and 85 W figures CPU power? I am looking at my i6700 at 4.3GHz, all cores running in one P95 worker, 2400K FFT. HWInfo shows CPU Power between 90 and 103 W. It also shows CPU Package Power as 104W. The latter value varies with CPU load. Would it be counted as part of the calculation of mJ/it? What about the ~5 additional watts which the RAM draws when P95 is running? What is the conversion between W/xIterations/sec, and mJ/it?

Another edit: OK. I see where the mJ/it comes from. Please disregard the last question.

One more: Using the 103W upper figure for CPU power on a Skylake, and ~2.4 ms/it, I get 417 it/sec, and 249 mJ/it. Again, in the DC range, 2400K FFT. |
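On the conversion asked about above: Prime95 reports milliseconds per iteration, and watts times milliseconds is already millijoules, so no detour through it/s is strictly needed. A small Python sketch with hypothetical helper names, using the rounded 103 W and 2.4 ms/it from the post:

```python
def iterations_per_second(ms_per_iteration: float) -> float:
    """Convert a per-iteration time in milliseconds to a rate in it/s."""
    return 1000.0 / ms_per_iteration


def mj_per_iteration_from_ms(watts: float, ms_per_iteration: float) -> float:
    """W x ms = mJ, so energy per iteration is a single multiplication."""
    return watts * ms_per_iteration


rate = iterations_per_second(2.4)
energy = mj_per_iteration_from_ms(103.0, 2.4)
print(f"{rate:.0f} it/s, {energy:.0f} mJ/it")
```

Since the 2.4 ms/it is itself approximate, the result can differ by a few mJ/it from a figure computed from unrounded timings.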
[QUOTE=heliosh;470681]I'm not that much interested to improve power efficiency. I just wanted to know how it compares to other systems, especially newer ones (Kaby Lake, Coffee Lake, Ryzen)[/QUOTE]
There isn't much difference between Haswell and Coffee Lake, clock for clock. Haswell was a big bump up over Ivy Bridge. The improvements since Haswell, at least for mprime, have almost all been in increased memory speed, thus increased memory bandwidth. Ryzen cores have half the throughput of Haswell cores. But twice the cores will still be bottlenecked by limited memory bandwidth. The new sweet spot will be the i3-8100 with dual rank, dual channel DDR4-2400, once the cheap motherboards are out. |
The Coffee Lake i3 is basically a Kaby Lake die. The i5 and i7 (6 core) on the other hand are manufactured in a new process (14nm++) which is said to be a bit more efficient:
[QUOTE]A third improved process, "14nm++", is set to begin in late 2017 and will further allow for +23-24% higher drive current for 52% less power vs the original 14nm process. [/QUOTE] [url]https://en.wikichip.org/wiki/14_nm_lithography_process#Intel[/url] The i5-8400 isn't that much more expensive than the i3-8100. |
[QUOTE=heliosh;470690]The Coffee Lake i3 is basically a Kaby Lake die. The i5 and i7 (6 core) on the other hand are manufactured in a new process (14nm++) which is said to be a bit more efficient:
[url]https://en.wikichip.org/wiki/14_nm_lithography_process#Intel[/url] The i5-8400 isn't that much more expensive than the i3-8100.[/QUOTE] The i5-8400 is an interesting chip. It has a very low base clock, which is because it can't maintain AVX2/FMA workloads under 65 watts at higher frequencies. So the i3 gives 4 * 3.6 = 14.4 GHz cores and the i5 gives 6 * 2.8 = 16.8. That extra 17% in CPU throughput probably won't help at all since the memory bandwidth situation still isn't changed. But you've spent a lot more for the CPU. I don't have the parts to actually benchmark throughput either, keep in mind. If you want to save power, I'd look at the higher clocked parts, like the Coffee Lake i3 equivalent of the i5-6600. These should undervolt better when underclocked. I don't have any experience with Zen-based processors to suggest one for efficiency. |
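The cores-times-clock comparison in the post above is easy to redo for other parts. The Python sketch below is only that crude product; it deliberately ignores memory bandwidth, which the post argues is the real limit, so treat it as an upper bound on the CPU-side gain. The function name is made up; core counts and base clocks are those given in the post.

```python
def aggregate_ghz(cores: int, base_clock_ghz: float) -> float:
    """Crude compute proxy: core count times base clock, in 'core-GHz'."""
    return cores * base_clock_ghz


i3_8100 = aggregate_ghz(cores=4, base_clock_ghz=3.6)
i5_8400 = aggregate_ghz(cores=6, base_clock_ghz=2.8)

extra = i5_8400 / i3_8100 - 1.0
print(f"i3-8100: {i3_8100:.1f}, i5-8400: {i5_8400:.1f}, extra: {extra:.0%}")
```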
[QUOTE=heliosh;470658]I'm curious how my rig compares to others in terms of power efficiency.
- Haswell i5-4690k stock clock, -100mV undervolted
- 2x8GB DDR3-1600 CL 9-9-9-24 at 1.35V
- 80+Gold PSU (BeQuiet E9 400W)

mprime 29.3:
FFT 4096k: 85W/145.15it/s = 586mJ/it
FFT 4480k: 83W/130.50it/s = 636mJ/it[/QUOTE]
My (and probably Mark's too) Kaby Lake i5 rig with single rank DDR-2400 (see thread on George's dream build) give or take a few % since numbers are from memory yields:

FFT 4096k: 420W/(7*170it/s) = 353mJ/it |
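For a multi-system setup the same metric uses total power over total throughput. The Python sketch below redoes the seven-system figure and sets it against the single Haswell rig at the same FFT size. It assumes both power figures were measured the same way, which the thread does not confirm, so the ratio is indicative at best. The function name is hypothetical.

```python
def cluster_mj_per_iteration(total_watts: float, systems: int,
                             rate_per_system: float) -> float:
    """Energy per iteration (mJ) for identical systems sharing one power reading."""
    return 1000.0 * total_watts / (systems * rate_per_system)


# 7 Kaby Lake systems at ~170 it/s each, 420 W in total (FFT 4096k).
kaby_lake = cluster_mj_per_iteration(420.0, 7, 170.0)

# Single Haswell rig from the first post: 85 W at 145.15 it/s (FFT 4096k).
haswell = cluster_mj_per_iteration(85.0, 1, 145.15)

print(f"Kaby Lake: {kaby_lake:.0f} mJ/it, Haswell: {haswell:.0f} mJ/it")
print(f"Ratio: {kaby_lake / haswell:.2f}")
```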
[QUOTE=Prime95;470694]My (and probably Mark's too) Kaby Lake i5 rig with single rank DDR-2400 (see thread on George's dream build) give or take a few % since numbers are from memory yields:
FFT 4096k: [B]420W[/B]/(7*170it/s) = 353mJ/it[/QUOTE] Is that 420W measured at the line, George? |
[QUOTE=kladner;470697]Is that 420W measured at the line, George?[/QUOTE]
Seems about right: 60 watts per system. I'm just slightly higher than that, but I haven't undervolted as much as possible. |
[QUOTE=Mark Rose;470700]Seems about right: 60 watts per system. I'm just slightly higher than that, but I haven't undervolted as much as possible.[/QUOTE]
The Kil-a-Watt says I am drawing 488 W, but that includes 7 HDDs, 2 GPUs (460 and 1060) doing TF, as well as the CPU. As I mentioned previously, the reports I get on the CPU include two values: CPU Power and CPU Package Power. The only way to make these numbers correlate with the 79-80 amps at 1.296 Vcore is to add them together. I think the Package Power must correspond to the "uncore" part of the processor, while CPU Power refers to the cores themselves. Using this argument, my actual processor power is CPU 104 W + Uncore 114 W = 218 W. So 218/417 puts the system at 523 mJ/it. Add in the 6 W for the RAM and that goes to 537 mJ/it. This is versus 247mJ/it with just the core power considered. |
[QUOTE=kladner;470702]
Using this argument, my actual processor power is CPU 104 W + Uncore 114 W = 218 W. [/QUOTE]
114W uncore is like a lot! Are you sure it doesn't include the CPU cores also? So 114-104W = 10W uncore?
(114W + 6W (RAM)) / 0.85 (decent efficient power supply) = 141W at the wall? |
[QUOTE=VictordeHolland;471280]114W uncore is like a lot! Are you sure it doesn't include the CPU cores also? So 114-104W = 10W uncore?
(114W + 6W (RAM)) / 0.85 (decent efficient power supply) = 141W at the wall?[/QUOTE]
I wondered about that, too. However, Vcore=1.296 x 80 amps = [B]103.68 W [/B]shows that I should have questioned my original calculations and assumptions more carefully. You are correct. The actual PSU is Platinum, running somewhat below half capacity. I expect it is pretty efficient.

120/.9 = 133 W
(104+10)/.9 = 127 W |
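The wall-power estimates in this exchange all follow one pattern: sum the DC-side loads first, then divide by the PSU efficiency. A short Python sketch, where the 0.85 and 0.90 efficiencies are the posters' assumed values, not measurements, and the function name is invented for illustration:

```python
def wall_power(dc_watts: float, psu_efficiency: float) -> float:
    """Estimate AC draw at the wall from DC load and PSU efficiency (0 to 1]."""
    if not 0.0 < psu_efficiency <= 1.0:
        raise ValueError("psu_efficiency must be in (0, 1]")
    return dc_watts / psu_efficiency


package_w, ram_w = 114.0, 6.0   # CPU Package Power and RAM, from the posts
cores_w, uncore_w = 104.0, 10.0  # cores plus the inferred uncore share

# Sum the loads before dividing; "104+10/.9" taken literally would
# divide only the 10 W term.
print(round(wall_power(package_w + ram_w, 0.85)))   # 141
print(round(wall_power(package_w + ram_w, 0.90)))   # 133
print(round(wall_power(cores_w + uncore_w, 0.90)))  # 127
```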
Can anyone recommend a multicore dev-board or compact compute package based on the ARM Cortex A72? I'd be interested to see how that compares in terms of this metric to an efficient Intel avx2 system.
Another question: Has anyone seen any numbers on how much power key individual arithmetic ops require on some leading architectures? Especially FMA versus FADD is a comparison of interest. |
[QUOTE=ewmayer;471509]Can anyone recommend a multicore dev-board or compact compute package based on the ARM Cortex A72?[/QUOTE]
The only dev board based on Cortex A72 that I could find is based on the Mediatek X20 (2×A72 +8×A53) but it costs $ 200... [url]https://www.96boards.org/product/mediatek-x20/[/url] |
[QUOTE=VictordeHolland;471515]The only dev board based on Cortex A72 that I could find is based on the Mediatek X20 (2×A72 +8×A53) but it costs $ 200...
[url]https://www.96boards.org/product/mediatek-x20/[/url][/QUOTE] Yes, I saw that one too, obviously pricier than my humble A53-based odroid - though not on a total-FLOPS basis - but lowest cost option I've seen so far. Perhaps Tom Womack, who did the A57 timings over in the Mlucas-for-ARM thread, has or can get access to an A72 by way of his working at ARM. He specifically mentioned possible access to some Cavium Thunder-X machines, but not clear to me which precise Cortex processor those SoC server processors are based on, A72, or an earlier one. [It's not made clear in their PR, e.g. [url=http://www.cavium.com/ThunderX2_ARM_Processors.html]here[/url].] Just PMed him a followup about that. The Cavium servers are specifically designed as a low-power competitor to Intel's Xeon server line, i.e. as a business-class embodiment of the theme of this thread. |
It looks like that Helio X20 based board doesn't support Linux. That's quite sad but many ARM boards only have Android support and very bad Linux support (if any).
Firefly RK3399 board supports Linux: [URL]https://www.amazon.com/Firefly-RK3399-Computer-Reference-Development/dp/B06XSCT11S[/URL] No first hand experience though :smile: BTW ThunderX CPU is proprietary. |
A73 based boards are also worth considering. This gets 4 of the big cores rather than 2.
[url]https://www.96boards.org/product/hikey960/[/url] |