#155 |
|
"Mr. Meeseeks"
Jan 2012
California, USA
23×271 Posts |
+1
One more thing: make sure the heatsink is *properly* seated. I've had one before that was not completely secured... And make sure to use good thermal grease for it. |
|
|
|
|
|
#156 |
|
"Oliver"
Mar 2005
Germany
11·101 Posts |
kladner: those AMDs have a bigger die, so they can move heat out of the silicon more easily (315 mm² vs. 177 mm²). Of course I have double-checked the heatsink.
I'm afraid I'm very good at putting load on the CPU... In some overclocking forums I've noticed people who claim they can easily run their 4770K at 4.5GHz and 1.4V on air while running LinX (Linpack for Windows)... well, they used an old LinX which (I guess) only does SSE. At 4.5GHz the screenshot showed ~60GFLOPS...
Back to my issue: it seems that the PCU (Power Control Unit, part of the CPU) and the BIOS (OK, OK, EFI) aren't on my side. With default BIOS settings the system runs a 3.9GHz 4-core turbo under heavy load (easily exceeding the TDP)... The CPU should do at most a 3.7GHz 4-core turbo to stay within spec. For each step above the non-turbo multiplier the PCU adds some voltage, and there are comments on the web that it adds even more for AVX code. I've measured ~1.26V under load (~1.1V default vCore). With the voltage manually set to 1.100V and the clock at 4GHz I was able to keep the CPU temperatures at ~75°C while running Linpack.
Oliver |
|
|
|
|
|
#157 |
|
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts |
Thanks for the details on voltage and temperature. I know you would have checked the heatsinking, but over 90°C is in the borderlands even for Intel: startling, I would call it, even under extreme loads with anything but a stock cooler. To mention it is on the order of asking "Is the power cord connected?" The Linpack version you reference must be one mean mofo!
|
|
|
|
|
|
#158 |
|
"Oliver"
Mar 2005
Germany
2127₈ Posts |
I'll continue on this next weekend; perhaps (for comparison) I should check the temperatures while running mprime. A finely tuned Linpack (hpl-2.0 + Intel MKL 11.someversion, properly chosen HPL parameters and process pinning) is my worst-case scenario for temperature and power consumption. If a system can handle this, I feel pretty comfortable with real-world applications. Linpack makes heavy use of the new dual-FMA capability of the Haswell chips (16 DP ops per clock and core).
Edit: the Windows "LinX" isn't that bad if you choose the right version (AVX-capable; check the performance: for comparison, I can do 200GFLOPS with 4 cores @3.9GHz on my system). If you want to give it a try, it is much easier than compiling the whole stuff yourself.
Oliver
Last fiddled with by TheJudger on 2013-06-30 at 23:02 |
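The GFLOPS figures quoted in this thread can be sanity-checked against theoretical peak throughput. The sketch below is only an illustration: the 16 DP ops per clock and core for Haswell comes from the post above, while the 4 DP flops per clock for SSE-only code (one 2-wide add plus one 2-wide multiply per cycle) is an assumption, and the function name `peak_gflops` is made up for this example.

```python
def peak_gflops(cores, clock_ghz, dp_flops_per_cycle):
    """Theoretical peak double-precision GFLOPS: cores x clock x flops/cycle."""
    return cores * clock_ghz * dp_flops_per_cycle

# Haswell with dual FMA: 16 DP ops per clock and core (figure from the post).
haswell_peak = peak_gflops(cores=4, clock_ghz=3.9, dp_flops_per_cycle=16)

# SSE-only code path: ASSUMED 4 DP flops per clock and core
# (one 128-bit add and one 128-bit multiply issued per cycle).
sse_peak = peak_gflops(cores=4, clock_ghz=4.5, dp_flops_per_cycle=4)

# Measured numbers quoted in the thread.
haswell_measured = 200.0  # GFLOPS, 4 cores @ 3.9 GHz
sse_measured = 60.0       # GFLOPS, old LinX screenshot @ 4.5 GHz

print(f"Haswell FMA peak: {haswell_peak:.1f} GFLOPS, "
      f"efficiency {haswell_measured / haswell_peak:.0%}")
print(f"SSE-only peak:    {sse_peak:.1f} GFLOPS, "
      f"efficiency {sse_measured / sse_peak:.0%}")
```

Under these assumptions both runs land at roughly 80% of their respective peaks, which fits the guess that the ~60GFLOPS screenshot came from an SSE-only LinX build rather than an AVX+FMA one.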
|
|
|
|
|
#159 | |
|
∂2ω=0
Sep 2002
República de California
11647₁₀ Posts |
Quote:
|
|
|
|
|
|
|
#160 | |
|
∂2ω=0
Sep 2002
República de California
19·613 Posts |
Quote:
Anyhoo, rejiggering the prototype code shouldn't be too hard, just a lot of swapping out what goes into various register-copy temporaries. Still annoyed @myself for wasting my own time, though. Will try to use the extra work to also do some 2nd-pass optimization, so as to not make it feel entirely redundant - save a few register copies and improve the instruction scheduling to better hide latency. |
|
|
|
|
|
|
#161 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
1D77₁₆ Posts |
Quote:
I wrote a MASM macro that takes 4 args and outputs the optional register copy and the appropriate 132, 231, 213 version of the FMA instruction. |
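The macro itself is not shown in the thread. As a rough illustration of the idea only, here is a hypothetical Python sketch (not George's MASM code) that picks an FMA3 form for `dest = a*b + c` and falls back to a register copy when `dest` aliases none of the inputs. It relies on the Intel-syntax operand semantics of `vfmadd132pd` (op1 = op1*op3 + op2), `vfmadd213pd` (op1 = op2*op1 + op3) and `vfmadd231pd` (op1 = op2*op3 + op1), where only the third operand may be a memory reference. The names `emit_fma` and `is_mem`, and the convention that a memory operand is written in square brackets, are assumptions made for this example.

```python
def is_mem(operand):
    """Assumed convention: memory operands are written like '[rsi+32]'."""
    return operand.startswith("[")

def emit_fma(dest, a, b, c):
    """Return Intel-syntax instructions computing dest = a*b + c.

    Hypothetical helper: chooses the 132/213/231 form so that the operand
    overwritten by the instruction is the one that aliases dest.
    """
    if is_mem(dest):
        raise ValueError("destination must be a register")
    if is_mem(a) and is_mem(b):
        raise ValueError("at most one multiplicand may be in memory")
    if is_mem(a):
        a, b = b, a  # multiplication commutes; keep a memory multiplicand in b

    if dest == c:
        # dest = a*b + dest; b may be memory (third operand).
        return [f"vfmadd231pd {dest}, {a}, {b}"]

    if dest in (a, b):
        other = b if dest == a else a
        if is_mem(other):
            if is_mem(c):
                raise ValueError("only one memory operand is allowed")
            # dest = dest*other + c, with the memory multiplicand third.
            return [f"vfmadd132pd {dest}, {c}, {other}"]
        # dest = other*dest + c; c may be memory (third operand).
        return [f"vfmadd213pd {dest}, {other}, {c}"]

    # dest aliases no input: copy the addend first, then accumulate into it.
    # (vmovapd from memory assumes an aligned address.)
    return [f"vmovapd {dest}, {c}", f"vfmadd231pd {dest}, {a}, {b}"]

print(emit_fma("ymm0", "ymm1", "ymm2", "ymm0"))
print(emit_fma("ymm0", "ymm0", "[rsi]", "ymm2"))
print(emit_fma("ymm3", "ymm0", "ymm1", "ymm2"))
```

The point of the three forms is that FMA3 always overwrites one of its inputs, so a code generator either picks the form whose destroyed operand is the one no longer needed, or pays for one extra register copy.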
|
|
|
|
|
|
#162 |
|
∂2ω=0
Sep 2002
República de California
19×613 Posts |
Ah, very good - nice of Intel to at least provide some options here, given that they don't (yet) support the desired FMA4 syntax.
|
|
|
|
|
|
#163 |
|
"Oliver"
Mar 2005
Germany
2127₈ Posts |
my 4770k - continued
I see two options:
i7 4770K + Gigabyte Z87X-UD3H + 2x 8GiB DDR3-2133 1.50V + 1x SATA HDD + 1x SATA SSD + Thermalright HR-02 Macho + Noctua NF-P12 @full speed, 80minus power supply
Temporary build, open on the table, ambient temperature ~22°C
BIOS settings: Gigabyte's BIOS defaults, voltages set to "normal", hyperthreading disabled, memory set to "XMP Profile 1"
OS: openSUSE 12.3, 64-bit of course
Optimized HPL (Linpack) making heavy use of AVX+FMA: 210W measured on AC, CPU reports ~120W, CPU temperatures 92-95°C
Prime95 (mprime v27.9), "blend test": 150-170W measured on AC, CPU reports 70-90W, CPU temperatures 60-72°C
The Prime95 output indicates that it is using AVX FFTs; power and temperatures vary over the different FFT lengths, while HPL power consumption is very stable.
Oliver
Last fiddled with by TheJudger on 2013-07-05 at 19:18 |
|
|
|
|
|
#164 | |
|
∂2ω=0
Sep 2002
República de California
19×613 Posts |
Quote:
You note that Linpack is making heavy use of AVX2 (i.e. AVX+FMA) - that is one obvious difference between it and Prime95: George is busy adding FMA usage to Prime95, but your version uses just FMA-less AVX. AVX2 effectively doubles the floating-MUL bandwidth (also ADD, but the MUL is the biggie here) - those MULs generate a lot of heat. Also, linear algebra tends to be able to use the FPU at much closer to max. theoretical capacity than FFTs, because the data access patterns are much simpler and the arithmetic mix is more favorable, whereas optimized FFTs are ADD-dominated. George, have you noticed any temperature impact from using FMA in your development code? |
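To put a rough number on the "arithmetic mix" point, the sketch below compares a matrix-multiply inner step (one multiply and one add per update) with a textbook radix-2 complex butterfly (4 real multiplies and 6 real adds). The model is a deliberately optimistic assumption: every issue slot executes either one FMA (2 flops) or a lone multiply or add (1 flop), and every multiply is assumed to be fusable with some add, which ignores data dependencies. The function name `fma_slot_utilization` is made up for this example.

```python
def fma_slot_utilization(muls, adds):
    """Upper bound on the fraction of peak FMA throughput for a mul/add mix.

    Optimistic model: each multiply pairs with an add into one FMA where
    possible, so the number of issue slots needed is max(muls, adds),
    and each slot could have delivered 2 flops at peak.
    """
    flops = muls + adds
    slots = max(muls, adds)
    return flops / (2 * slots)

# Matrix-multiply inner step: c += a * b -> 1 multiply, 1 add.
gemm = fma_slot_utilization(muls=1, adds=1)

# Radix-2 complex butterfly: one complex twiddle multiply
# (4 real muls + 2 real adds) plus a complex add and subtract (4 real adds).
radix2 = fma_slot_utilization(muls=4, adds=6)

print(f"GEMM inner step:   {gemm:.0%} of peak at best")
print(f"Radix-2 butterfly: {radix2:.0%} of peak at best")
```

Even in this best case the add-heavy butterfly cannot keep both halves of every FMA busy, while the matrix update can, and that is before the FFT's more irregular memory access pattern is taken into account.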
|
|
|
|
|
|
#165 |
|
"Oliver"
Mar 2005
Germany
1111₁₀ Posts |
Hi Ernst,
yepp, HPL is a power virus (but not the worst I can imagine*). This perfectly explains why I have so much trouble cooling my CPU while others say that they can handle the heat of a Haswell. In the past (a few years ago) Prime95's power consumption/heat generation was close to Linpack's, but today...
*I guess running only DGEMM (BLAS) on reasonably sized inputs is even worse than Linpack. Linpack spends much of its time in this standard function, but not all of it; there are other calls to the BLAS library and communication between processes/threads as well. I'm using Intel MKL as the BLAS implementation; those functions are designed for optimal performance (not for maximum power consumption/heat generation). So I guess Intel won't say "don't run this code on our CPUs, it is just a stupid power virus".
Oliver
P.S. I've just improved my HPL settings: 1-2W more, 203GFLOPS @default clock
Last fiddled with by TheJudger on 2013-07-05 at 19:59 |
|
|
|