mersenneforum.org

Haswell Preview Benchmark (https://www.mersenneforum.org/showthread.php?t=17982)

sdbardwick 2013-07-07 19:32

Indeed, Intel moved from solder to TIM with Ivy Bridge.
[URL="http://hexus.net/tech/news/cpu/39369-intel-cuts-corners-ivy-bridge-thermal-interface-material-tim/"]More details[/URL]

Intel even has a [URL="http://www.google.com/patents/US7009289?printsec=description#v=onepage&q&f=false"]patent[/URL] for fluxless soldering to heat spreaders.

ewmayer 2013-07-07 19:41

[QUOTE=sdbardwick;345568]Indeed, Intel moved from solder to TIM with Ivy Bridge.
[URL="http://hexus.net/tech/news/cpu/39369-intel-cuts-corners-ivy-bridge-thermal-interface-material-tim/"]More details[/URL]

Intel even has a [URL="http://www.google.com/patents/US7009289?printsec=description#v=onepage&q&f=false"]patent [/URL]for fluxless solder to heat-spreaders[/QUOTE]

So let me get this straight ... Intel charges ~$250 a pop for top-of-the-line CPUs ... with their volume manufacturing they could surely use top-quality liquid-metal-style TIM for less than $1 a chip ... they even patented such a technology ... only to abandon it in the latest, greatest and hottest chips. Grand, just grand.

sdbardwick 2013-07-07 19:55

High-end (Xeon) chips might use [URL="http://www.guru3d.com/news_story/intel_ivy_bridge_e_has_solder_under_its_ihs.html"]solder[/URL].

kladner 2013-07-08 12:31

[QUOTE]
Both the Indigo and Coollaboratory products are at FrozenCPU, but the Liquid Pro is out of stock.
[URL="http://www.frozencpu.com/cat/l1/g8/Thermal_Interface.html"]http://www.frozencpu.com/cat/l1/g8/T...Interface.html[/URL][/QUOTE]

I did not look closely enough. It appears that Liquid Pro [U]may[/U] have been replaced with Liquid Ultra. At least, the latter is in stock (and costs a couple of bucks more).

Prime95 2013-07-16 18:53

OK, here are the first comparisons of prime95 with FMA3 coding optimizations. Haswell at 4.2GHz, DDR3-2400 memory.

First, timings with version 27.9.

[CODE]Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
CPU speed: 4200.00 MHz, 4 cores
Prime95 64-bit version 27.9, RdtscTiming=1
Best time for 768K FFT length: 3.08 ms., avg: 3.15 ms.
Best time for 896K FFT length: 3.80 ms., avg: 3.85 ms.
Best time for 1024K FFT length: 4.22 ms., avg: 4.64 ms.
Best time for 1280K FFT length: 5.42 ms., avg: 5.44 ms.
Best time for 1536K FFT length: 6.62 ms., avg: 6.66 ms.
Best time for 1792K FFT length: 7.94 ms., avg: 8.03 ms.
Best time for 2048K FFT length: 8.82 ms., avg: 8.90 ms.
Best time for 2560K FFT length: 11.38 ms., avg: 11.43 ms.
Best time for 3072K FFT length: 13.86 ms., avg: 13.91 ms.
Best time for 3584K FFT length: 16.95 ms., avg: 17.00 ms.
Best time for 4096K FFT length: 18.78 ms., avg: 18.84 ms.
Best time for 5120K FFT length: 24.69 ms., avg: 24.74 ms.
Best time for 6144K FFT length: 29.37 ms., avg: 29.49 ms.
Best time for 7168K FFT length: 35.22 ms., avg: 35.30 ms.
Best time for 8192K FFT length: 41.07 ms., avg: 41.17 ms.

LL test of M77000003 (4M FFT):
One worker = 18.8 ms
Two workers = 19.1 ms
Three workers = 20.1 ms
Four workers = 22.7 ms

Small torture test temps (4 cores): 78 / 74 / 72 / 70
[/CODE]

Prime95 2013-07-16 18:56

Now, with FMA3-enabled code (version 28.1). I still need to add FMA3 to the carry propagation code, so timings should get a smidge better.

[CODE]Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
CPU speed: 4200.00 MHz
Prime95 64-bit version 28.1, RdtscTiming=1
Best time for 768K FFT length: 2.655 ms., avg: 2.689 ms.
Best time for 896K FFT length: 3.158 ms., avg: 3.266 ms.
Best time for 1024K FFT length: 3.605 ms., avg: 3.808 ms.
Best time for 1280K FFT length: 4.675 ms., avg: 4.707 ms.
Best time for 1536K FFT length: 5.849 ms., avg: 5.908 ms.
Best time for 1792K FFT length: 6.841 ms., avg: 6.855 ms.
Best time for 2048K FFT length: 7.836 ms., avg: 7.863 ms.
Best time for 2560K FFT length: 9.985 ms., avg: 10.015 ms.
Best time for 3072K FFT length: 11.827 ms., avg: 11.884 ms.
Best time for 3584K FFT length: 14.141 ms., avg: 14.161 ms.
Best time for 4096K FFT length: 16.585 ms., avg: 16.650 ms.
Best time for 5120K FFT length: 20.244 ms., avg: 20.795 ms.
Best time for 6144K FFT length: 25.213 ms., avg: 25.235 ms.
Best time for 7168K FFT length: 29.142 ms., avg: 29.162 ms.
Best time for 8192K FFT length: 36.826 ms., avg: 36.936 ms.

M77000003
One worker = 16.6 ms
Two workers = 17.2 ms
Three workers = 18.6 ms
Four workers = 21.8 ms

Small torture test temps: 86 / 83 / 80 / 78
[/CODE]


Notice the temp increase on the small FFT torture test! These small FFTs operate out of the L2 cache. I'm going to try a really small torture test that operates out of the L1 cache to see if I can get the temps even higher.
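
For anyone who wants the two listings side by side, here is a minimal Python sketch that computes the reduction in best-case time per FFT length. The timings are copied from the 27.9 and 28.1 listings above; the names `v279`, `v281` and `time_reduction_pct` are made up for this illustration and are not part of prime95.

```python
# Best-case timings in ms, copied from the two benchmark listings above.
# Keys are FFT lengths in K. Variable names are illustrative only.
v279 = {768: 3.08, 896: 3.80, 1024: 4.22, 1280: 5.42, 1536: 6.62,
        1792: 7.94, 2048: 8.82, 2560: 11.38, 3072: 13.86, 3584: 16.95,
        4096: 18.78, 5120: 24.69, 6144: 29.37, 7168: 35.22, 8192: 41.07}
v281 = {768: 2.655, 896: 3.158, 1024: 3.605, 1280: 4.675, 1536: 5.849,
        1792: 6.841, 2048: 7.836, 2560: 9.985, 3072: 11.827, 3584: 14.141,
        4096: 16.585, 5120: 20.244, 6144: 25.213, 7168: 29.142, 8192: 36.826}


def time_reduction_pct(old_ms, new_ms):
    """Percentage by which new_ms is shorter than old_ms."""
    return 100.0 * (old_ms - new_ms) / old_ms


# One entry per FFT length: how much less time 28.1 needs than 27.9.
reductions = {k: time_reduction_pct(v279[k], v281[k]) for k in v279}

for k, pct in reductions.items():
    print(f"{k:5d}K: {v279[k]:6.2f} -> {v281[k]:7.3f} ms  ({pct:4.1f}% less time)")
```

On these numbers the FMA3 build needs roughly 10 to 18 percent less time per iteration, depending on FFT length. These are single best-case timings from one machine, so treat the per-length differences as rough.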

Batalov 2013-07-16 19:00

[QUOTE=Prime95;346462]...I'm going to try a really small torture test that operates out of the L1 cache to see if I can get the temps even higher.[/QUOTE]
George is a Scientist! (not that I ever doubted that. With capital S, for sure.)
See the difference - [url]http://xkcd.com/242/[/url]

ixfd64 2013-07-16 19:32

Very nice!

I wonder how DDR4 will affect the performance once consumer chips start supporting it.

pepi37 2013-07-16 21:40

And what will the performance increase be with AVX2 (if there is any)?

TheMawn 2013-07-16 22:21

Any ideas as to why the time per iteration is going up every time you add a worker? When I did my tests, I had 0.011 seconds per iteration at one, two or three workers and only saw the slowdown to 0.014 seconds when I added the fourth.

kracker 2013-07-16 22:39

Memory bottleneck.
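
To put a number on the scaling, here is a small Python sketch using the version 28.1 timings for M77000003 posted above. It assumes every worker runs at the listed per-iteration time, and the names `ms_per_iter`, `aggregate_iters_per_sec` and `scaling_efficiency` are made up for this illustration.

```python
# Per-iteration times in ms for M77000003 with 1 to 4 workers,
# copied from the version 28.1 listing earlier in the thread.
ms_per_iter = {1: 16.6, 2: 17.2, 3: 18.6, 4: 21.8}


def aggregate_iters_per_sec(workers, ms):
    """Total iterations per second, assuming each worker takes `ms` per iteration."""
    return workers * 1000.0 / ms


def scaling_efficiency(workers, ms, single_ms):
    """Aggregate throughput relative to perfect linear scaling from one worker."""
    ideal = workers * 1000.0 / single_ms
    return aggregate_iters_per_sec(workers, ms) / ideal


for n, ms in ms_per_iter.items():
    total = aggregate_iters_per_sec(n, ms)
    eff = scaling_efficiency(n, ms, ms_per_iter[1])
    print(f"{n} worker(s): {total:6.1f} iter/s total, {100 * eff:5.1f}% of linear scaling")
```

Total throughput still rises with each added worker, but four workers reach only about three quarters of linear scaling. That pattern is what you would expect if the workers compete for a shared resource such as memory bandwidth, although the timings alone do not prove the cause.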


All times are UTC.
