Indeed, Intel moved from solder to TIM with Ivy Bridge.
[URL="http://hexus.net/tech/news/cpu/39369-intel-cuts-corners-ivy-bridge-thermal-interface-material-tim/"]More details[/URL]. Intel even has a [URL="http://www.google.com/patents/US7009289?printsec=description#v=onepage&q&f=false"]patent[/URL] for fluxless soldering to heat spreaders. |
[QUOTE=sdbardwick;345568]Indeed, Intel moved from solder to TIM with Ivy Bridge.
[URL="http://hexus.net/tech/news/cpu/39369-intel-cuts-corners-ivy-bridge-thermal-interface-material-tim/"]More details[/URL] Intel even has a [URL="http://www.google.com/patents/US7009289?printsec=description#v=onepage&q&f=false"]patent [/URL]for fluxless solder to heat-spreaders[/QUOTE] So let me get this straight ... Intel charges ~$250 a pop for top-of-the-line CPUs ... with their volume manufacturing they could surely use top-quality liquid-metal-style TIM for less than $1 a chip ... they even patented such a technology ... only to abandon it in the latest, greatest and hottest chips. Grand, just grand. |
High-end (Xeon) chips might use [URL="http://www.guru3d.com/news_story/intel_ivy_bridge_e_has_solder_under_its_ihs.html"]solder[/URL].
|
[QUOTE]
Both the Indigo and Coollaboratory products are at FrozenCPU, but the Liquid Pro is out of stock. [URL="http://www.frozencpu.com/cat/l1/g8/Thermal_Interface.html"]http://www.frozencpu.com/cat/l1/g8/T...Interface.html[/URL][/QUOTE] I did not look closely enough. It appears that Liquid Pro [U]may[/U] have been replaced with Liquid Ultra. At least, the latter is in stock (and costs a couple of bucks more.) |
OK, here are the first comparisons of prime95 with FMA3 coding optimizations. Haswell at 4.2GHz, DDR3-2400 memory.
First, timings with version 27.9.
[CODE]Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
CPU speed: 4200.00 MHz, 4 cores
Prime95 64-bit version 27.9, RdtscTiming=1
Best time for 768K FFT length: 3.08 ms., avg: 3.15 ms.
Best time for 896K FFT length: 3.80 ms., avg: 3.85 ms.
Best time for 1024K FFT length: 4.22 ms., avg: 4.64 ms.
Best time for 1280K FFT length: 5.42 ms., avg: 5.44 ms.
Best time for 1536K FFT length: 6.62 ms., avg: 6.66 ms.
Best time for 1792K FFT length: 7.94 ms., avg: 8.03 ms.
Best time for 2048K FFT length: 8.82 ms., avg: 8.90 ms.
Best time for 2560K FFT length: 11.38 ms., avg: 11.43 ms.
Best time for 3072K FFT length: 13.86 ms., avg: 13.91 ms.
Best time for 3584K FFT length: 16.95 ms., avg: 17.00 ms.
Best time for 4096K FFT length: 18.78 ms., avg: 18.84 ms.
Best time for 5120K FFT length: 24.69 ms., avg: 24.74 ms.
Best time for 6144K FFT length: 29.37 ms., avg: 29.49 ms.
Best time for 7168K FFT length: 35.22 ms., avg: 35.30 ms.
Best time for 8192K FFT length: 41.07 ms., avg: 41.17 ms.

LL test of M77000003 (4M FFT):
One worker = 18.8 ms
Two workers = 19.1 ms
Three workers = 20.1 ms
Four workers = 22.7 ms

Small torture test temps (4 cores): 78 / 74 / 72 / 70
[/CODE] |
Now, with FMA3-enabled code (version 28.1). I still need to add FMA3 to the carry propagation code, so timings should get a smidge better.
[CODE]Intel(R) Core(TM) i5-4670K CPU @ 3.40GHz
CPU speed: 4200.00 MHz
Prime95 64-bit version 28.1, RdtscTiming=1
Best time for 768K FFT length: 2.655 ms., avg: 2.689 ms.
Best time for 896K FFT length: 3.158 ms., avg: 3.266 ms.
Best time for 1024K FFT length: 3.605 ms., avg: 3.808 ms.
Best time for 1280K FFT length: 4.675 ms., avg: 4.707 ms.
Best time for 1536K FFT length: 5.849 ms., avg: 5.908 ms.
Best time for 1792K FFT length: 6.841 ms., avg: 6.855 ms.
Best time for 2048K FFT length: 7.836 ms., avg: 7.863 ms.
Best time for 2560K FFT length: 9.985 ms., avg: 10.015 ms.
Best time for 3072K FFT length: 11.827 ms., avg: 11.884 ms.
Best time for 3584K FFT length: 14.141 ms., avg: 14.161 ms.
Best time for 4096K FFT length: 16.585 ms., avg: 16.650 ms.
Best time for 5120K FFT length: 20.244 ms., avg: 20.795 ms.
Best time for 6144K FFT length: 25.213 ms., avg: 25.235 ms.
Best time for 7168K FFT length: 29.142 ms., avg: 29.162 ms.
Best time for 8192K FFT length: 36.826 ms., avg: 36.936 ms.

M77000003
One worker = 16.6 ms
Two workers = 17.2 ms
Three workers = 18.6 ms
Four workers = 21.8 ms

Small torture test temps: 86 / 83 / 80 / 78
[/CODE]
Notice the temp increase on the small FFT torture test! These small FFTs operate out of the L2 cache. I'm going to try a really small torture test that operates out of the L1 cache to see if I can get the temps even higher. |
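For anyone who wants the gain per FFT length as a single number, here is a rough Python sketch that uses only the "best time" figures from the two listings above. The names `best_v279`, `best_v281` and `percent_faster` are made up for this illustration and have nothing to do with prime95 itself.

```python
# Best times in ms, copied from the 27.9 and 28.1 listings above.
# Keys are FFT lengths in K.
best_v279 = {
    768: 3.08, 896: 3.80, 1024: 4.22, 1280: 5.42, 1536: 6.62,
    1792: 7.94, 2048: 8.82, 2560: 11.38, 3072: 13.86, 3584: 16.95,
    4096: 18.78, 5120: 24.69, 6144: 29.37, 7168: 35.22, 8192: 41.07,
}
best_v281 = {
    768: 2.655, 896: 3.158, 1024: 3.605, 1280: 4.675, 1536: 5.849,
    1792: 6.841, 2048: 7.836, 2560: 9.985, 3072: 11.827, 3584: 14.141,
    4096: 16.585, 5120: 20.244, 6144: 25.213, 7168: 29.142, 8192: 36.826,
}


def percent_faster(old_ms, new_ms):
    """Reduction in time per iteration, as a percentage of the old time."""
    return (old_ms - new_ms) / old_ms * 100.0


# Hypothetical helper table: percentage reduction for every FFT length.
reductions = {
    fft_k: percent_faster(best_v279[fft_k], best_v281[fft_k])
    for fft_k in best_v279
}

for fft_k, pct in reductions.items():
    print(f"{fft_k:5d}K: {best_v279[fft_k]:7.3f} ms -> "
          f"{best_v281[fft_k]:7.3f} ms  ({pct:4.1f}% less time)")
```

On these figures the time per iteration drops by roughly 10% to 18% depending on FFT length. These are single runs on one machine, so treat the spread between FFT lengths with some caution.
|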
[QUOTE=Prime95;346462]...I'm going to try a really small torture test that operates out of the L1 cache to see if I can get the temps even higher.[/QUOTE]
George is a Scientist! (not that I ever doubted that. With capital S, for sure.) See the difference - [url]http://xkcd.com/242/[/url] |
Very nice!
I wonder how DDR4 will affect the performance once consumer chips start supporting it. |
And what will the performance increase be with AVX2 (if there is any)?
|
Any ideas as to why the time per iteration goes up every time you add a worker? When I did my tests, I got 0.011 seconds per iteration with one, two or three workers and only saw it slow to 0.014 seconds when I added the fourth.
|
Memory bottleneck.
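To put numbers on that, here is a small Python sketch using the per-worker timings posted earlier in the thread. It assumes each worker runs its own test and that the listed figure is the time per iteration each worker sees at that worker count. The names `per_iter_ms`, `aggregate_iters_per_sec` and `scaling_efficiency` are made up for this example.

```python
# Per-iteration times in ms for 1, 2, 3 and 4 workers,
# copied from the 27.9 and 28.1 listings in this thread.
per_iter_ms = {
    "27.9": [18.8, 19.1, 20.1, 22.7],
    "28.1": [16.6, 17.2, 18.6, 21.8],
}


def aggregate_iters_per_sec(times_ms):
    """Total iterations per second across all workers, per worker count.

    Assumes every worker sees the listed per-iteration time.
    """
    return [n * 1000.0 / t for n, t in enumerate(times_ms, start=1)]


def scaling_efficiency(times_ms):
    """Fraction of perfect linear scaling kept at each worker count."""
    return [times_ms[0] / t for t in times_ms]


for version, times in per_iter_ms.items():
    totals = aggregate_iters_per_sec(times)
    effs = scaling_efficiency(times)
    for n, (total, eff) in enumerate(zip(totals, effs), start=1):
        print(f"v{version}  {n} worker(s): {total:6.1f} iter/s total, "
              f"{eff:.0%} of linear")
```

Total throughput still rises with every added worker, but with four workers it is only about 83% (27.9) and 76% (28.1) of four times the single-worker rate. That is the pattern you would expect when the cores are competing for memory bandwidth, though the timings alone do not prove the cause.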
|