#1 |
|
Dec 2007
2²×7 Posts |
2.4GHZ Dual Core benches 3x faster than 2.6GHZ Quad Core...
That's hard to swallow - comments/suggestions appreciated:

Machine A: DG965WH (Dual Core)
Intel(R) Core(TM)2 CPU E6600 @ 2.40GHz
CPU speed: 2397.81 MHz
L1 cache size: 32 KB
L1 cache line size: 64 bytes
Prime95 32-bit version 24.14, RdtscTiming=1
Best time for 512K FFT length: 11.660 ms.
Best time for 640K FFT length: 15.684 ms.
Best time for 768K FFT length: 19.009 ms.
Best time for 896K FFT length: 22.677 ms.
Best time for 1024K FFT length: 25.295 ms.
Best time for 1280K FFT length: 31.690 ms.
Best time for 1536K FFT length: 38.507 ms.
Best time for 1792K FFT length: 45.824 ms.
Best time for 2048K FFT length: 51.163 ms.
Best time for 2560K FFT length: 66.867 ms.
Best time for 3072K FFT length: 82.881 ms.
Best time for 3584K FFT length: 100.318 ms.
Best time for 4096K FFT length: 112.121 ms.

Machine B: X6DHE-XB (Quad Core)
Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
CPU speed: 2660.61 MHz
L1 cache size: 32 KB
L1 cache line size: 64 bytes
Prime95 32-bit version 24.14, RdtscTiming=1
Best time for 512K FFT length: 32.836 ms.
Best time for 640K FFT length: 44.817 ms.
Best time for 768K FFT length: 56.451 ms.
Best time for 896K FFT length: 67.419 ms.
Best time for 1024K FFT length: 78.335 ms.
Best time for 1280K FFT length: 97.779 ms.
Best time for 1536K FFT length: 119.686 ms.
Best time for 1792K FFT length: 141.810 ms.
Best time for 2048K FFT length: 156.129 ms.
Best time for 2560K FFT length: 199.925 ms.
Best time for 3072K FFT length: 242.916 ms.
Best time for 3584K FFT length: 296.264 ms.
Best time for 4096K FFT length: 336.755 ms.

On the dual-core, cores 1..2 were equally busy. On the quad-core, cores 1..8 were equally busy. (The quad-core machine runs two Xeons.) |
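For anyone who wants to check the "3x" figure in the title, here is a minimal sketch that divides Machine B's times by Machine A's for each FFT length. The timings are copied from the post above; the dictionary and function names are made up for illustration.

```python
# FFT length in K -> best time in ms, copied from the two benchmark listings above.
machine_a = {
    512: 11.660, 640: 15.684, 768: 19.009, 896: 22.677, 1024: 25.295,
    1280: 31.690, 1536: 38.507, 1792: 45.824, 2048: 51.163, 2560: 66.867,
    3072: 82.881, 3584: 100.318, 4096: 112.121,
}
machine_b = {
    512: 32.836, 640: 44.817, 768: 56.451, 896: 67.419, 1024: 78.335,
    1280: 97.779, 1536: 119.686, 1792: 141.810, 2048: 156.129, 2560: 199.925,
    3072: 242.916, 3584: 296.264, 4096: 336.755,
}


def slowdown(fast, slow):
    """Return slow/fast time ratio for every FFT length present in both runs."""
    return {k: slow[k] / fast[k] for k in fast if k in slow}


ratios = slowdown(machine_a, machine_b)
for fft_k, ratio in ratios.items():
    print(f"{fft_k:>5}K: Machine B takes {ratio:.2f}x as long as Machine A")
```

The ratio stays between roughly 2.8 and 3.1 across all thirteen FFT lengths, so the slowdown is not tied to one particular FFT size.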
|
|
|
|
|
#2 |
|
Aug 2002
North San Diego Coun
821 Posts |
Set processor affinity from the advanced menu, stop and restart Prime95 and retest.
|
|
|
|
|
|
#3 | |
|
Dec 2007
2²·7 Posts |
Quote:
There was a 20% improvement, which is great, but things are still 2x slower than the 2.4 GHz core2 machine. |
|
|
|
|
|
|
#4 |
|
May 2005
2²×11×37 Posts |
What kind of RAM do you use in each system, and what are the chipsets?
I suspect that the Xeon system does not have sufficient memory bandwidth... |
|
|
|
|
|
#5 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
2×7×461 Posts |
May I ask exactly what motherboard and what memory configuration you're using with the dual Xeon E5430 chips?
I presume the chips are http://www.newegg.com/Product/Produc...82E16819117145 - the new 45nm low-power-consumption ones with the large caches, which ought to be spectacular performers. But your original post says you're using an X6DHE-XB board, and as far as I can tell the Xeon 5430 chips don't fit in such a board (they're 771-pin chips, and if the board is http://supermicro.com/products/mothe...0/X6DHE-XB.cfm then it takes 604-pin chips); do you have some sort of adapter, or are you using a different board? |
|
|
|
|
|
#6 |
|
Aug 2002
North San Diego Coun
1100110101₂ Posts |
Following up on fivemack's inquiry, off the top of my head those Xeon timings are about right for a 2 socket hyperthreading dual core system.
Try benchmarking with just one instance of Prime95 (with affinity set to core 0) so we can get a baseline for comparison. |
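As an aside for anyone scripting this kind of single-core baseline outside of Prime95's own menu: a rough sketch of pinning a process to one CPU from Python. This is not how Prime95 sets affinity; it assumes Linux, where `os.sched_setaffinity` is available, and the helper name `pin_to_one_cpu` is invented here. It picks the lowest-numbered CPU the process is already allowed to use, which is core 0 on an unrestricted machine.

```python
import os


def pin_to_one_cpu():
    """Restrict the current process to a single CPU.

    Returns the chosen CPU index, or None on platforms that do not
    provide os.sched_setaffinity (for example Windows or macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))  # CPUs this process may use now
    target = allowed[0]                        # lowest-numbered allowed CPU
    os.sched_setaffinity(0, {target})          # pid 0 means "this process"
    return target
```

Calling `pin_to_one_cpu()` before starting a timing loop keeps the scheduler from moving the work between cores, which is the point of the single-instance test suggested above.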
|
|
|
|
|
#7 |
|
Jan 2003
2·103 Posts |
Probably memory bottleneck... more and more cores sharing a limited amount of bandwidth. Even with the Q6600 quads, we see the performance level off once we put on the 3rd core. And with 2 xeons, you even double the number of cores on top of that.
Prime95 is unique in that the CPU optimisation has been done so well that once you exceed 3 cores, the memory is getting maxed out. Don't see this in F@H.
Last fiddled with by db597 on 2007-12-28 at 00:42 |
|
|
|
|
|
#8 |
|
Dec 2007
2²×7 Posts |
Yeah, most of you were right - memory bandwidth.
First, I had the motherboard wrong: it's a SuperMicro X7DCL-i, which uses the Intel 5100 controller. The bad thing about this controller is that it channels both CPUs' memory requests through a single memory path, so the memory bandwidth to each CPU is effectively halved. Fortunately, it seems as if newer motherboards use the Intel 5400 controller, which has two paths to RAM. I'm pretty sure this will totally cure my issue. Of course it will cost me a new motherboard...

For those who asked, I did do some single-thread benchmarks, which were a little better than the Core2 Duo benchmarks posted earlier. Also, I used 2x 2GB DDR2-667 memory sticks, the fastest supported by the motherboard.

Thanks for all the feedback. As far as I'm concerned, the issue is closed: I used a motherboard that more than halved the available memory bandwidth of the E5430 processors.

Finally, I used the new mprime255 to benchmark 1-8 threads:

[Thu Dec 27 19:52:47 2007]
Compare your results to other computers at http://www.mersenne.org/bench.htm
Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
CPU speed: 2660.72 MHz, 8 cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: 6144 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 256
Prime95 32-bit version 25.5, RdtscTiming=1
Best time for 768K FFT length: 15.942 ms.
Best time for 896K FFT length: 19.521 ms.
Best time for 1024K FFT length: 22.337 ms.
Best time for 1280K FFT length: 29.252 ms.
Best time for 1536K FFT length: 36.162 ms.
Best time for 1792K FFT length: 43.346 ms.
Best time for 2048K FFT length: 48.503 ms.
Best time for 2560K FFT length: 64.157 ms.
Best time for 3072K FFT length: 78.098 ms.
Best time for 3584K FFT length: 93.058 ms.
Best time for 4096K FFT length: 104.233 ms.
Best time for 5120K FFT length: 132.860 ms.
Best time for 6144K FFT length: 160.176 ms.
Best time for 7168K FFT length: 193.282 ms.
Best time for 8192K FFT length: 212.017 ms.

Timing FFTs using 2 threads.
Best time for 768K FFT length: 8.467 ms.
Best time for 896K FFT length: 10.374 ms.
Best time for 1024K FFT length: 12.455 ms.
Best time for 1280K FFT length: 15.679 ms.
Best time for 1536K FFT length: 19.330 ms.
Best time for 1792K FFT length: 23.183 ms.
Best time for 2048K FFT length: 26.141 ms.
Best time for 2560K FFT length: 34.424 ms.
Best time for 3072K FFT length: 42.137 ms.
Best time for 3584K FFT length: 49.854 ms.
Best time for 4096K FFT length: 56.262 ms.
Best time for 5120K FFT length: 72.103 ms.
Best time for 6144K FFT length: 86.699 ms.
Best time for 7168K FFT length: 104.224 ms.
Best time for 8192K FFT length: 115.461 ms.

Timing FFTs using 3 threads.
Best time for 768K FFT length: 10.528 ms.
Best time for 896K FFT length: 12.265 ms.
Best time for 1024K FFT length: 18.825 ms.
Best time for 1280K FFT length: 16.041 ms.
Best time for 1536K FFT length: 19.485 ms.
Best time for 1792K FFT length: 23.058 ms.
Best time for 2048K FFT length: 26.033 ms.
Best time for 2560K FFT length: 33.656 ms.
Best time for 3072K FFT length: 40.724 ms.
Best time for 3584K FFT length: 47.987 ms.
Best time for 4096K FFT length: 54.553 ms.
Best time for 5120K FFT length: 68.906 ms.
Best time for 6144K FFT length: 86.584 ms.
Best time for 7168K FFT length: 99.504 ms.
Best time for 8192K FFT length: 112.586 ms.

Timing FFTs using 4 threads.
Best time for 768K FFT length: 9.835 ms.
Best time for 896K FFT length: 10.840 ms.
Best time for 1024K FFT length: 16.450 ms.
Best time for 1280K FFT length: 12.603 ms.
Best time for 1536K FFT length: 15.426 ms.
Best time for 1792K FFT length: 18.184 ms.
Best time for 2048K FFT length: 20.463 ms.
Best time for 2560K FFT length: 26.485 ms.
Best time for 3072K FFT length: 32.119 ms.
Best time for 3584K FFT length: 37.634 ms.
Best time for 4096K FFT length: 43.152 ms.
Best time for 5120K FFT length: 53.849 ms.
Best time for 6144K FFT length: 65.468 ms.
Best time for 7168K FFT length: 78.517 ms.
Best time for 8192K FFT length: 88.147 ms.

Timing FFTs using 5 threads.
Best time for 768K FFT length: 10.280 ms.
Best time for 896K FFT length: 11.450 ms.
Best time for 1024K FFT length: 17.234 ms.
Best time for 1280K FFT length: 11.790 ms.
Best time for 1536K FFT length: 13.872 ms.
Best time for 1792K FFT length: 16.275 ms.
Best time for 2048K FFT length: 18.288 ms.
Best time for 2560K FFT length: 23.749 ms.
Best time for 3072K FFT length: 28.409 ms.
Best time for 3584K FFT length: 33.396 ms.
Best time for 4096K FFT length: 38.152 ms.
Best time for 5120K FFT length: 47.313 ms.
Best time for 6144K FFT length: 57.611 ms.
Best time for 7168K FFT length: 68.433 ms.
Best time for 8192K FFT length: 77.099 ms.

Timing FFTs using 6 threads.
Best time for 768K FFT length: 9.409 ms.
Best time for 896K FFT length: 10.248 ms.
Best time for 1024K FFT length: 15.880 ms.
Best time for 1280K FFT length: 11.403 ms.
Best time for 1536K FFT length: 13.095 ms.
Best time for 1792K FFT length: 14.730 ms.
Best time for 2048K FFT length: 16.522 ms.
Best time for 2560K FFT length: 20.779 ms.
Best time for 3072K FFT length: 25.392 ms.
Best time for 3584K FFT length: 29.921 ms.
Best time for 4096K FFT length: 34.275 ms.
Best time for 5120K FFT length: 42.664 ms.
Best time for 6144K FFT length: 51.845 ms.
Best time for 7168K FFT length: 60.954 ms.
Best time for 8192K FFT length: 70.019 ms.

Timing FFTs using 7 threads.
Best time for 768K FFT length: 9.708 ms.
Best time for 896K FFT length: 10.763 ms.
Best time for 1024K FFT length: 16.670 ms.
Best time for 1280K FFT length: 11.166 ms.
Best time for 1536K FFT length: 12.901 ms.
Best time for 1792K FFT length: 14.920 ms.
Best time for 2048K FFT length: 16.754 ms.
Best time for 2560K FFT length: 20.669 ms.
Best time for 3072K FFT length: 24.509 ms.
Best time for 3584K FFT length: 28.760 ms.
Best time for 4096K FFT length: 32.811 ms.
Best time for 5120K FFT length: 40.536 ms.
Best time for 6144K FFT length: 48.947 ms.
Best time for 7168K FFT length: 58.237 ms.
Best time for 8192K FFT length: 67.027 ms.

Timing FFTs using 8 threads.
Best time for 768K FFT length: 9.306 ms.
Best time for 896K FFT length: 10.199 ms.
Best time for 1024K FFT length: 15.993 ms.
Best time for 1280K FFT length: 11.299 ms.
[Thu Dec 27 19:57:48 2007]
Best time for 1536K FFT length: 12.763 ms.
Best time for 1792K FFT length: 14.494 ms.
Best time for 2048K FFT length: 16.334 ms.
Best time for 2560K FFT length: 20.261 ms.
Best time for 3072K FFT length: 24.059 ms.
Best time for 3584K FFT length: 28.201 ms.
Best time for 4096K FFT length: 31.842 ms.
Best time for 5120K FFT length: 39.215 ms.
Best time for 6144K FFT length: 46.868 ms.
Best time for 7168K FFT length: 55.041 ms.
Best time for 8192K FFT length: 62.832 ms.

Best time for 58 bit trial factors: 3.884 ms.
Best time for 59 bit trial factors: 3.866 ms.
Best time for 60 bit trial factors: 3.840 ms.
Best time for 61 bit trial factors: 3.871 ms.
Best time for 62 bit trial factors: 6.580 ms.
Best time for 63 bit trial factors: 6.589 ms.
Best time for 64 bit trial factors: 6.054 ms.
Best time for 65 bit trial factors: 6.016 ms.
Best time for 66 bit trial factors: 6.021 ms.
Best time for 67 bit trial factors: 6.010 ms. |
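To put a number on how the multi-thread timings level off, here is a small sketch that computes speedup and per-thread efficiency for the 2048K FFT row of the log above. The eight timings are copied from that log; the function name and the choice of the 2048K row are just for illustration.

```python
# Threads -> best time in ms for the 2048K FFT, copied from the mprime log above.
times_2048k = {
    1: 48.503, 2: 26.141, 3: 26.033, 4: 20.463,
    5: 18.288, 6: 16.522, 7: 16.754, 8: 16.334,
}


def scaling(times):
    """Map thread count -> (speedup over 1 thread, efficiency per thread)."""
    base = times[1]
    return {n: (base / t, base / (t * n)) for n, t in sorted(times.items())}


for threads, (speedup, efficiency) in scaling(times_2048k).items():
    print(f"{threads} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Two threads get close to a 2x speedup, but eight threads stay just under 3x, which is the shape you would expect if the cores are waiting on shared memory bandwidth rather than on arithmetic.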
|
|
|
|
|
#9 | ||
|
(loop (#_fork))
Feb 2006
Cambridge, England
2×7×461 Posts |
Quote:
Quote:
|
||
|
|
|
|
|
#10 | |
|
Dec 2007
2²·7 Posts |
Quote:
|
|
|
|
|
|
|
#11 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
2×7×461 Posts |
If going to the faster motherboard made the machine run mprime twice as quickly, would you mind if it drew 408W?
The faster motherboard and FBDIMMs will draw more power, particularly at idle (I'd guess it might be as bad as 160W idle and 300W flat-out); but that's not particularly important if this is a machine whose goal in life is to run eight mprimes 24/7, and which will be idle only by unfortunate accident. I'm slightly wondering why in that case you chose to use a hard disc rather than having the OS on a USB stick; prime95 doesn't use much disc space. 100W is 2.4 kWh a day, which is 20p a day from my quite expensive electricity supplier; seventy pounds a year, not really significant given the cost of quad-core Xeons.
Last fiddled with by fivemack on 2007-12-28 at 22:24 |
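The electricity arithmetic in the post can be reproduced with a few lines. This is only a sketch: the unit price is not stated in the post, so it is back-calculated here from the "2.4 kWh a day" and "20p a day" figures, and the names are invented for illustration.

```python
def daily_energy_kwh(watts, hours=24.0):
    """Energy used by a constant load over the given number of hours, in kWh."""
    return watts * hours / 1000.0


# Assumption: price implied by the post's own figures (20p for 2.4 kWh).
PENCE_PER_KWH = 20.0 / 2.4

extra_watts = 100.0
daily_kwh = daily_energy_kwh(extra_watts)
daily_pence = daily_kwh * PENCE_PER_KWH
yearly_pounds = daily_pence * 365 / 100.0

print(f"{extra_watts:.0f} W for 24 h = {daily_kwh:.1f} kWh")
print(f"cost: {daily_pence:.0f}p per day, about {yearly_pounds:.0f} pounds per year")
```

At that implied price, a year of an extra 100W comes to about 73 pounds, in line with the "seventy pounds a year" estimate.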
|
|
|