mersenneforum.org  

2007-05-17, 11:36   #12
fivemack
(loop (#_fork))
Feb 2006
Cambridge, England
6424₁₀ Posts

Quote:
Originally Posted by ATH
Yes, but Barcelona/Phenom is based on 65nm. At the end of 2007 Intel should release its Penryn processor, a true quad core based on 45nm technology.

http://www.pcworld.com/article/id,13...l/article.html
'true quad core' is a marketing term invented by AMD. Intel uses two dual-core dice at 65nm, and will use two dual-core dice at 45nm, because it's a little cheaper to manufacture two small dice and put them in an expensive package than to manufacture one large die.

For Prime95-like jobs where there is nothing shared between the four cores, the slightly faster inter-core communications on AMD's chip are irrelevant; generally if you're constrained by inter-core communication latency you've designed your software poorly.
2007-05-17, 12:00   #13
ATH
Einyen
Dec 2003
Denmark
2·1,579 Posts

From an old QX6700 review: http://techreport.com/reviews/2006q4...0/index.x?pg=1

Quote:
As a multi-chip package, the QX6700 contains two copies of a relatively well-integrated dual-core design. The two cores on each chip share a 4MB L2 cache between them, complete with dynamic partitioning and the ability to hand off ownership of data from one core to the next. Unfortunately, the integration between the QX6700's two chips is less than ideal.

Although they occupy the same package, their only means of communication is the system's front-side bus. The two chips must coordinate to ensure the sanity of the contents of their respective L2 caches via this bus. That will sometimes mean writing modified data out of one chip's cache into main memory and then reading it back into the other chip's cache—a positively eternal operation in CPU time. Both chips use this same bus to talk with the rest of the system, including main memory and I/O devices. Also, the presence of three electrical loads on the bus—two CPU chips and the core-logic chipset's north bridge—complicates matters.

And from an AMD Barcelona preview: http://www.infoworld.com/article/07/...cgd=2007-02-08

Quote:
Unlike Intel’s Core, Barcelona gives each core dedicated L2 cache, and Barcelona incorporates a redesign that reduces cache latency (access delays). Barcelona adds Level 3 cache, a newcomer to the x86 and a page out of IBM’s POWER playbook. All four CPU cores in a Barcelona socket will share a single master catalog of recently-retrieved data. A three-level cache is a must-have for a multicore CPU, and that becomes obvious when you get a demo that switches L3 on and off.

Last fiddled with by ATH on 2007-05-17 at 12:03
2007-05-17, 13:54   #14
R.D. Silverman
Nov 2003
16444₈ Posts

Quote:
Originally Posted by fivemack
Well, yes; on the other hand, to buy a Q6600 quad-core processor for $530 now when there is universal consensus that the price will be $266 from July 23rd onwards seems other than completely wise. The unknowable future is one thing, the well-trailed roadmap another.
Time is money. An extra $260 is a pittance when you need the capability NOW.
Some of us have larger price elasticities.

Last fiddled with by WraithX on 2016-02-15 at 05:04
2007-05-18, 07:51   #15
T.Rex
Feb 2004
France
2²·229 Posts

Quote:
Originally Posted by fivemack
'true quad core' is a marketing term invented by AMD. Intel uses two dual-core dice at 65nm, and will use two dual-core dice at 45nm, because it's a little cheaper to manufacture two small dice and put them in an expensive package than to manufacture one large die.

For Prime95-like jobs where there is nothing shared between the four cores, the slightly faster inter-core communications on AMD's chip are irrelevant; generally if you're constrained by inter-core communication latency you've designed your software poorly.
On 15 April, Cruelty made some very interesting and useful measurements with prime95 v25.2, which is multi-threaded, on a Core2Quad running at 3 GHz, with 2048K and 8192K FFTs.
The figure shows that the processor does not scale. For the 8192K FFT, 2 cores give a speedup factor of 2, while 4 cores give 2.6, meaning that you buy 2 more cores and get only about half of one.
I don't know how much the multi-threading in 25.2 costs compared to running 4 instances of prime95, given that the 4 threads share memory, but it should not be that large.
So, wait for the AMD machine to see whether AMD's quad core is better than Intel's (false) quad core. Or collect some performance data with 1, 2, 3 and 4 instances of non-multi-threaded prime95.
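To make the scaling arithmetic explicit, here is a minimal Python sketch that turns the two quoted speedup factors (2 for 2 cores, 2.6 for 4 cores, 8192K FFT) into parallel efficiency and gain per added core. The function name and the 1-core baseline of 1.0 are assumptions made for illustration; only the two speedup figures come from the post.

```python
def scaling_report(speedups):
    """Return (cores, speedup, efficiency, gain_per_added_core) rows.

    speedups maps a core count to the measured speedup relative to one core.
    The 1-core baseline (speedup 1.0) is assumed, not measured here.
    """
    rows = []
    prev_cores, prev_speedup = 1, 1.0
    for cores in sorted(speedups):
        speedup = speedups[cores]
        efficiency = speedup / cores  # fraction of ideal linear scaling
        gain = (speedup - prev_speedup) / (cores - prev_cores)
        rows.append((cores, speedup, efficiency, gain))
        prev_cores, prev_speedup = cores, speedup
    return rows


# Speedup factors quoted in the post for the 8192K FFT.
MEASURED = {2: 2.0, 4: 2.6}
REPORT = scaling_report(MEASURED)

for cores, speedup, efficiency, gain in REPORT:
    print(f"{cores} cores: speedup {speedup:.1f}, "
          f"efficiency {efficiency:.0%}, gain per added core {gain:.2f}")
```

With these inputs, the third and fourth cores each add about 0.3 of a core, which is where the "2 more cores for about half of one" figure comes from.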

Tony
Attached thumbnail: 4-Threads.GIF (6.4 KB)
2007-05-18, 09:59   #16
S485122
"Jacob"
Sep 2006
Brussels, Belgium
3255₈ Posts

I have published some benchmarks with two Core2Quad configurations: one with the P965 chipset, the other with an nVidia 650i chipset. Cruelty's numbers are for an nVidia chipset. In my experience running Prime95 24.14, an Intel quad is worth 3 cores (see the thread "Quad Core and P95"). Using the nVidia chipset, a quad is worth 2 cores. All timings relate to LL tests with a 1536K FFT.
2007-05-18, 13:28   #17
Cruelty
May 2005
2³·7·29 Posts

Quote:
Originally Posted by S485122
Cruelty's numbers are for a nVidia chipset.
Not anymore, I have used i975x platforms for those benchmarks, see here, here and here.
2007-05-18, 15:00   #18
S485122
"Jacob"
Sep 2006
Brussels, Belgium
1,709 Posts

Quote:
Originally Posted by Cruelty
Not anymore, I have used i975x platforms for those benchmarks.
OK, the conclusion is then that a quad core is worth 3 cores with the P965 chipset, 2.6 with the 975 and only 2 cores with the nVidia chipsets. Since memory is the bottleneck, memory capable of running 2:1, that is 8500 or 1067 MHz, is a must.
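As a rough check on how "8500" and "1067 MHz" relate, the sketch below computes the theoretical peak transfer rate of a memory module. It assumes a standard 64-bit DIMM data bus and an effective rate of about 1066.67 million transfers per second; the function name and both defaults are illustrative assumptions, not figures from the benchmarks in this thread.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in MB/s (1 MB = 10**6 bytes).

    Assumes every transfer moves bus_width_bits of data on each channel;
    sustained real-world throughput is lower.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer * channels


# Assumed effective data rate for "1067 MHz" memory.
SINGLE_CHANNEL = peak_bandwidth_mb_s(1066.67)
DUAL_CHANNEL = peak_bandwidth_mb_s(1066.67, channels=2)

print(f"single channel: {SINGLE_CHANNEL:.0f} MB/s")
print(f"dual channel:   {DUAL_CHANNEL:.0f} MB/s")
```

Under these assumptions one channel peaks at roughly 8.5 GB/s, which matches the "8500" rating, and that figure is an upper bound shared by all four cores.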

Last fiddled with by S485122 on 2007-05-18 at 15:02
2007-05-18, 15:05   #19
T.Rex
Feb 2004
France
2²×229 Posts

Quote:
Originally Posted by S485122
Since memory is the bottleneck, memory capable of running 2:1, that is 8500 or 1067 MHz, is a must.
Is the bottleneck the memory bus speed, or the memory controllers (one controller for each core instead of sharing one between 2 or 4 cores)?
T.
2007-05-18, 17:23   #20
S485122
"Jacob"
Sep 2006
Brussels, Belgium
1,709 Posts

As far as I was able to see, the bottleneck is the memory controller or bus: I tried a higher FSB speed, keeping the processor speed the same by changing the multiplier, and the iteration times increased.

There is only one memory controller and memory bus for the four cores. I suppose the quad Xeons will have the same problem.

I hope for GIMPS's sake that AMD increases its memory bus speed enough to provide the data rates Prime95 needs in their coming "true Quad".
2007-05-18, 21:21   #21
T.Rex
Feb 2004
France
2²×229 Posts

Quote:
Originally Posted by S485122
There is only one memory controller and memory bus for the four cores. I suppose the quad Xeons will have the same problem.
The AMD "Phenom" has: "integrated DDR2 memory controller, a shared L3 cache, the company's HyperTransport technology links, 128-bit floating point units, separate L2s and L1s for each core". I think Intel's memory controller is outside the cores.
T.
2007-05-18, 22:12   #22
jasong
"Jason Goatcher"
Mar 2005
DB3₁₆ Posts

Pardon me if this is an ignorant question; I'm not totally sure I understand all of the thread. If Prime95 processes a lot of data (though it becomes unimportant a few hundredths of a second later), and therefore has a bottleneck, would it not be better to seek out a worthy low-bandwidth project and strategically distribute that project, along with Prime95, among the cores? I'm thinking that, since a Core 2 quad chip is basically two dual-cores, each dual-core could run one instance of Prime95 and one instance of a different program.
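The placement suggested here can be sketched as a small planning function: one bandwidth-heavy job and one light job per dual-core die. This is only an illustration. The core numbering (cores 0 and 1 on one die, 2 and 3 on the other) and the job names are hypothetical, and actual numbering depends on the operating system and BIOS.

```python
from typing import Dict, List


def plan_affinity(dies: List[List[int]], heavy: str, light: str) -> Dict[int, str]:
    """Map each core id to a job name, one heavy and one light job per die."""
    plan: Dict[int, str] = {}
    for cores in dies:
        if len(cores) != 2:
            raise ValueError("this sketch assumes exactly two cores per die")
        plan[cores[0]] = heavy  # bandwidth-hungry job, e.g. Prime95
        plan[cores[1]] = light  # low-bandwidth companion job
    return plan


# Hypothetical layout: cores 0-1 share the first die, cores 2-3 the second.
DIES = [[0, 1], [2, 3]]
PLAN = plan_affinity(DIES, heavy="prime95", light="low-bandwidth-job")

for core, job in sorted(PLAN.items()):
    print(f"core {core}: {job}")
```

On Linux, a plan like this could be applied to running processes with `os.sched_setaffinity(pid, {core})`; other systems need their own affinity tools. Whether it helps would have to be measured, since all four cores still share one memory controller and bus.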

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.