mersenneforum.org


nucleon 2003-07-02 16:05

Prescott impact to prime95.
 
Guys,

Thought I'd start a thread on the new Intel processor core coming up. (My apologies if this has been brought up earlier.)

Anyone have any tech details on the new processor? I've heard that there are new instructions codenamed "PNI", described in marketing speak such as "to accelerate video encoding and improve thread synchronization" (taken from Tom's Hardware Guide).

I've also heard that the L2 and L1 data caches are being doubled, and that the core frequency is going to be 3.4-5GHz.

Anyone have any decent web pages on the Prescott core?

-- Craig

TauCeti 2003-07-02 16:35

I think:

http://www.chip-architect.com/
http://www.hwextreme.com/articles/prescott/
http://cedar.intel.com/media/pdf/pniwithopcodes.pdf

have some good information.

E_tron 2003-07-02 19:05

I read about the Prescott core a few months ago, and apparently this PNI code was formally called SSE3 in many of the articles I looked at :? . I think we will start to see it as the Pentium V in Octoberish? Let's not forget the Athlon64. It could pack a wallop as well 8) .

Maybe we should all e-mail Intel and get George one of these Prescott CPUs ASAP (so as not to waste the new optimizations :D ).

Prime95 2003-07-02 20:02

I looked at the PNI instructions several months ago and did not see anything prime95 could use :(

QuintLeo 2003-07-03 13:26

"SSE3" is mostly marketing hype on the part of Intel - it does NOT add much over SSE2, and seems to mostly be a response to the addition of SSE2 support to the Opteron and Athlon64 AMD CPU lines.


I'm going to be REAL curious to see if Athlon64 CPU clock speeds ramp up quickly - for Prime work, they'll need to get a LOT closer to P-IV clock speed for the SSE2 support to help much. For everything else, though, it looks like Opteron is ALREADY pretty competitive - and when the 64-bit native code applications start rolling in, Itanic is gonna be pretty much a dead issue (it's ALREADY hurting badly)....

ColdFury 2003-07-03 19:22

[quote]looks like Opteron is ALREADY pretty competitive - and when the 64-bit native code applications start rolling in, Itanic is gonna be pretty much a dead issue (it's ALREADY hurting badly)....[/quote]

Itanium isn't meant to compete with Opteron, more with IBM's Power series. Besides, the newest Itanium just announced has the highest SPEC score in FP.

QuintLeo 2003-07-05 02:51

Itanium is older than Opteron - but they've been failing as it is to compete against Power and Sun.

Opteron is just most likely to be the CPU series that finally sinks the Itanic.

And yes, the higher-end Opterons ARE in fact targeted at the Itanium - and already beat it on a lot of things, even though the Itanic is in its 3rd family revision already....

ebx 2003-07-05 08:37

[quote="QuintLeo"]Itanium is older than Opteron - but they've been failing as it is to compete against Power and Sun.

Opteron is just most likely to be the CPU series that finally sinks the Itanic.
[/quote]

Are you sure? Itanium is arguably behind Power4 and way ahead of Sun and other 64-bit CPUs.

If Opteron is ever going to sink IA-64, it will be because of its backward compatibility and maybe prices. Never performance. AMD does not even put Opteron up against IA-64. All their comparisons are against Xeon.

QuintLeo 2003-07-06 06:11

Other folks have compared Opteron to Itanium - IIRC Opteron wins on integer, Itanium wins on floating point - might be the other way around - when clocked the same.

As it looks like Opteron is going to be able to ramp up speeds a LOT faster than Itanium has managed, well, I leave the math to you.


And Sun has sold more 64-bit machines IN ANY YEAR (other than the first year they sold them, PERHAPS) than Intel has managed TOTAL for ALL GENERATIONS of the Itanic combined.

Itanium is NOT succeeding against Sun by any measure - and Power4 (much less Power5) has been stomping it as well in sales.

ebx 2003-07-06 18:47

Market success and technical success are separate things. There are examples where advanced tech didn't translate into market success, like the Alpha chips. Opteron may sell well, but that doesn't change the fact that Itanium is FAR superior. Carrying 32-bit luggage isn't all that great when the focus shifts from marketing to architecture.

That Itanium doesn't come out at higher clocks is largely due to market conditions. I bet if Opteron goes head to head with Itanium and the market is there, we will see faster Itaniums sooner rather than later, just like what we have observed with P4/Athlon.

TauCeti 2003-07-06 20:37

The Itanium-2 is a brute with its L3-cronies. Besides the largely anticipated slow transformation of software to the Itanic EPIC architecture, the Itanium systems have imho one disadvantage worth mentioning: the low memory bandwidth in multi-processor systems. All Itaniums in a multi-CPU block (up to 4/8-way nowadays) share one system bus to access main memory.

This disadvantage is alleviated by the usage of Itanium's _large_ L3-caches and can also be improved with new external chipsets.

With Opteron's capability to scale to glueless 8-way systems (err - AMD _did_ mention that once, even if they now only talk about 4-ways), with every CPU having dedicated memory, and with the low-latency HyperTransport connections between the CPUs, i _can_ imagine workloads with huge datasets where the Itanium architecture is _not_ competitive with the Opterons in cluster environments.

So i really hope that AMD succeeds (that means: survives at all) with the Opteron and Athlon64 and stays competitive.

ColdFury 2003-07-06 21:02

[quote]This disadvantage is alleviated by the usage of Itanium's _large_ L3-caches[/quote]

How is this different than Power4's 128 MB L3?

TauCeti 2003-07-07 15:17

[quote="ColdFury"][quote]This disadvantage is alleviated by the usage of Itanium's _large_ L3-caches[/quote]

How is this different than Power4's 128 MB L3?[/quote]

Each Power4 processor has 2 cores sharing the L2 cache. 4 processors (8 cores) form a module. The 128MB L3 is used equally by all of the CPUs on that module.

Power4 L3 bandwidth is 'only' about 10GB/s (off-chip) compared to the 32GB/s of the Itanium-2 (on-chip L3).

Looking at SPECfp2000 and SPECint2000 the 6 MB L3 Itanium-2 performs actually better compared to a power4 in 'fp' and only slightly worse in 'int'.

Example:

SGI Altix 3000 Itanium-2 1.5GHz 6MB L3 SPECfp2000:2055, SPECint2000:1077

IBM eServer pSeries 690 Turbo Power4 1.7GHz SPECfp2000:1699, SPECint2000:1113

nomadicus 2003-07-07 20:43

[quote="TauCeti"]
Looking at SPECfp2000 and SPECint2000 the 6 MB L3 Itanium-2 performs actually better compared to a power4 in 'fp' and only slightly worse in 'int'.

Example:

SGI Altix 3000 Itanium-2 1.5GHz 6MB L3 Specfp2000:2055 SPECint2000:1077[/quote]

Intel says that they will achieve parity speeds with the Alpha chip in 2006 (give or take a year), but the latest EV7 1.1GHz Alpha chip (www.specbench.org) is rated
SPECfp2000:1482
SPECint2000:877

What's the deal? I take it that the SGI numbers are correct? So I am wondering if I got bad info, or am I misinterpreting these numbers?
john

ColdFury 2003-07-07 20:58

Actually, I've seen the SPECint of the new Itanium reported as high as 1300. Are those base or optimized scores?

[quote]Intel says that they will achieve parity speeds with the Alpha chip in 2006[/quote]

I don't recall any type of statement like that. Many chips were already competitive with Alpha when Intel bought the IP and such.

QuintLeo 2003-07-08 01:59

Over 2 years to achieve parity with a chip whose development has already been announced as pretty much dead?

Real impressive..... *NOT*



I do sometimes regret the death of DEC....

ColdFury 2003-07-08 04:03

[quote]Over 2 years to achieve parity with a chip whose development has already been announced as pretty much dead?
[/quote]

I'm confused, which chip do you mean? All the front line chips have surpassed the last Alpha chip by now.

TauCeti 2003-07-08 05:55

[quote="ColdFury"]Actually, I've seen the SPECint of the new itanium reported as high as 1300. Are those base or optimized scores?[/quote]

For the SGI system the base and peak values for SPECint are identical. Hmmm. Strange. For the power4 i posted the peak values. So the Itanium lead is even stronger. The lead also grows for tests with 8,16 or 64 CPU-Systems. LoL - i did not know the new Itanium-2 is _that_ fast.

I got the data here:

http://www.spec.org/osg/cpu2000/results/res2003q3/cpu2000-20030616-02232.html
http://www.spec.org/osg/cpu2000/results/res2003q3/cpu2000-20030616-02227.html
http://www.sgi.com/newsroom/press_releases/2003/june/altix_benchmarks.pdf

nomadicus 2003-07-08 12:52

[quote="ColdFury"]I'm confused, which chip do you mean?[/quote]
the 21364

[quote="QuintLeo"]Over 2 years to achieve parity with a chip whose development has already been announced as pretty much dead?

Real impressive..... *NOT*[/quote]
Yep. I knew I shouldn't have listened to the sales types.

[quote="QuintLeo"]I do sometimes regret the death of DEC....[/quote]
Me too . . .

ebx 2003-07-08 19:52

Freshly out: CPU with 64bit Architecture: Evolution or Revolution?
Goes back to intel 4004 days.

http://www.xbitlabs.com/articles/cpu/display/64bit.html

...

The major question, however, is not about the technical characteristics of the processors, but about the power behind them: are they ready to ensure strong support? From this point of view and also due to their technical potential, Itanium and Power4+ will definitely be among the leaders one day. The future of Athlon 64 and Opteron is not quite clear. Everything depends on their ability to use the 64bit potential to the full extent, which means that they will need 64bit operation systems and mass applications. And the situation here is far from good today. However, this is not at all surprising: it took Microsoft about 10 years to shift from 16bit to 32bit.

nomadicus 2003-07-09 15:28

[quote="ebx"]Freshly out: CPU with 64bit Architecture: Evolution or Revolution?
Goes back to intel 4004 days.

http://www.xbitlabs.com/articles/cpu/display/64bit.html[/quote]

I find it quite interesting that he notes the difference between servers/high-end workstations and the desktop. Servers need to efficiently address beyond 32 bits. Does the desktop need that? Probably, a ways into the future. With Intel abandoning the x86 architecture (well, not really, they will probably make P4's for a long time) and AMD extending it, it seems to me AMD could survive at the mass market desktop level.
Intel needs to compete at the 32bit level with AMD (couldn't AMD put SSE2 into the Athlon XP? No. I see AMD as a 64bit addressing processor running mainly 32bit apps and including SSE2, etc.). Intel has to compete with Power4+ at the 64bit level (both are completely different architectures from each other and from x86). It depends on what Microshaft does, though. 64bit support from that company will tip the scales :mad:

nucleon 2003-07-10 14:30

Isn't there a 64bit WinXP floating around?

But I think the AMD64 platform is more like a 48bit addressing processor. :) But I liked the way you phrased the expression :)

I think consumer processors will need to be able to access >4GB of memory within a time frame of around 2 years. (Yes, pulling those stats from where the sun isn't shining.)

Again it's games pushing the envelope more than anything. Planetside, an MMOFPS, was running on my machine using 700+ MB. There was a big difference for me going from 512MB to 1024MB of physical memory in the machine. I can very well imagine games 2+ years from now needing more than double that memory.

I think RAM prices are currently stalling the push to be able to access a larger RAM space. I think 1GB DIMMs are too expensive for consumer grade PCs at the moment.

ColdFury 2003-07-10 20:27

[quote]Isn't there a 64bit WinXP floating around?
[/quote]

I believe there's a version of Windows 2003 for IA-64 available from Microsoft. It's one of those things that are available "on request only", I believe.


[quote]But I think the AMD64 platform, is more like a 48bit addressing processor. [/quote]

Pretty much all processors are like that. Being able to address the full 64-bit memory space would create page tables that are much too unwieldy. Besides, no application is conceivably going to need that much memory space. It's not a problem because the architecture should be able to extend the virtual address space in later processors transparently.
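
To put rough numbers on the page-table point, here is a minimal sketch. It assumes 4 KiB pages, 8-byte page-table entries and 512-entry tables per level (the usual x86-64 long-mode layout); the function names are made up for illustration and are not from any real tool.

```python
PAGE_BITS = 12        # assumption: 4 KiB pages
ENTRY_BYTES = 8       # assumption: 8-byte page-table entries
BITS_PER_LEVEL = 9    # assumption: 512 entries per table level


def flat_table_bytes(va_bits):
    """Size of one flat table that maps every page of a va_bits-wide space."""
    return (1 << (va_bits - PAGE_BITS)) * ENTRY_BYTES


def levels_needed(va_bits):
    """Number of 512-entry levels needed to translate va_bits (ceiling division)."""
    return -(-(va_bits - PAGE_BITS) // BITS_PER_LEVEL)


for bits in (48, 64):
    gib = flat_table_bytes(bits) / 2**30
    print(f"{bits}-bit: flat table {gib:.0f} GiB, {levels_needed(bits)} levels")
```

Under these assumptions a 48-bit space needs four table levels, while a full 64-bit space would need six, so every extra address bit that nobody uses yet still costs lookup depth on a TLB miss.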

wfzelle 2003-07-10 21:02

[quote="nucleon"]But I think the AMD64 platform, is more like a 48bit addressing processor. :)[/quote]
True, but it's no big deal since you can't realistically use more than 281 terabytes (2^48 bytes) anyway (for the time being). With a doubling of memory every 18 months, we should last for another 24 years. I'd rather not pay more for extra pins and increased core sizes that won't be needed until 2028.

BTW, here you can find an article about x86-64 and 64-bit computing in general: [url]http://arstechnica.com/cpu/03q1/x86-64/x86-64-1.html[/url]
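
The arithmetic behind the 48-bit figure and the 24-year estimate can be checked with a few lines. This is only a back-of-the-envelope sketch: the 18-month doubling period is the assumption stated in the post, the starting point is taken to be the 4 GB limit of a 32-bit address space, and the function names are hypothetical.

```python
MONTHS_PER_DOUBLING = 18  # assumption taken from the post above


def address_space_bytes(bits):
    """Number of bytes addressable with the given number of address bits."""
    return 1 << bits


def years_until_exhausted(from_bits, to_bits, months_per_doubling=MONTHS_PER_DOUBLING):
    """Years needed to grow from 2**from_bits to 2**to_bits, one doubling per period."""
    doublings = to_bits - from_bits
    return doublings * months_per_doubling / 12


space = address_space_bytes(48)
print(f"{space / 1e12:.1f} TB (decimal), {space // 2**40} TiB (binary)")
print(f"{years_until_exhausted(32, 48):.0f} years from 32 to 48 bits")
```

Going from 32 to 48 bits is 16 doublings, which at 18 months each gives the 24 years quoted.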

wfzelle 2003-07-10 21:16

[quote="nomadicus"]I find it quite interesting that he notes the difference between servers/high-end workstations and the desktop. Servers needs to efficiently address beyond 32 bits. Does the desktop need that?[/quote]

I don't think that 1GB is that much anymore. Some prosumers are already pushing the 32-bit boundary (video editing, photoshopping, etc). It's better to move to 64-bit now and have a gentle transition (instead of seeing all kinds of hacks to 32-bit apps).

Besides, the biggest advantage of x86-64 is not that it is 64-bit. The ability to fix some of the flaws of the x86 ISA (e.g. few available registers) has a big effect on performance.

ebx 2003-07-11 04:53

[quote="wfzelle"]
Besides, the biggest advantage of x86-64 is not that it is 64-bit. The ability to fix some of the flaws of the x86 ISA (e.g. few available registers) has a big effect on performance.[/quote]

Why would it need a 64 bit processor to fix the shortage of registers?

wfzelle 2003-07-11 13:43

[quote="ebx"][quote="wfzelle"]
Besides, the biggest advantage of x86-64 is not that it is 64-bit. The ability to fix some of the flaws of the x86 ISA (e.g. few available registers) has a big effect on performance.[/quote]

Why would it need a 64 bit processor to fix the shortage of registers?[/quote]
Because you need to convince developers to recompile their applications (possibly having to clean up their code) and ship multiple binaries. A fairly small player like AMD needs both the 64-bit fans* and the performance freaks to make x86-64 a success. The bigger the improvement, the greater is the chance that it will be adopted. It's telling that we've had to deal with very few registers for this long. It's clear that it is very difficult to achieve ISA improvements.

*Mostly server, workstation and (beowulf) cluster folk.

ebx 2003-07-11 16:24

So it is not really an x86-64 advantage. AMD is merely using this chance to add more registers.

Yeah, IA-32 is hard to improve. That is why Intel defined an all-new IA-64 architecture.

ColdFury 2003-07-11 21:13

I really wish AMD moved to a 3 operand format with x86-64, but I guess that would have required too much redesign of the decoders.

S3SJK 2003-07-12 09:02

Out of interest, what performance advantages would that give?

nomadicus 2003-07-12 15:19

Three operands would combine two instructions into one.
Think of doing c=a+b
A two-operand format would have to temporarily put the result (a+b) in a register, then execute another instruction to store it into c. A three-operand format would perform the addition (a+b) and place the result straight into c.
I've oversimplified the example, since different chip architectures (I was thinking of the VAX chip with its three-operand format) will do it differently, but I hope you get the idea.
A three-operand format is more complex than a two-operand one, which can be a good thing or a bad thing depending on the chip's architecture goals.

ebx 2003-07-13 00:00

[quote="ColdFury"]I really wish AMD moved to a 3 operand format with x86-64, but I guess that would have required too much redesign of the decoders.[/quote]

If they did, it would not have been x86 any more.

ewmayer 2003-07-14 21:12

[quote="nomadicus"][quote="TauCeti"]
Looking at SPECfp2000 and SPECint2000 the 6 MB L3 Itanium-2 performs actually better compared to a power4 in 'fp' and only slightly worse in 'int'.

Example:

SGI Altix 3000 Itanium-2 1.5GHz 6MB L3 Specfp2000:2055 SPECint2000:1077[/quote]

Intel says that they will achieve parity speeds with the Alpha chip in 2006 (give or take a year), but the latest EV7 1.1GHz Alpha chip (www.specbench.org) is rated
SPECfp2000:1482
SPECint2000:877

What's the deal? I take it that the SGI numbers are correct? So I am wondering if I got bad info, or am I misinterpreting these numbers?
john[/quote]

Perhaps Intel is referring to parity in a per-clock-cycle sense.
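
A quick way to test the per-clock reading is to divide the scores quoted in this thread by the clock speed. This is a rough sketch only: the figures are the ones posted above, they mix base and peak results as noted earlier in the thread, and `per_ghz` is just an illustrative helper.

```python
# (system, clock in GHz, SPECfp2000, SPECint2000), all as quoted in the thread
scores = [
    ("Itanium-2 (SGI Altix 3000)", 1.5, 2055, 1077),
    ("Power4 (IBM p690 Turbo)", 1.7, 1699, 1113),
    ("Alpha EV7", 1.1, 1482, 877),
]


def per_ghz(score, ghz):
    """Benchmark score normalized to a 1 GHz clock (a crude per-cycle measure)."""
    return score / ghz


for name, ghz, fp, integer in scores:
    print(f"{name}: fp/GHz = {per_ghz(fp, ghz):.0f}, int/GHz = {per_ghz(integer, ghz):.0f}")
```

On these numbers the Itanium-2 and the EV7 come out within a couple of percent of each other in floating point per GHz, while the EV7 is roughly a tenth ahead per GHz in integer, so a per-clock interpretation would at least be consistent with the data.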

Dresdenboy 2003-07-21 06:50

[quote="ColdFury"]I really wish AMD moved to a 3 operand format with x86-64, but I guess that would have required too much redesign of the decoders.[/quote]

You know it was planned (maybe with even more regs than the 16 XMM regs), and I don't think it would have been more complex to decode than SSE2. Maybe the Athlon/Opteron RISC cores already use such an instruction format for their 88-register file. That would have made it very easy to decode. But it's better to make existing applications faster (with optimized SSE2 code and poor x87 code) than to require a recompile and 64bit mode to run.

[quote="S3SJK"]Out of interest what performance advantages would that give?[/quote]

We want to calculate a=x²+4x
Just look at this pseudo code (for a RISC-like architecture such as the execution cores of x86 CPUs):
[code]// format is instruction dest, source
fpload r0, [x]
fpload r1, [const_4]
fpmov r2, r0 ;we need it later again
fpmul r0, r0 ; x²
fpmul r2, r1 ; 4x
fpadd r0, r2 ; x²+4x
fpstore [a], r0
[/code]

With 3 operands it would maybe look like this:
[code]// format is instruction dest, source1, source2
fpload r0, [x]
fpload r1, [const_4]
fpmul r2, r0, r0 ; x²
fpmul r1, r0, r1 ; 4x
fpadd r3, r1, r2 ; x²+4x
fpstore [a], r3
[/code]

We saved one instruction in this simple calculation. But if you look at complex SSE2 or also x87 code you'll see a lot of shuffling, moving and saving of registers (so that they don't get destroyed all the time). While x86 CPUs have to move and save, the other CPUs (Alpha, Power, even G5) continue to calculate.

DDB
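
The two listings above can be checked with a toy interpreter. This is only a sketch: the mnemonics are the pseudo code from the post, not a real ISA, and the `run` function and its tuple encoding are invented here for illustration.

```python
def run(program, memory):
    """Execute (op, dest, *sources) tuples on a toy register machine."""
    regs = {}
    mem = dict(memory)

    def val(operand):
        # Operands in brackets are memory locations, anything else is a register.
        return mem[operand[1:-1]] if operand.startswith("[") else regs[operand]

    for op, dest, *src in program:
        if op in ("fpload", "fpmov"):
            regs[dest] = val(src[0])
        elif op == "fpstore":
            mem[dest[1:-1]] = val(src[0])
        elif op in ("fpmul", "fpadd"):
            # Two-operand form: dest doubles as the first source and is overwritten.
            a, b = (val(dest), val(src[0])) if len(src) == 1 else (val(src[0]), val(src[1]))
            regs[dest] = a * b if op == "fpmul" else a + b
        else:
            raise ValueError(f"unknown op {op}")
    return mem, len(program)


two_operand = [
    ("fpload", "r0", "[x]"), ("fpload", "r1", "[const_4]"),
    ("fpmov", "r2", "r0"),            # extra copy, because fpmul destroys r0
    ("fpmul", "r0", "r0"), ("fpmul", "r2", "r1"),
    ("fpadd", "r0", "r2"), ("fpstore", "[a]", "r0"),
]
three_operand = [
    ("fpload", "r0", "[x]"), ("fpload", "r1", "[const_4]"),
    ("fpmul", "r2", "r0", "r0"), ("fpmul", "r1", "r0", "r1"),
    ("fpadd", "r3", "r1", "r2"), ("fpstore", "[a]", "r3"),
]

memory = {"x": 3.0, "const_4": 4.0}
mem2, n2 = run(two_operand, memory)
mem3, n3 = run(three_operand, memory)
print(mem2["a"], n2)
print(mem3["a"], n3)
```

Both sequences store the same value for a; the only difference is the instruction count, which comes from the register copy the two-operand form needs.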

ebx 2003-07-22 03:39

I got a better compiler for a=x²+4x:

// format is instruction dest, source
fpload r0, [x]
fpmov r1, r0 ;we need it later again
fpadd r0, [const_4] ;x+4
fpmul r1, r0 ; x²+4x
fpstore [a], r1

Moving data between regs is the fastest instruction. It can't compare to fpmul. Load/store to memory is usually more than one instruction. That further brings down the weight of fpmov.

2 operand vs 3 operand is a long debate. There isn't any clear winner.
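
The shorter sequence works because x²+4x factors as x(x+4), which trades one multiply for nothing. A minimal sketch of that identity follows; the function names are made up, and the sample values are chosen so that both forms are exact in binary floating point (in general the two forms can round differently).

```python
def direct(x):
    # x*x + 4*x: two multiplies and one add
    return x * x + 4 * x


def factored(x):
    # x*(x + 4): one add and one multiply
    return x * (x + 4)


for x in (-2.5, 0.0, 3.0, 10.0):
    assert direct(x) == factored(x)
    print(x, factored(x))
```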

Dresdenboy 2003-07-22 06:02

[quote="ebx"]I got a better compiler for a=x²+4x:

// format is instruction dest, source
fpload r0, [x]
fpmov r1, r0 ;we need it later again
fpadd r0, [const_4] ;x+4
fpmul r1, r0 ; x²+4x
fpstore [a], r1

Moving data between regs is the fastest instruction. It can't compare to fpmul. Load/store to memory is usually more than one instruction. That further brings down the weight of fpmov.

2 operand vs 3 operand is a long debate. There isn't any clear winner.[/quote]

You are right. I didn't optimize my code, just wanted to show the difference. And because I was thinking about a RISC architecture when creating this example, I also didn't count on memory operands for fpadd/fpmul.

At least by using a simple addressing mode the load/store can be handled easily by the hardware.

IMO the advantage of 3 operand instructions is that you may use a different destination register or just one of the sources, whichever fits best for the algorithm, while with 2 operands you are always required to overwrite one of the sources. And the disadvantage is that the opcode needs additional bits for addressing the third register.

DDB
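
The encoding cost mentioned above is easy to quantify. This sketch assumes each register operand takes a plain binary field wide enough to name every register; real encodings (x86 ModRM bytes, prefixes, and so on) are messier, and `register_field_bits` is a hypothetical helper.

```python
def register_field_bits(num_registers, operands):
    """Bits spent on register specifiers in one instruction."""
    bits_per_register = (num_registers - 1).bit_length()
    return bits_per_register * operands


for regs in (8, 16, 32):
    two = register_field_bits(regs, 2)
    three = register_field_bits(regs, 3)
    print(f"{regs} registers: 2-operand {two} bits, 3-operand {three} bits")
```

With 16 registers the third operand costs 4 extra bits per instruction, which is cheap in a fixed 32-bit RISC encoding but awkward to fit into an existing byte-oriented one.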

