#1
10110110011₂ Posts
Is there any performance advantage from running on an Opteron? Would the software have to be rewritten or recompiled to get an advantage from the 64-bitness? Are there plans to do this? Thanks.
Andy
#2
Banned
"Luigi"
Aug 2002
Team Italia
3²·5·107 Posts
Not yet to all your questions :-)
Luigi
#3
Sep 2002
128₁₆ Posts
As far as I've heard, no, because the "64-bitness" doesn't really affect Prime95 that much, since it relies on floating-point operations. It may help indirectly, because other things that do take advantage of the 64-bit CPU (like the OS and other apps) will require less of its time, leaving more time for Prime95.
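To illustrate why the work is floating-point bound: multiplying very large numbers this way is done with floating-point FFTs, so the width of the integer registers matters little. Below is a minimal Python sketch of the idea only. It is not Prime95's code (which is hand-tuned assembly using a more specialised transform), and the names `fft` and `multiply` are made up for the example.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor: all of this arithmetic is floating point.
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def multiply(x, y):
    """Multiply two non-negative integers via a floating-point FFT convolution."""
    # Decimal digits, least significant first.
    dx = [int(c) for c in reversed(str(x))]
    dy = [int(c) for c in reversed(str(y))]
    n = 1
    while n < len(dx) + len(dy):
        n *= 2
    fx = fft([complex(d) for d in dx] + [0j] * (n - len(dx)))
    fy = fft([complex(d) for d in dy] + [0j] * (n - len(dy)))
    # Pointwise product in the frequency domain, then transform back.
    prod = fft([p * q for p, q in zip(fx, fy)], invert=True)
    # The unnormalised inverse needs a division by n; round away float error.
    coeffs = [round(c.real / n) for c in prod]
    # Weighting each coefficient by its power of ten also handles the carries.
    return sum(c * 10**i for i, c in enumerate(coeffs))

print(multiply(123456789, 987654321) == 123456789 * 987654321)
```

The only integer work here is splitting into digits and the final carry step; everything in between is floating-point multiplies and adds, which is the part a faster FPU or SSE2 unit speeds up.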
#4
Oct 2002
Lost in the hills of Iowa
700₈ Posts
George does not have access to an Opteron to do any enhancements for it.
Based on existing benchmark testing I've seen around the web, the existing SSE2 code DOES work well with an Opteron, but is still somewhat slower than a P-IV due to relative clock rates, though fairly close *per* clock despite not having specifically Opteron-optimised code anywhere. George has indicated that there might be some small gains to make from optimizing the cache usage, but doesn't think it would make a BIG difference. Same answers for the Athlon64, since it's not yet released and AMD doesn't appear to be sending samples (yet) to anyone other than big manufacturers (esp. motherboard makers).
#5
Sep 2002
2·331 Posts
Running an Opteron vs. an Athlon on a 32-bit OS, there should be a few advantages.
SSE2 code will work. Even though the extra general-purpose registers available to 64-bit code aren't available, internally there should be more resources/temporary registers to work with. Each assembly-level instruction is broken down into many smaller internal instructions which take advantage of all available hardware resources (hidden registers, running multiple internal instructions in parallel, etc.). The extra hardware needed for the 64-bit general-purpose registers goes unused in 32-bit mode, which should allow better register management/register renaming.
The L2 cache is larger: an Athlon has 256K (the exception being some older, slower parts at 900 MHz and below, which had 512K), while the Opteron can have 512K or 1024K, so fewer cache misses. Also, the cache bus width is double.
#6
Aug 2002
North San Diego County
5·137 Posts
There is an Opteron SMP article up at AMDMB.COM that has some Prime95 benchmarks. I don't know what version they tested with, but the benchmark results only go up to 1792K.
Selected benchmark figures (I chose the highest numbers listed for each FFT size):
[code]
256K  - 16.628
512K  - 35.543
1024K - 78.184
1792K - 149.181
[/code]
Which puts the Opteron 244 (1.8GHz) at about equal to a 1.6GHz-ish P4. While they cast Prime95 in an interesting light ("Prime 95 is a benchmark used to find Mersenne Prime numbers"), at least they provided a link to www.mersenne.org! :)
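For anyone who wants to see how those timings scale with FFT size, here is a small Python sketch. The four figures are copied from the post; the units are not stated there, and the usual Prime95 convention of milliseconds per iteration is only an assumption. The helper name `scaling_table` is made up for the example.

```python
# FFT size in K -> benchmark figure from the post (units assumed: ms per iteration).
timings = {256: 16.628, 512: 35.543, 1024: 78.184, 1792: 149.181}

def scaling_table(timings):
    """Return (size, size ratio, time ratio) rows relative to the smallest FFT size."""
    base_size = min(timings)
    base_time = timings[base_size]
    rows = []
    for size in sorted(timings):
        rows.append((size, size / base_size, timings[size] / base_time))
    return rows

for size, size_ratio, time_ratio in scaling_table(timings):
    print(f"{size:>5}K  size x{size_ratio:.2f}  time x{time_ratio:.2f}")
```

The time ratio grows somewhat faster than the size ratio. Some of that is expected, since FFT work grows roughly as N log N rather than N; how much of the rest is down to cache or memory effects cannot be told from four numbers.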
#7
Jan 2003
7×29 Posts
If I'm not mistaken, the FPU of the Opteron has 3 modes. 1 is the normal mode, another is a hybrid mode and the last is a completely new way of doing things.
I'd expect that the classic FPU mode will be little better than an Athlon (perhaps extra cache, SSE, more FSB/memory bandwidth, and lower latency will help a little). But what about the other 2 modes? Won't these avoid the antiquated floating point stacks of the x86 architecture? Understandably, this will require a lot of re-coding as a downside.
#8
Sep 2002
2·331 Posts
I believe you are mistaken about the FPU in the Opteron.
It is the same as the Athlon's. It is the CPU's general instruction set that has 3 modes:
32-bit mode uses the same instruction set as an Athlon (plus SSE2). Needs a 32-bit OS and 32-bit programs (use existing programs).
64-bit mode uses the new instruction set, has extra registers, and the registers are 64-bit (has SSE2). Needs a 64-bit OS and 64-bit programs (recompiled or new 64-bit programs).
Mixed 32/64-bit mode can run programs using either instruction set (has SSE2). Needs a 64-bit OS (not sure) and 32-bit or 64-bit programs (32-bit ones are existing programs).
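The practical upshot of the modes above is that the same machine can run a 32-bit or a 64-bit build of a program, and the program can tell which it is. A minimal Python sketch using only the standard library; the helper names `pointer_bits` and `describe` are made up for the example, and what `platform.machine()` reports varies by OS.

```python
import platform
import struct

def pointer_bits():
    """Width in bits of a pointer in this build of Python (normally 32 or 64)."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8

def describe():
    """One-line summary of the program's bitness and the reported machine type."""
    machine = platform.machine() or "unknown"
    return f"{pointer_bits()}-bit program, machine reported as {machine}"

print(describe())
```

A 32-bit build running on a 64-bit OS will typically report 32-bit pointers alongside a 64-bit machine type, which is the "existing programs" case in the mixed mode described above.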
#9
Apr 2003
Berlin, Germany
551₈ Posts
Quote:
The number of internal registers depends on the max. possible load of operations. The Opteron has two new pipeline stages to arrange incoming code in a better way for decoding and issuing. Currently some code alignment and instruction reordering optimizations are necessary (by hand or by compiler) to create an optimal instruction stream. The bigger L2 cache won't help much, since the mem interface is much faster (max. 6.4GB/s vs. 3.2GB/s with 40-60% higher latency on Athlon) and the cache hit rate of Prime95 is already very good (over 97% on my AXP with 256kB L2, AFAIR). But there are surely possible optimizations from changing code which is currently optimized for the P4. There are some things which the P4 does fast and the Opteron does not, and vice versa.
#10
Aug 2002
2×101 Posts
The P4 was a dog until George managed to get one in order to do development on it. So I think it is too early to count the Opteron/Athlon64 out.
#11
Apr 2003
Berlin, Germany
19² Posts
Quote:
On Opteron, x87 is really fast and SSE2 is not really faster ;) (it can't do more adds and muls per cycle than x87). But SSE2 has other advantages: you can use the registers freely for calculation (no need to use the first as one operand), and in 64-bit mode you have 16 SSE2 regs instead of 8.