#1 |
33×17 Posts |
Has anyone noticed how abysmal the performance of the GIMPS client is on the Opteron or Athlon64? It is about 2-2.5 times slower than on an equivalent P4! So much for a broadly optimized x86 client. I suspect it isn't detecting/using the SSE2 extensions in these processors. Does anyone know if this is the case, or if there is a fix in the works?
Thanks, Sean |
#2 | |
"Gang aft agley"
Sep 2002
2·1,877 Posts |
What I know about the state of development of the GIMPS client code for the Opteron, I learned from reading this thread through to the end: Let's buy GIMPS an Opteron! (page 8)
As far as I can tell, there are problems at the moment in keeping the FPU fully utilized. There have also been some problems with certain tools used for timing and profiling. I am certain there is no misunderstanding about the availability of SSE2 extensions in AMD's 64-bit processors. This thread has some timing figures for an Athlon 64: http://www.mersenneforum.org/showthr...highlight=SSE2 You will see that the client detected the SSE2 extensions for the CPU. Quote:
Tuning the client for maximum performance on these chips may take some time. The GIMPS clients are notoriously well written to utilize CPU resources. It might be the superb coding of the P4 code that makes current Opteron timings look poor by comparison -- IMO --Ross Last fiddled with by only_human on 2003-12-07 at 22:32 |
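For anyone who wants to check SSE2 support independently of the client, here is a minimal sketch in C. It is not how Prime95 does its detection (that code is not shown in this thread); it simply reads the standard CPUID feature flags. The function name `cpu_has_sse2` is made up for illustration, and the `<cpuid.h>` helper assumes a GCC or Clang toolchain.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header for the CPUID instruction */
#endif

/* Hypothetical helper: returns 1 if CPUID reports SSE2,
 * 0 if it does not or if this is not an x86 build. */
int cpu_has_sse2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1 holds the standard feature flags.
     * __get_cpuid returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    return (int)((edx >> 26) & 1u);   /* EDX bit 26 = SSE2 */
#else
    return 0;
#endif
}
```

On an Opteron or Athlon64 this should return 1, which matches the timing thread linked above where the client reports SSE2 as detected.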
#3 |
Aug 2002
Texas
5·31 Posts |
MadMac
The performance of Opteron and Athlon64 chips has been widely discussed on this forum. As for the optimization, there are few programs out there that have the level of optimization that Prime95 has, due to the tireless hand tuning by George Woltman. With respect to the Opteron, the GIMPS community has been gracious enough to purchase George such a system to work on optimizations, as he previously did not have access to such a machine. I would suggest searching the forum for threads related to Opterons and Athlons. Last fiddled with by Complex33 on 2003-12-07 at 22:36 |
#4 |
Oct 2002
Lost in the hills of Iowa
111000000₂ Posts |
The reason the Athlon performs relatively poorly is that the SSE2 enhancements to the P4 are almost tailor-made for GIMPS usage. It's one of the VERY FEW places where the P4 does better per MHz than the Athlon. Note that the Athlon does not, never has, and never will support SSE2.
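To make the SSE2 point concrete: each 128-bit XMM register holds two doubles, so one instruction can operate on a pair of values. The sketch below is only an illustration of that idea, not anything from the Prime95 source (which is hand-tuned assembly); the function name `mul_arrays` is invented here, and the code falls back to plain scalar C when the compiler is not targeting SSE2.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Illustrative only: out[i] = a[i] * b[i] for i in [0, n). */
void mul_arrays(const double *a, const double *b, double *out, size_t n)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Process two doubles per iteration using one packed multiply. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_mul_pd(va, vb));
    }
#endif

    /* Scalar loop: handles the odd tail element, or everything on non-SSE2 builds. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

Both paths give the same results; the difference is that the packed path does the work in roughly half as many multiply instructions.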
The Opteron does support SSE2, but it falls behind the P4 due to clock speed - the fastest Opteron is clocked about 40% slower than the fastest P4, and is a little less slow than that on Prime. This gap might narrow once George gets the time to add optimizations for the Opteron, but the clock speed will still be a handicap unless AMD manages to ramp it up faster than Intel has managed to ramp up P4 clock speeds.

Narrow-focus distributed projects tend to have wide performance variations, *especially* those that are well-optimised. They tend to rely on a VERY NARROW subset of the instruction set of a modern processor, which different processors implement very differently.

For example, the Distributed.Net RC5 client relies VERY heavily on a specific rotate instruction - the Athlon happens to have that rotate set up in hardware, while the P4 uses microcode for the same instruction, which makes the Athlon a lot faster on RC5. In general usage, though, that particular instruction is very little used.

For Prime usage, the advantage of SSE2 is the more numerous floating point registers coupled with the ability to perform actions on more registers at one time, which makes the FFT work Prime does a lot faster. The Athlon is faster at a lot of FP work, but if the limits of SSE2 can be worked within, SSE2 blows away Athlon FP performance.

If the AltiVec unit in the G4/G5/recent PowerPC CPUs handled double-precision FP work, *that* CPU would be massively faster on Prime work (presuming an AltiVec-enabled client existed) than even the P4. |
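On the RC5 point: the cipher's round function rotates a word by an amount that comes from the data itself, so the cost of a variable rotate dominates. Here is a rough C sketch of that one operation. It is not taken from the Distributed.Net client, and the name `rotl32` is just for illustration. Compilers typically recognize this idiom and emit a single rotate instruction, but that depends on the compiler.

```c
#include <stdint.h>

/* Illustrative helper: rotate a 32-bit word left by a data-dependent amount.
 * Only the low 5 bits of n are used, so n = 32 behaves like n = 0. */
uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31u;
    /* The second mask keeps the right shift in range when n is 0. */
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

An RC5 half-round looks roughly like `A = rotl32(A ^ B, B) + S[i]`, so a chip that handles variable rotates in fast hardware gets a direct advantage on every round.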