#34 |
|
"Mike"
Aug 2002
20052₈ Posts |
I have a 1100MHz Duron that is idle... I can't justify the increased heat and power consumption from running GIMPS on it, given it is so slow on the project...
That said, I understand why this is, and it isn't because of some conspiracy... The older AMDs have a slow clock speed and no SSE2... If I had an 1100MHz P4 or Celeron with a P4 core, I'd be running GIMPS... You can't make a silk purse from a sow's ear... (That doesn't mean the AMD sucks... It just sucks at this particular application when compared to a P4!) |
|
|
|
|
|
#35 | |
|
Banned
"Luigi"
Aug 2002
Team Italia
11322₈ Posts |
Quote:
Luigi |
|
|
|
|
|
|
#36 | |
|
Sep 2003
Borg HQ, Delta Quadrant
2·3³·13 Posts |
Quote:
Summary.txt isn't available this hour, but based on my recollections from yesterday, Athlons STILL make up about 41% of all computers on GIMPS. I'd say any further optimizations at this point are worth the effort. If the optimizations had been started when the thread began, they could already be finished and we'd be reaping the benefits right now.
Nobody I know has an Opteron or an Athlon 64. By contrast, I have a Duron, and two of my friends have Athlons. Opteron and Athlon 64 are both, IMO, still too expensive for the average consumer to justify buying. 64-bit isn't a necessity yet; in fact, 99% of people don't even need it. I have yet to see a 64-bit application.
Conclusion: Are the Opteron optimizations even done yet? If not, it seems to me we've wasted a whole bunch of time, because they have yet to make a major impact and we've accomplished nothing, while we could've been optimizing for Athlon/Duron and be reaping the benefits already.
EDIT: And a Duron 1100 is plenty fast for GIMPS. I have a Duron 950 overclocked to 1017 that does about one TF every 3 days and some-odd hours.
Last fiddled with by PrimeCruncher on 2004-08-25 at 14:24 |
|
|
|
|
|
|
#37 | |
|
Oct 2002
2·13 Posts |
Quote:
I see lots of frustrated AMD prime95 contributors being told they have obsolete systems and should go buy a P4. Also, can you or anyone please explain the LACK of support from GIMPS for the Opteron/AMD64 platform? AMD64 is not "sow's ear technology". Remember, the original GIMPS code for the P4 made the P4 (sow's ear) look really bad. Then guess what: an optimized version was released and made the P4 (silk purse) look great. I am still waiting to see that an actual effort is being made for the AMD64 platform (and not just lip service). SALEM |
|
|
|
|
|
|
#38 |
|
Feb 2003
1110110₂ Posts |
After finishing the current assignment, my Athlon 2400+ will stop crunching until there is a better version of Prime95 for it. If it's not worth the effort or it can't be done, I'll just look for GF primes, where I get about the same speed/MHz as a P4.
|
|
|
|
|
|
#39 |
|
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
9818₁₀ Posts |
Might I suggest that Hurricane Charley may have slowed the optimization a bit.
I am happy that my Ath 1.1 can contribute to something larger than myself. |
|
|
|
|
|
#40 | |
|
Mar 2003
Braunschweig, Germany
2×113 Posts |
Quote:
Quote (George): "NO. The P4 and Athlon 64 have the same peak theoretical FPU throughput per clock cycle. The faster clocking of the P4 makes it the clear winner for now. This observation applies to prime95 only."
So, please explain what further optimizations you have in mind for the LL phase that have not been considered until today?
Don't get me wrong. The AMD64 systems are superb! I sold FSC SCENIC O systems (those systems always sold out, with no advertisement at all) with an Athlon 64 3200+ to a customer. 42 Watts (effective) measured idle, total(!), with XP Pro and Cool'n'Quiet. Insane performance with office applications and average power consumption below 55 Watts with office applications running 8hr/day. Show me an Intel 5XX or Celery even close to that.
But GIMPS is optimized for SSE2 because that instruction set is best suited for the workload. And in this 'special' case the P4 is faster. It's the wet dream for Intel PR dudes but, nonetheless, it's reality after they wake up in their damp sheets. Tau |
|
|
|
|
|
|
#41 | |
|
Sep 2003
Borg HQ, Delta Quadrant
2BE₁₆ Posts |
Quote:
And have the Opteron optimizations been finished yet? The last I remember, they weren't. GIMPS optimized for SSE2... well, the last time I checked, Opteron has that too. |
|
|
|
|
|
|
#42 |
|
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
2×4,909 Posts |
I was suggesting that if he was near to a release, and if he was more concerned for a while with 'the real world', a delay is understandable.
|
|
|
|
|
|
#43 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts |
An update is in order.
My investigations into the Athlon 64 running 32-bit code show that it isn't as good as a P4 on a per-clock basis even though they have the same peak theoretical throughput. For example, one of the fundamental FFT building blocks is a macro that takes four complex values and does two levels of the FFT. The P4 can do this in (these numbers are from memory) 95 clocks if the data is in the L1 cache and 105 clocks if in the L2 cache. The Opteron takes 125 clocks and 170 clocks. All attempts I've made at reducing this difference significantly have failed. There is some architectural difference between the two CPUs that favors the P4 when running prime95.
Quite frankly, I've run out of ideas. Prime95 *might* be made faster on the Opteron by going to a 4-passes-over-main-memory scheme instead of the current 2 passes, in an attempt to have more operations run out of the L1 cache instead of the L2 cache. This is a major bit of work and could actually run slower.
What development has been happening the last few months? Well, for GIMPSers not much. However, for P4 users doing Proth searches, Seventeen or Bust, Riesel Sieve, LLR, etc., the FFT routines have been generalized for these numbers and are significantly faster.
What will happen the next few months is that these generalizations will also be made to the x87 code. At this time I'm planning on deleting the "plain Pentium" optimized FFTs. The code will be optimized for both the 32-byte P3 cache line and the 64-byte Athlon cache line. Since my P3 machine died, any tuning of this code will be done on an Athlon. Athlons might see a 10% boost on current FFT ranges. |
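For readers unfamiliar with the building block described above, here is a minimal Python sketch of what "takes four complex values and does two levels of the FFT" means arithmetically. It is not Prime95's macro, which is hand-written assembly operating inside a much larger transform with general twiddle factors. The function names `fft4_two_levels` and `naive_dft` are made up for this illustration, and the sketch covers only the standalone 4-point case, where the sole non-trivial twiddle factor is -i.

```python
import cmath


def fft4_two_levels(x0, x1, x2, x3):
    """Two radix-2 levels on four complex values (a standalone 4-point DFT).

    Illustrative only: hypothetical helper, not code from Prime95.
    """
    # Level 1: butterflies on the even-indexed pair and the odd-indexed pair.
    e0, e1 = x0 + x2, x0 - x2
    o0, o1 = x1 + x3, x1 - x3
    # Level 2: combine the halves. For a 4-point transform the only
    # non-trivial twiddle factor is exp(-2*pi*i/4) = -i.
    t = -1j * o1
    return (e0 + o0, e1 + t, e0 - o0, e1 - t)


def naive_dft(xs):
    """Direct O(n^2) DFT, used here only to check the butterfly."""
    n = len(xs)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * j * k / n) for j, x in enumerate(xs))
        for k in range(n)
    ]


if __name__ == "__main__":
    data = [1 + 2j, -3 + 0.5j, 2 - 1j, 0.25 + 4j]
    fast = fft4_two_levels(*data)
    slow = naive_dft(data)
    # The two results should agree up to floating-point rounding.
    print(all(abs(a - b) < 1e-9 for a, b in zip(fast, slow)))
```

Taking the from-memory clock counts in the post at face value, the Opteron needs roughly 1.3 times as many clocks as the P4 for this block when the data is in L1 (125 vs 95) and roughly 1.6 times as many when it is in L2 (170 vs 105).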
|
|
|
|
|
#44 | |
|
Banned
"Luigi"
Aug 2002
Team Italia
2×3×11×73 Posts |
Quote:
|
|
|
|
|