mersenneforum.org

2004-03-10, 16:55   #1
pyrodave

Sep 2003

2 Posts
64 bit plans

When should we expect an Athlon 64/Opteron-optimized version to be out?

2004-03-10, 22:11   #2
PrimeCruncher

Sep 2003
Borg HQ, Delta Quadrant

2×3³×13 Posts

Sigh. People keep asking about this and the v5 server. A few official sentences from George would be nice. Is there anything to tell?

2004-03-10, 22:47   #3
dsouza123

Sep 2002

2·331 Posts

Windows or Linux/FreeBSD client?
32-bit or 64-bit version?
To run on a 32-bit or 64-bit OS?

2004-03-11, 00:52   #4
pyrodave

Sep 2003

2 Posts
Re:

I am thinking of getting an Athlon 64 FX and running 64-bit Linux on it, maybe switching to Windows later, maybe not. Mersenne checking would be one of the most frequent things I would do on it.

2004-03-11, 02:12   #5
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

3·7²·47 Posts

Quote:
Originally Posted by PrimeCruncher
People keep asking about this and the v5 server. A few official sentences from George would be nice. Is there anything to tell?
Scott had been working on v5 server and expects to resume development shortly. I've not done any v5 client work, but will do so when Scott resumes work. I will make v5 a higher priority than any other program improvements.

Xyzzy sent the Opteron three weeks ago. After struggling for two weeks to get AMD CodeAnalyst working, I finally hit upon the right combination (with some help from Dresdenboy).

I've been looking at one of prime95's most common macros. This macro takes 102 clocks running on a P4 with data in the L1 cache, 115 clocks with data in L2. On the Opteron it takes 129 clocks and 165 clocks respectively. CodeAnalyst shows some stalls in the L1 case (I haven't figured out a way to trace the L2 case). These stalls seem to be somehow related to the load-store unit but I could be wrong. I've not yet found a way to improve on the current 129 clocks.
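
To put those clock counts side by side, here is a minimal sketch in Python. Only the four clock figures come from the post; the names `clocks`, `slowdown` and `l2_penalty` are made up for illustration. Note that these are clocks per macro, not wall-clock times, so they say nothing about two chips running at different clock speeds.

```python
# Clock counts quoted in the post for one macro, with data in L1 vs. L2.
clocks = {
    "P4": {"L1": 102, "L2": 115},
    "Opteron": {"L1": 129, "L2": 165},
}

def slowdown(level):
    """Ratio of Opteron clocks to P4 clocks for the given cache level."""
    return clocks["Opteron"][level] / clocks["P4"][level]

def l2_penalty(cpu):
    """Extra clocks the macro needs when its data sits in L2 instead of L1."""
    return clocks[cpu]["L2"] - clocks[cpu]["L1"]

for level in ("L1", "L2"):
    print(f"{level}: Opteron/P4 clock ratio = {slowdown(level):.2f}")
for cpu in clocks:
    print(f"{cpu}: L2 penalty = {l2_penalty(cpu)} clocks")
```

The macro needs roughly a quarter more clocks on the Opteron with data in L1, and the gap widens with data in L2, where the Opteron pays 36 extra clocks against 13 on the P4.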

With x86-64 we will have 8 more SSE2 registers. Perhaps the best use of these registers on the Opteron would be to store some commonly used constants and spread out data stores to reduce pressure on the load-store unit. Lots more analysis is required.

Also, 64-bit tools are not yet ready for prime time. I can't figure out if Microsoft's C compiler for x86-64 is available yet. MASM for 64-bit is available, but there still doesn't seem to be a way to port MASM's object files to Linux. These problems will be solved eventually.

Finally, don't expect great things from x86-64. Yes, we can make superb trial factoring code, but that represents only 1% of GIMPS work. I suspect that the extra registers and rework of the assembly code will speed up the LL test by 20% at most.
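
For readers wondering why trial factoring in particular gains from 64-bit integer registers, here is a minimal sketch in Python. It is not prime95's code (which is hand-written assembly and sieves its candidates first), and the function name `trial_factor` and the parameter `max_k` are made up for illustration. It relies on two standard facts: every factor q of 2^p - 1, for an odd prime p, has the form q = 2kp + 1, and q is congruent to 1 or 7 mod 8.

```python
def trial_factor(p, max_k):
    """Return the smallest factor q = 2*k*p + 1 of 2**p - 1 with k <= max_k, or None."""
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        # Factors of a Mersenne number with odd prime exponent are +-1 mod 8.
        if q % 8 not in (1, 7):
            continue
        # q divides 2**p - 1 exactly when 2**p mod q == 1.
        # pow() performs the modular exponentiation by repeated squaring.
        if pow(2, p, q) == 1:
            return q
    return None

print(trial_factor(11, 10))  # 2**11 - 1 = 2047 = 23 * 89
print(trial_factor(13, 10))  # 2**13 - 1 = 8191 is prime
```

The whole job is integer squaring modulo q. Python hides the operand width, but in machine code every squaring of a residue near 2^64 produces a product of up to 128 bits, and a 64-bit multiplier handles that in far fewer instructions than 32-bit code does.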

2004-03-11, 13:07   #6
Dresdenboy

Apr 2003
Berlin, Germany

19² Posts

Quote:
Originally Posted by Prime95
I've been looking at one of prime95's most common macros. This macro takes 102 clocks running on a P4 with data in the L1 cache, 115 clocks with data in L2. On the Opteron it takes 129 clocks and 165 clocks respectively. CodeAnalyst shows some stalls in the L1 case (I haven't figured out a way to trace the L2 case).
One way to solve this could be: access 64 KB of different data before hitting the trace point in the macro. I don't know how this will behave with cache warmup enabled.

2004-03-11, 15:57   #7
lycorn

Sep 2002
Oeiras, Portugal

2²×3×5×23 Posts

Quote:
Originally Posted by Prime95
Finally, don't expect great things from x86-64. Yes, we can make superb trial factoring code, but that represents only 1% of GIMPS work.
Well, but that is already something, isn't it?
And I think that having a "Trial Factoring biased" version of Prime95 (the 64-bit version) would attract many Opteron/Athlon64 owners to contribute to the TF effort, which is also becoming more significant as the TF depth increases with the size of the exponents. The feeling of having machines that smoke P4s in at least one type of work may also be rewarding for many users ...

2004-03-12, 00:15   #8
PrimeCruncher

Sep 2003
Borg HQ, Delta Quadrant

2×3³×13 Posts

Quote:
Originally Posted by lycorn
The feeling of having machines that smoke P4s in at least a type of work may also be rewarding for many users ...
Especially for those here who only do TFs...

2004-03-14, 09:17   #9
Dresdenboy

Apr 2003
Berlin, Germany

101101001₂ Posts

TF is not the only thing that could be faster in 64-bit mode...

2004-03-22, 11:36   #10
nucleon

Mar 2003
Melbourne

5·103 Posts

What about the large memory space? Surely P-1 work would achieve a better factor-finding rate on AMD Opterons with more than 4 GB of memory. Getting one of these boxes onto GIMPS, with that much memory dedicated to prime95, might be more difficult :)

I know the dual P4-Xeon boxes can access a large memory space (>4 GB), but the memory access is more like the good ol' days of 16-bit address limitations.

-- Craig

2004-03-23, 14:24   #11
Dresdenboy

Apr 2003
Berlin, Germany

19² Posts

I don't know the details, but I believe there is a point, well before reaching the standard 2-3 GB application memory space barrier, where LL testing will be more useful than P-1. And LL testing, P-1, etc. can be optimized to run faster (in 32-bit mode), or faster still (in 64-bit mode), without the need to buy more RAM.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.