|
|
#1 |
|
Nov 2012
18₁₀ Posts |
Hi guys,
Since we found out that Linux isn't using the AMD processors and the shared memory efficiently on the machines I use to run P-1, I am considering switching to OpenSolaris 11.1. Does anyone know how I could run my P-1 work on OpenSolaris? |
|
|
|
|
|
#2 |
|
Sep 2009
2²·5·7² Posts |
Which models of AMD processor, chipset and memory sticks?
|
|
|
|
|
|
#3 |
|
Nov 2012
10010₂ Posts |
4x AMD Opteron 6176 (4x 12 cores)
AMD SR5690/SR5670/SP5100 chipset
4x 8x4 GB DDR3 Registered ECC (quad channel); in theory this should be either Kingston DDR3-1600 or DDR3-1333 |
|
|
|
|
|
#4 |
|
Sep 2009
2²·5·7² Posts |
That's an interesting system.
Which version of the Linux kernel and which distribution, BTW? Linux powers ~95% of the world's most powerful supercomputers (and people are not necessarily using no-fee Linux distros on them, so the cost advantage does not necessarily hold), so workloads on which Linux does a bad job are supposed to be relatively infrequent. Does GMP-ECM build on OpenSolaris? |
|
|
|
|
|
#5 |
|
Nov 2012
2·3² Posts |
Hey, we are also very surprised that Linux is doing a bad job here (we have a lot of Intel systems working with no problem), but believe me, it does.
Kernel version is 2.6.32-131.17.1.el6.x86_64 (SL6.1), but we even tried 3.2 with no change... Actually we have more than one such system: we installed OpenSolaris 11.1 on one of them and got a factor 1.6-1.7 speed improvement on a Pi calculation. GMP can be compiled on it, so I think GMP-ECM should also... I will try to compile mprime on it... I think it should work!? |
|
|
|
|
|
#6 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
2·7·461 Posts |
I would guess that the substantial performance changes are a result of Solaris having better default NUMA handling: have you tried playing around with numactl under Linux?
I had problems with the kernel moving jobs away from the processor that their memory is attached to; starting them with 'taskset -c X numactl -l' ensures that the job stays on processor X and has its memory allocated from processor X's pool, which can help quite a bit. 'numactl -i 0-7' will allocate memory interleaved across all eight memory controllers, which may be better for jobs that want to use lots of memory from a single thread. |
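The two invocations described above can be assembled programmatically when launching many jobs. This is only a minimal sketch, assuming Python 3.8+ and that `taskset` and `numactl` are installed; `./p1_job` is a hypothetical placeholder for the real program, and the script prints the command lines rather than running them.

```python
import shlex


def pinned_command(core, job):
    """Argv that keeps `job` on one core and allocates its memory locally.

    Mirrors 'taskset -c X numactl -l <job>' from the post above.
    """
    return ["taskset", "-c", str(core), "numactl", "-l", *job]


def interleaved_command(nodes, job):
    """Argv that interleaves the job's memory across `nodes`, e.g. "0-7".

    Mirrors 'numactl -i 0-7 <job>' from the post above.
    """
    return ["numactl", "-i", nodes, *job]


# Hypothetical job; replace with the real program and its arguments.
JOB = ["./p1_job"]

if __name__ == "__main__":
    # One pinned job on core 6 (the core number is an arbitrary example).
    print(shlex.join(pinned_command(6, JOB)))
    # One memory-hungry single-threaded job spread over all eight nodes.
    print(shlex.join(interleaved_command("0-7", JOB)))
```

Passing the resulting list to `subprocess.run` would start the job; which core numbers belong to which memory controller is machine-specific and should be read from `numactl --hardware` rather than assumed.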
|
|
|
|
|
#7 |
|
∂²ω=0
Sep 2002
República de California
26754₈ Posts |
Just out of curiosity, what are your compiler options under OpenSolaris - GCC-only or also SunStudio?
|
|
|
|
|
|
#8 | |
|
Nov 2012
2×3² Posts |
Quote:
But... since I have 4 CPUs (each with quad-channel memory), why eight memory controllers?
I use(d) SunStudio, with -fast -library=sunperf -xipo=2 -xtarget=barcelona |
|
|
|
|
|
|
#9 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
2×7×461 Posts |
On a Magny Cours system you don't quite have four CPUs with quad-channel memory controllers; you have eight CPUs with dual-channel memory controllers, packaged two to a socket.
If you do 'numactl --hardware' then you get the node distances table
Code:
    node distances:
    node   0   1   2   3   4   5   6   7
      0:  10  16  16  22  16  22  16  22
      1:  16  10  22  16  22  16  22  16
      2:  16  22  10  16  16  22  16  22
      3:  22  16  16  10  22  16  22  16
      4:  16  22  16  22  10  16  16  22
      5:  22  16  22  16  16  10  22  16
      6:  16  22  16  22  16  22  10  16
      7:  22  16  22  16  22  16  16  10
|
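To read the table above more easily, the nodes can be grouped by their distance from a given node. The following is a rough sketch, assuming the output format shown in this post; the function names are made up for illustration. As far as I know, the values are relative cost figures reported by the firmware, with 10 meaning local, rather than cycle counts.

```python
from collections import defaultdict

# Distance table copied from the 'numactl --hardware' output in the post above.
TABLE = """\
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  16  16  22  16  22  16  22
  1:  16  10  22  16  22  16  22  16
  2:  16  22  10  16  16  22  16  22
  3:  22  16  16  10  22  16  22  16
  4:  16  22  16  22  10  16  16  22
  5:  22  16  22  16  16  10  22  16
  6:  16  22  16  22  16  22  10  16
  7:  22  16  22  16  22  16  16  10
"""


def parse_distances(text):
    """Return {node: [distance to node 0, distance to node 1, ...]}."""
    rows = {}
    for line in text.splitlines():
        head, _, rest = line.partition(":")
        # Data rows look like '  3:  22  16 ...'; header lines are skipped.
        if head.strip().isdigit() and rest.strip():
            rows[int(head)] = [int(value) for value in rest.split()]
    return rows


def group_by_distance(rows, node):
    """Return {distance: [nodes at that distance from `node`]}."""
    groups = defaultdict(list)
    for other, distance in enumerate(rows[node]):
        groups[distance].append(other)
    return dict(groups)


if __name__ == "__main__":
    rows = parse_distances(TABLE)
    for node in sorted(rows):
        print(node, group_by_distance(rows, node))
```

For node 0 this groups nodes 1, 2, 4 and 6 at distance 16 and nodes 3, 5 and 7 at distance 22, so every node has exactly one local memory pool and two tiers of remote ones.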
|
|
|
|
|
#10 | |
|
Nov 2012
2·3² Posts |
Quote:
I thought the "package" was seen as one processor. Hmmm... Thanks a lot! Is node distance measured in CPU cycles or HT-link cycles?
Last fiddled with by Kyle on 2012-11-26 at 13:33 |
|
|
|
|