|
|
#1 |
|
"Ed Hall"
Dec 2009
Adirondack Mtns
15BC₁₆ Posts |
I have acquired two cards and a cable and am hoping to connect two Z620s for LA work. I have installed one card in one machine, but nothing in the other yet. So far, I've tried several web pages that are supposed to walk me through the process, but the steps don't succeed for me the way they do on the pages.
One example: I installed rdma-core and was supposed to run systemctl start rdma.service, but the OS says there is no rdma.service. opensm installed, and starting its service returned as expected. The cards don't appear to have a name, but their model number is HSTNS-BN80. The machines are Z620 dual Xeons running Ubuntu 20.04. A forum search for "infiniband" turned up three pages of threads, but I couldn't spot anything helpful from the titles. Any assistance would be appreciated. |
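Whatever the service ends up being called, a first check is whether the kernel sees the card at all. The sketch below assumes the standard Linux sysfs layout under /sys/class/infiniband, where each bound device has a ports/N/state file. The function name list_ib_ports is made up for illustration, and an empty result only means no driver has registered a device, not why.

```python
from pathlib import Path


def list_ib_ports(sysfs_root="/sys/class/infiniband"):
    """Return {device: {port: state}} as read from sysfs.

    An empty dict means the directory is missing or holds no devices,
    i.e. no InfiniBand driver has registered a card with the kernel.
    """
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        ports = {}
        ports_dir = dev / "ports"
        if ports_dir.is_dir():
            for port in sorted(ports_dir.iterdir()):
                state_file = port / "state"
                if state_file.is_file():
                    # The file holds a short string such as "4: ACTIVE"
                    ports[port.name] = state_file.read_text().strip()
        result[dev.name] = ports
    return result


if __name__ == "__main__":
    devices = list_ib_ports()
    if not devices:
        print("no InfiniBand devices visible to the kernel")
    for dev, ports in devices.items():
        for port, state in ports.items():
            print(f"{dev} port {port}: {state}")
```

If this prints nothing for a machine with a card installed, the problem is likely at the driver level rather than with any service or subnet manager.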
|
|
|
|
|
#2 |
|
"Curtis"
Feb 2005
Riverside, CA
1011011011110₂ Posts |
I am very interested in doing this exact same thing, if Ed gets his setup working. Two cards and a cable are fairly cheap used, and connecting just two machines means no need for a switch.
I'm told (well, I read on a website like the ones Ed found) that even with 2-port cards, one cannot connect three machines without a switch. I'd like to know if that is true! |
|
|
|
|
|
#4 | |
|
"Ed Hall"
Dec 2009
Adirondack Mtns
2²×13×107 Posts |
Quote:
|
|
|
|
|
|
|
#5 |
|
Jul 2003
So Cal
2663₁₀ Posts |
Yes, in my experience most cluster systems are built on a Red Hat-based distro, although it looks like Rocks hasn't been updated in a while and the community appears to be moving to OpenHPC.
https://openhpc.community/ https://github.com/XSEDE/CRI_XCBC/tree/master/doc |
|
|
|
|
|
#6 | |
|
"Ed Hall"
Dec 2009
Adirondack Mtns
2²·13·107 Posts |
Quote:
|
|
|
|
|
|
|
#7 |
|
"Ed Hall"
Dec 2009
Adirondack Mtns
2²·13·107 Posts |
After quite a while of just letting everything sit, I thought about this again. Everything I was reading pointed to the Mellanox brand. Of course, the cards I had acquired were HP. Well, I bought two Mellanox cards and three cables (because they came that way). These actually gave me connectivity between the two machines without much trouble, kind of. I have InfiniBand connected between the two machines, but now my Ethernet cluster for ecmpi doesn't work when I have the InfiniBand enabled, even though the hostfile still uses the Ethernet addresses. It actually fails trying to use the InfiniBand node for the localhost machine.
So, I have made some progress and am looking forward to making more as I spend more time "playing."
Last fiddled with by EdH on 2022-01-29 at 15:55 |
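One thing that might be worth trying for the Ethernet cluster, assuming the MPI in use is Open MPI (the thread does not say): its MCA parameters can pin a run to TCP on a named interface, so the InfiniBand ports are not selected automatically. The sketch below only builds the command line. The interface name eno1, the hostfile name, and the program ./my_mpi_job are placeholders, and mpirun_tcp_only is a made-up helper name.

```python
import shlex


def mpirun_tcp_only(hostfile, nprocs, iface, program_args):
    """Build an Open MPI mpirun command restricted to TCP on one interface.

    Assumes Open MPI; other MPI implementations use different options.
    """
    return [
        "mpirun",
        "--hostfile", hostfile,
        "-np", str(nprocs),
        # Use the ob1 point-to-point layer so the BTL selection below applies
        "--mca", "pml", "ob1",
        # Allow only the TCP and self (loopback-to-same-process) transports
        "--mca", "btl", "tcp,self",
        # Restrict TCP traffic to the named Ethernet interface
        "--mca", "btl_tcp_if_include", iface,
        *program_args,
    ]


if __name__ == "__main__":
    # All names here are placeholders, not values from the thread
    cmd = mpirun_tcp_only("hostfile", 8, "eno1", ["./my_mpi_job"])
    print(shlex.join(cmd))
```

Whether this cures the localhost failure described above is untested; it only rules out the InfiniBand interface being picked by default.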
|
|
|