2017-01-05, 13:57   #4
(loop (#_fork))
fivemack
Feb 2006
Cambridge, England

13×491 Posts

Could you post the hosts files?

My suspicion is that mpirun has decided to bind processes to CPUs, and that you've somehow not told it that some of the hosts have more than one CPU. What does 'taskset -p {process ID}' tell you for a process that is running with less CPU usage than you expect?
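For reading the taskset output: it prints a line ending in a hex affinity mask, and a mask with only one bit set means the process is pinned to a single CPU. Here is a rough sketch of a helper that counts the CPUs a mask allows. The name count_cpus is my own invention, not part of taskset or OpenMPI.

```shell
# Hypothetical helper: count how many CPUs a hex affinity mask allows.
# The mask is the part after the colon in the output of `taskset -p PID`.
count_cpus() {
    mask=$1
    count=0
    # Walk the hex string one digit at a time and add up the set bits.
    while [ -n "$mask" ]; do
        digit=${mask%"${mask#?}"}   # first character of the remaining mask
        mask=${mask#?}              # drop that character
        case $digit in
            0|,) n=0 ;;
            1|2|4|8) n=1 ;;
            3|5|6|9|a|A|c|C) n=2 ;;
            7|b|B|d|D|e|E) n=3 ;;
            f|F) n=4 ;;
            *) echo "not a hex digit: $digit" >&2; return 1 ;;
        esac
        count=$((count + n))
    done
    echo "$count"
}

# Usage sketch, with PID standing in for the real process ID:
#   line=$(taskset -p "$PID")
#   count_cpus "${line##*: }"
```

If that reports 1 for a process that is supposed to be running several threads, binding is the likely culprit.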

Aha, in a document on the Oxford supercomputer centre website, I found:

Finally, versions higher than 1.8.0 in OpenMPI bind automatically processes to threads. Thus,

export OMPI_MCA_hwloc_base_binding_policy=none
so maybe see if doing that changes what you see happening?
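A sketch of how that might be tried, assuming an OpenMPI release from the 1.8-era series the quote is talking about. The program name and hosts file name in the comments are placeholders, not anything from your setup.

```shell
# Turn off OpenMPI's automatic binding for this shell and everything it launches.
export OMPI_MCA_hwloc_base_binding_policy=none

# The same thing can be requested on the command line
# (./my_mpi_program and hosts.txt are placeholder names):
#   mpirun --bind-to none --hostfile hosts.txt ./my_mpi_program
#
# To see where each rank actually ends up bound, add --report-bindings:
#   mpirun --report-bindings --hostfile hosts.txt ./my_mpi_program
```

Comparing the --report-bindings output with and without the export should show whether binding is what is limiting the CPU usage.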

Supercomputer centres almost always use something like Slurm or Torque for job submission, so I'm having a little trouble pinning down how to get one job per machine when there is no such scheduler layer.

Last fiddled with by fivemack on 2017-01-05 at 14:03