ulimit -s 40960 vs ulimit -s 10240
I am writing because I have been using Open MPI to run mpirun on my
12-core workstation quite happily since the day I set up the system
a few months ago.
Yesterday, when I tried to run a big job under mpirun, the job crashed
rather quickly. The error message was something like
mpirun process exited blah blah with signal 11 (Segmentation fault).
Interestingly (or annoyingly), a job that required less memory ran okay.
Since I had never had this problem before, I thought it was a hardware
failure. I called my IT guy to explain the problem, and he was kind
enough to suggest putting the line
ulimit -s 40960
in my .bashrc.
And it works!
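For anyone who wants to check their own setup, here is a minimal sketch of
how the stack limit can be inspected and raised for the current shell. It
assumes bash, where ulimit -s reports the limit in units of 1024 bytes (or
prints "unlimited"). The variable names soft, hard and want are only for
illustration, and 40960 is simply the value suggested to me.

```shell
# Read the soft and hard stack limits currently in effect for this shell.
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"

# Value suggested in this post; adjust as needed (illustrative only).
want=40960

# Set the soft limit only if the hard limit permits it.
if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$want" ]; then
    ulimit -S -s "$want"
fi
echo "stack limit now: $(ulimit -S -s)"
```

The change only applies to the shell that runs it and to processes started
from that shell, which is presumably why the line goes into .bashrc.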
But I have no clue why mpirun misbehaved all of a sudden, or why that
ulimit setting solves the problem completely. I would like to learn
from this incident.
Does anyone have any ideas to share? Thanks a lot!