LinuxQuestions.org

hanamilani 01-15-2009 05:57 AM

problem with openmpi
 
Dear all,

I need to run my code in parallel, so I installed openmpi-1.2.8 on a Core 2 Quad system running SUSE Linux 11.0 with the gfortran compiler. I also downloaded BLACS and ScaLAPACK from: http://www.netlib.org/scalapack/scalapack_installer.tgz.

Everything went smoothly when installing the code and enabling MPI for it, but when I run my test I receive the following error:

mpirun noticed that job rank 0 with PID 2407 on node linux-4pel exited on signal 15 (Terminated).

Please let me know how to solve this problem.

Regards,
Hana

TB0ne 01-15-2009 08:44 AM

How about telling us what your 'test' is? You don't post anything about that, only that the job failed. That doesn't give us a lot to go on.

hanamilani 01-15-2009 11:52 PM

Hello,

The test I have comes from a simulation code called SIESTA. When I run ./configure --enable-mpi for this code, it generates an arch.make file, in which I specify the paths to BLACS and ScaLAPACK, and then I build the code with make.

By "test" I mean the input file I prepare for the simulation, which I run either sequentially or in parallel.

If you like, take a look at the arch.make enclosed below to see whether the problem comes from it.

Regards,

# This file is part of the SIESTA package.
#
# Copyright (c) Fundacion General Universidad Autonoma de Madrid:
# E.Artacho, J.Gale, A.Garcia, J.Junquera, P.Ordejon, D.Sanchez-Portal
# and J.M.Soler, 1996-2006.
#
# Use of this software constitutes agreement with the full conditions
# given in the SIESTA license, as signed by all legitimate users.
#
.SUFFIXES:
.SUFFIXES: .f .F .o .a .f90 .F90

SIESTA_ARCH=i686-pc-linux-gnu--Gfortran

FPP=
FPP_OUTPUT=
FC=/usr/local/bin/mpif90
RANLIB=ranlib

SYS=nag

SP_KIND=4
DP_KIND=8
KINDS=$(SP_KIND) $(DP_KIND)

FFLAGS=-g -O2
FPPFLAGS= -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
LDFLAGS=

ARFLAGS_EXTRA=

FCFLAGS_fixed_f=
FCFLAGS_free_f90=
FPPFLAGS_fixed_F=
FPPFLAGS_free_F90=

BLAS_LIBS=/home/hana/scalapack_installer_0.94/lib/librefblas.a
LAPACK_LIBS=/home/hana/scalapack_installer_0.94/lib/libreflapack.a
BLACS_LIBS=/home/hana/scalapack_installer_0.94/lib/blacs.a /home/hana/scalapack_installer_0.94/lib/blacsC.a
SCALAPACK_LIBS=/home/hana/scalapack_installer_0.94/lib/libscalapack.a

COMP_LIBS=dc_lapack.a liblapack.a libblas.a

NETCDF_LIBS=
NETCDF_INTERFACE=

LIBS=$(SCALAPACK_LIBS) $(BLACS_LIBS) $(LAPACK_LIBS) $(BLAS_LIBS) $(NETCDF_LIBS)

#SIESTA needs an F90 interface to MPI
#This will give you SIESTA's own implementation
#If your compiler vendor offers an alternative, you may change
#to it here.
MPI_INTERFACE=libmpi_f90.a
MPI_INCLUDE=.

#Dependency rules are created by autoconf according to whether
#discrete preprocessing is necessary or not.
.F.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_fixed_F) $<
.F90.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FPPFLAGS) $(FPPFLAGS_free_F90) $<
.f.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FCFLAGS_fixed_f) $<
.f90.o:
	$(FC) -c $(FFLAGS) $(INCFLAGS) $(FCFLAGS_free_f90) $<
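
One quick sanity check on an arch.make like the one above is to confirm that every absolute library path it names actually exists on disk. The following is only a rough sketch in Python; the function name missing_libraries and the sample text are hypothetical and not part of SIESTA or Open MPI. It looks at variables ending in _LIBS and skips relative names, since those cannot be resolved without knowing the build directory.

```python
import os
import re


def missing_libraries(arch_make_text: str) -> list[str]:
    """Return absolute paths from *_LIBS variables that are not existing files."""
    missing = []
    for line in arch_make_text.splitlines():
        # Match assignments such as "BLAS_LIBS=/path/to/librefblas.a".
        match = re.match(r"^(\w+_LIBS)\s*=\s*(.*)$", line)
        if not match:
            continue
        for token in match.group(2).split():
            # Relative names and $(VAR) references are skipped on purpose.
            if token.startswith("/") and not os.path.isfile(token):
                missing.append(token)
    return missing


# Hypothetical sample input; in practice, read the real arch.make instead.
sample = (
    "BLAS_LIBS=/nonexistent/dir/librefblas.a\n"
    "COMP_LIBS=dc_lapack.a liblapack.a\n"
    "FC=/usr/local/bin/mpif90\n"
)
print(missing_libraries(sample))
```

An empty list only means the files exist; it says nothing about whether they were built against the same MPI installation that mpif90 points to.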

TB0ne 01-16-2009 08:13 AM

The job itself is where the problem is. Have you looked up what signal 15 is?

hanamilani 01-16-2009 10:06 AM

Hi,

I am completely at a loss.

I have also posted this question on the SIESTA and Open MPI mailing lists, but they could not tell me anything about it either.

Please let me know what I should check to find out what signal 15 is.
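
For reference, a signal number can be translated into its symbolic name with a few lines of Python. This is a minimal sketch assuming a Linux system and Python 3.8 or newer; the helper name describe_signal is made up for illustration.

```python
import signal


def describe_signal(number: int) -> str:
    """Return the symbolic name and the system description of a signal number."""
    sig = signal.Signals(number)
    return f"{sig.name}: {signal.strsignal(number)}"


# Signal 15 is the one reported by mpirun in the error message.
print(describe_signal(15))
```

On Linux, signal 15 is SIGTERM, the default signal sent by the kill command to ask a process to terminate.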

TB0ne 01-16-2009 10:19 AM

There is a lot of information about signal 15 on the Open MPI support forum:

http://www.open-mpi.org/community/help/

An Open MPI tutorial is here:

http://www.slac.stanford.edu/comp/unix/farm/mpi.html

hanamilani 01-16-2009 11:56 AM

I have already read the tutorial, and I use Open MPI directly for parallel runs of another code.

But this one, which involves BLACS and ScaLAPACK, is giving me a lot of trouble!

On the Open MPI mailing list they told me:

"Have you checked to ensure that the job manager is not killing your
job? As I mentioned yesterday, SIGTERM is usually when some external
agent kills your job."

What is the job manager? Should I look for it in SIESTA or in Open MPI? Please let me know what your idea is.
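
The point in the mailing-list reply, that SIGTERM usually comes from something outside the job, can be illustrated with a small Python sketch. It assumes a POSIX system; a sleeping child process stands in for the MPI job, and the parent plays the role of the external agent. No mpirun is involved, and the function name run_and_terminate is hypothetical.

```python
import signal
import subprocess
import sys


def run_and_terminate() -> int:
    """Start a sleeping child, send it SIGTERM, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # The parent acts as the "external agent" that terminates the job.
    child.send_signal(signal.SIGTERM)
    return child.wait()


status = run_and_terminate()
# On POSIX, a negative return code -N means the child was killed by signal N.
if status < 0:
    print("child was killed by", signal.Signals(-status).name)
```

The child did nothing wrong here; it was terminated from outside, which is the kind of situation the reply suggests checking for.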

TB0ne 01-16-2009 12:22 PM

My idea would be for you to contact the Open MPI people. This issue is with that software, not Linux.


All times are GMT -5.