
Re: compile MPITB, octave 2.1.69


From: Javier Fernandez Baldomero
Subject: Re: compile MPITB, octave 2.1.69
Date: Fri, 01 Apr 2005 17:43:34 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225

Michael Creel wrote:

> ... to missing symbol errors with 2.1.69. Just in case anyone knows what
> changes might have provoked this and how to fix the problem. The error follows. ...
>
> MPITB extensions found
> octave:1> kernel_example1
> error: /home/mcreel/mpi_work/mpitb/DLD/MPI_Initialized.oct: undefined symbol: _ZN4PMPI4Comm12mpi_comm_mapE
> error: `MPI_Initialized' undefined near line 10 column 17
> error: called from `LAM_Init' in file `/usr/local/share/octave/site-m/mpitb_utils/LAM_Init.m'

Hi,

WRT the error message:
LAM_Init.m is an M-file that contains near the beginning:
> ...
>      [infI flgI]=MPI_Initialized;               % Init?
>      [infF flgF]=MPI_Finalized;                 % Finalize?
>      if infI ||  infF
> ...

that accounts for the last 3 lines of the error.
Then, Octave finds MPI_Initialized.oct in the DLD subdir,
but apparently your recompiled version (oddly) requires some _ZN4... symbol.

I'm not sure what might have caused that problem.
I see you added -lmpich and -lmpi++ to the library list...
With the original library list, the symbol MPI_Initialized comes from libmpi.so
and no other MPI library is required. This is the situation on my system:

____________________________________________
$ cd $MPITB_HOME/DLD
$ ls -la MPI_Initialized.oct
-rwx------  1 javier javier 18383 may  5  2004 MPI_Initialized.oct

$ ldd !$
ldd MPI_Initialized.oct
       linux-gate.so.1 =>  (0xffffe000)
       libmpi.so.0 => /home/javier/lam-7.1.1/lib/libmpi.so.0 (0xb7fa5000)
       liblam.so.0 => /home/javier/lam-7.1.1/lib/liblam.so.0 (0xb7f5f000)
       libutil.so.1 => /lib/libutil.so.1 (0xb7f49000)
       libc.so.6 => /lib/tls/i686/libc.so.6 (0xb7e39000)
       libpthread.so.0 => /lib/tls/i686/libpthread.so.0 (0xb7e28000)
       libdl.so.2 => /lib/libdl.so.2 (0xb7e24000)
       /lib/ld-linux.so.2 (0x80000000)

$ nm !$ | grep MPI
nm MPI_Initialized.oct | grep MPI
000018e4 T FSMPI_Initialized_gnu_v3
00001d40 t _GLOBAL__I_FSMPI_Initialized_gnu_v3
        U MPI_Initialized
00001b2e T _Z16FMPI_InitializedRK17octave_value_listi

$ nm MPI_Initialized.oct | grep U
        U __cxa_atexit@@GLIBC_2.1.3
        U error_state
        U __gxx_personality_v0
        U MPI_Initialized
...
        U _ZNSt8ios_base4InitC1Ev
        U _ZNSt8ios_base4InitD1Ev
        U _Znwj
____________________________________________

So, using nm I can tell which undefined symbols my .oct file relies on,
and using ldd I can tell which libraries it plans to get them from.
If I double-check for MPI_Init in my ldd list:

____________________________________________
$ cd $LAMHOME/lib
$ pwd
/home/javier/lam-7.1.1/lib
$ ls
lam        liblam.so    liblam.so.0.0.0  libmpi.so    libmpi.so.0.0.0
liblam.la  liblam.so.0  libmpi.la        libmpi.so.0

$ nm libmpi.so | grep Init
000173fc T MPI_Init
00017490 T MPI_Initialized
000174b8 T MPI_Init_thread
00045234 T PMPI_Init
000452c8 T PMPI_Initialized
000452f0 T PMPI_Init_thread

$ nm libmpi.so | grep map
00052db8 b cid_map
00052db4 b empty_map
0003b66c T lam_ptmalloc2_munmap
00052bb8 d map_size
        U mmap@@GLIBC_2.0
0000fb58 T MPI_Cart_map
000151b0 T MPI_Graph_map
        U munmap@@GLIBC_2.0
0003d9f0 T PMPI_Cart_map
00042fe8 T PMPI_Graph_map
____________________________________________

So, my libmpi.so does not contain any _comm_map symbol.
Since the offending symbol name was:

octave:1> kernel_example1
error: /home/mcreel/mpi_work/mpitb/DLD/MPI_Initialized.oct: undefined symbol: _ZN4PMPI4Comm12mpi_comm_mapE
error: `MPI_Initialized' undefined near line 10 column 17
error: called from `LAM_Init' in file `/usr/local/share/octave/site-m/mpitb_utils/LAM_Init.m'


I deduce that the _ZN4... symbol dependency comes from the .oct file itself.
I mean, it's not a problem of the LAM library missing any symbol.
In the compile step you have inadvertently added a dependency on
that symbol -- a symbol I don't recognize, and I'm sure it's not part of
LAM/MPI. I knew MPI_Cart_map and MPI_Graph_map, but not
PMPI::Comm::mpi_comm_map.
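
You can decode such mangled C++ names yourself with c++filt (part of binutils):

____________________________________________
$ c++filt _ZN4PMPI4Comm12mpi_comm_mapE
PMPI::Comm::mpi_comm_map
____________________________________________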

Sounds like some kind of C++ binding. Are you sure you linked against
liblam/libmpi?!? Perhaps those are missing on your system, and since you
added -lmpich and -lmpi++, the offending symbol comes from there.

Look for _comm_map in libmpich and libmpi++ using nm or some other similar
tool, just to be certain of the diagnosis; in any case, to get MPITB working
you'll probably need liblam/libmpi (I cannot help with porting MPITB to
mpich/mpi++).
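
Something along these lines, perhaps (the library locations below are guesses
on my part -- point nm at wherever your libmpich/libmpi++ actually live):

____________________________________________
$ # paths are guesses; adjust them to your installation
$ nm -D /usr/lib/libmpi++.so | grep comm_map
$ nm -D /usr/lib/mpich/lib/shared/libmpich.so | grep comm_map
____________________________________________

A hit there would confirm where the dependency was picked up.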

Let me know if the symbol came from libmpi++ and whether you manage to
recompile against the LAM libraries (removing both -lmpich and -lmpi++
from the library list in MPICLIBS).

My original version was:

>
> MPICLIBS    = -L$(LAMHOME)/lib -lmpi -llam -lutil
>

_____________________________________________________

> I'm getting different results when doing what are in principle the same
> calculations using Octave serially and in parallel, using the MPITB toolkit.

I remember the Pi demo in MPITB shows, in principle, a similar behaviour.
It integrates the derivative of the arctangent to compute pi. The integration
is done round-robin, in the sense that the rectangles are indexed and, for N
slaves numbered 0..N-1, slave 0 computes and sums the areas of rectangles
i = 0, N, 2N, ..., slave 1 the areas of those rectangles with (i mod N) == 1,
and so on.

Depending on the number of slaves used, the computation is slightly
different, and all those computations are different from the sequential computation.

Of course, with the sequential computation the whole sum ends up in the same
variable, so the last rectangles are very small compared to the accumulated
value (close to pi). When you use 10 computers, each one accumulates a value
close to pi/10, so the last rectangles are one order of magnitude better
(for rounding-error purposes) than in the sequential version.
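
Just to illustrate the effect, here is a purely serial Octave sketch of the
two summation orders (no MPI involved; the 4/(1+x^2) integrand and the N and
P values are arbitrary picks for the example, not the demo's actual settings):

____________________________________________
N = 1e6;  h = 1/N;               % number of rectangles and their width
x = h*((0:N-1)+0.5);             % midpoints: integral of 4/(1+x^2) on [0,1] is pi

seq = sum(4./(1+x.^2))*h;        % sequential: one accumulator for all rectangles

P = 10;  rr = 0;                 % round-robin over P "slaves"
for p = 0:P-1
  xp = h*((p:P:N-1)+0.5);        % slave p takes rectangles i with mod(i,P)==p
  rr = rr + sum(4./(1+xp.^2))*h; % each slave's contribution is close to pi/P
end

disp([seq-pi, rr-pi, seq-rr])    % seq-rr isolates the pure summation-order effect
____________________________________________

The last column is the kind of tiny difference the distribution alone introduces.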

Perhaps your problem is related to this one (rounding errors), or perhaps not.
For the Pi example, the differences were unavoidable (I must sum the areas
that way if I'm expected to distribute the computations) and acceptable
(errors only one or two orders of magnitude above double-precision resolution,
i.e., in the 14th-15th significant digit). Or was that single precision?
Oh, my, can't remember ;-)

-javier

------------------------------------------------------------------------

# Makefile for MPITB on Debian unstable
...
WHEREARELIBS     := $(shell octave-config -p OCTLIBDIR)
...

MPICPPFLAGS = -I/usr/include/lam
# MPICLIBS seems to be necessary to avoid missing symbol errors.
# Both of the following work, whether or not mpich is installed, and
# in spite of the fact that they make reference to files and/or
# directories that may not exist.
# MPICLIBS = -L/usr/lib/mpich/lib/shared -llam -lutil -lmpich -lmpi++
MPICLIBS    = -L/usr/include/lam -llam -lmpi++ -lutil
...



