help-octave
Re: Build 2.1.64 on OS 10.3. error


From: Per Persson
Subject: Re: Build 2.1.64 on OS 10.3. error
Date: Tue, 21 Dec 2004 19:22:38 +0100


On Dec 21, 2004, at 18:55, Samir Sharshar wrote:

> Hello,
>
> It's me ....
>
> With ./configure --enable-dl --enable-shared --disabled-static
>
> I've got
>
> ld: misc/machar.o has local relocation entries in non-writable section (__TEXT,__text)
> /usr/bin/libtool: internal link edit command failed
> make[3]: *** [libcruft.dylib] Error 1
> make[2]: *** [libraries] Error 2
> make[1]: *** [libcruft] Error 2
> make: *** [all] Error 2
>
> Fortran compiler g77
> FLIBS='-lg2c'
> FFLAGS='-O5 -funroll-loops'
> CFLAGS='-fast -mdynamic-no-pic'
> CXXFLAGS='-fast -mdynamic-no-pic'

First of all, let me quote the docs for -fast (final paragraph of the -fast section in <http://www.opensource.apple.com/darwinsource/10.3.6/gcc-1495/AppleReleaseNotes.html>):
-----
Users of -fast should be aware of the following caveats:
• Because -fast enables highly aggressive optimizations, some of which may have an effect on code size or on program behavior, thorough testing is especially important before deploying applications compiled with -fast.
• For maximum run-time performance you should experiment with a variety of optimization options; no one set of flags is best for all applications.
-----

This, unfortunately, translates to "using -fast may wreak havoc; analyze the code and apply the appropriate flags one by one".

Secondly, -mdynamic-no-pic is meant for executables, not for libraries, which need relocatable code. Check "man gcc". If you want to keep -mdynamic-no-pic applied globally, you need to make sure that -fPIC is passed alongside it wherever relocatable code has to be built.
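A minimal shell sketch of that rule, for illustration only. flags_ok_for_shared is a hypothetical helper, not part of Octave's build system, and it merely inspects a flag string. It assumes, as the -fast notes pasted in the PS state, that -fast implies -mdynamic-no-pic and that -fPIC overrides it.

```shell
# Hypothetical helper (not part of Octave's build system): returns 1 when a
# flag string would request non-relocatable code for a shared library.
flags_ok_for_shared() {
    flags=" $1 "
    case "$flags" in
        *" -fPIC "*) return 0 ;;   # assumed: -fPIC overrides -mdynamic-no-pic
    esac
    case "$flags" in
        # assumed: -fast implies -mdynamic-no-pic
        *" -mdynamic-no-pic "*|*" -fast "*) return 1 ;;
    esac
    return 0
}

# The CFLAGS from the quoted message:
if flags_ok_for_shared '-fast -mdynamic-no-pic'; then
    echo "ok for shared libraries"
else
    echo "not relocatable: add -fPIC or drop -mdynamic-no-pic"
fi
```

The check is purely textual; whether the compiler honours -fPIC regardless of its position on the command line is something to confirm in "man gcc".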

Finally, my advice would be to start by dropping -fast -mdynamic-no-pic and just adding the -mcpu option specifying _your particular_ cpu. If that turns out well, analyze the code and incrementally add flags that you have reason to believe will improve performance, rebuilding after each increment until you have obtained a satisfactory speedup.
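The incremental approach could be sketched roughly as follows. This is a dry run that only prints the configure invocations. BASE and CANDIDATES are illustrative assumptions: -O2 as a starting level and -mcpu=G4 as a placeholder for your own cpu are not recommendations from this thread, and the candidate flags are simply picked from the -fast summary in the PS. The option is spelled --disable-static here; the quoted command has --disabled-static, presumably a typo.

```shell
# Dry run: prints one configure invocation per trial instead of running it.
# BASE and CANDIDATES are illustrative assumptions; substitute your own cpu
# and the flags you have reason to believe will help.
BASE='-O2 -mcpu=G4'
CANDIDATES='-funroll-loops -falign-loops=16 -falign-functions=16'
CONFIGURE_ARGS='--enable-dl --enable-shared --disable-static'

print_trials() {
    flags=$BASE
    echo "CFLAGS='$flags' CXXFLAGS='$flags' ./configure $CONFIGURE_ARGS"
    for extra in $CANDIDATES; do
        # Each trial keeps the previous flags and adds exactly one more.
        flags="$flags $extra"
        echo "CFLAGS='$flags' CXXFLAGS='$flags' ./configure $CONFIGURE_ARGS"
    done
}

print_trials
```

Each printed line would be followed by a full rebuild, a test run and a timing before moving on to the next one, so that a regression can be attributed to a single flag.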

If you are really serious, use Shark (from Apple) to analyze the code for things like pipeline stalls etc.

Sorry if I'm sounding negative, but it is my experience that applying something as aggressive[1] as the -fast option will not work well on something as complex as octave. As I understand it, -fast was added to give good SPEC marks, and has surely seen little testing with code other than the SPEC code.

HTH,
Per

PS. For interested parties I'm pasting a summary of what -fast implies below:
================================
-fast changes the overall optimization strategy of GCC 3.3 in order to produce the fastest possible running code for G4 and G5 architectures. Optimizations under -fast are roughly grouped under the following categories.
1. -fast sets the optimization level to -O3, the highest level of optimization supported by GCC 3.3. If any other optimization level (-O0, -O1, -O2 or -Os) is specified, it is ignored by the compiler.

2. Alignment. Assume alignments for loops, functions, branches and structure data fields that provide fastest performance on the PowerPC. -fast sets the following alignment-specific options:
   -falign-loops-max-skip=15
   -falign-jumps-max-skip=15
   -falign-loops=16
   -falign-jumps=16
   -falign-functions=16
   -malign-natural

3. -fast enables the -ffast-math option, which allows certain unsafe math operations for performance gains.

4. Strict aliasing rules. -fast allows the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C and C++, this activates optimizations based on the type of expressions: an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. Furthermore, struct field references are assumed not to alias each other as long as their direct and indirect enclosing structure types are distinct. -fast enables the following aliasing options:
   -fstrict-aliasing
   -frelax-aliasing
   -fgcse-mem-alias

   Warning: the behavior of correct programs will not be affected by strict aliasing, but programs that make use of nonportable type conversions may behave in unexpected ways.

5. -fast enables various performance-related code transformations. These include loop unrolling, transposing nested loops to improve locality of array element access, conversion of certain initialization loops to memset calls, and inline expansion of calls to library functions such as floor. -fast enables the following code transformation options:
   -funroll-loops
   -floop-transpose
   -floop-to-memset
   -finline-floor  (G5 only)

   Some of these transformations increase code size.

6. G5 specific instruction generation. With -fast (unless -mcpu=G4 is specified), GCC 3.3 generates instructions which are specific to G5 and result in performance gain for G5. The following options are assumed for G5 under -fast:
   -mcpu=G5
   -mpowerpc64
   -mpowerpc-gpopt

7. Scheduling changes. -fast option allows inter-block scheduling, and scheduling specific to the G5 architecture. One such scheduling change is load after a store that partially loads what was stored. The following scheduling-related options are enabled by -fast:
   -mtune=G5  (unless -mtune=G4 is specified)
   -fsched-interblock
   -fload-after-store
   --param max-gcse-passes=3
   -fno-gcse-sm
   -fgcse-loop-depth

8. -fast enables intermodule inlining when all source files are placed on the same command line. The following options are set by -fast and affect such inlining:
   -funit-at-a-time
   -fcallgraph-inlining
   -fdisable-typechecking-for-spec

9. -fast sets -mdynamic-no-pic by default. This allows for generation of non-relocatable code and is not suitable for shared libraries. This option may be overridden by -fPIC.

Users of -fast should be aware of the following caveats:
• Because -fast enables highly aggressive optimizations, some of which may have an effect on code size or on program behavior, thorough testing is especially important before deploying applications compiled with -fast.
• For maximum run-time performance you should experiment with a variety of optimization options; no one set of flags is best for all applications.
• In future releases of GCC, -fast may enable a different set of optimization options. The intention behind this option is that -fast will enable optimizations that result in the fastest code for most applications.
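Since the set of options may differ between compiler releases, it can be worth checking which individual flags your own compiler accepts before adding them one by one. A rough sketch, assuming a GCC-compatible driver that understands -fsyntax-only and -x c; flag_accepted is a hypothetical helper, and the flags in the loop are just a sample taken from the summary above.

```shell
# Hypothetical helper: succeeds when the compiler named by $CC (default: cc)
# exits successfully for an empty C translation unit with the given flag(s).
flag_accepted() {
    ${CC:-cc} "$@" -fsyntax-only -x c /dev/null >/dev/null 2>&1
}

# A sample of flags from the -fast summary; extend the list as needed.
for flag in -funroll-loops -fstrict-aliasing -floop-transpose -fload-after-store; do
    if flag_accepted "$flag"; then
        echo "accepted: $flag"
    else
        echo "rejected: $flag"
    fi
done
```

This looks at the exit status only. A compiler may accept a flag with a warning and then ignore it, so "accepted" does not prove that the flag has any effect.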



-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------


