help-octave

Re: execution speed in *oct files


From: John W. Eaton
Subject: Re: execution speed in *oct files
Date: Mon, 7 Jun 1999 23:50:58 -0500 (CDT)

On  8-Jun-1999, Eduardo Gallestey <address@hidden> wrote:

| Sorry for the neophyte question but I couldn't quite understand. 
| 
| Could someone say some words about that option

You only sent this message to me, but I'm also replying to the list
since others may be wondering the same thing.  If anyone else has
details to add, please feel free to post them to the list.

| -fno-strength-reduce 
| 
| For example, which is its role, does it belong to the c++ libraries or
| just to the "mkoctfile", which is its default value, ...?

This gcc option inhibits the strength reduction and elimination of
loop iteration variables that gcc normally performs with -O.  Strength
reduction replaces expensive operations with equivalent but cheaper
ones.

In the example I posted earlier, one could convert what was written as

  for (int j = 0; j < n; j++)
    for (int i = 0; i < m; i++)
      r(i,j) = a(i,j) + b(i,j);

but which, after inlining some function calls for doing the element
indexing, was essentially

  for (int j = 0; j < n; j++)
    for (int i = 0; i < m; i++)
      {
        if (r.reference_count > 1)
          {
            // grab a copy of r.data and set r.reference_count to 1.
          }

        r.data[ldr*j+i] = a.data[lda*j+i] + b.data[ldb*j+i];
      }

to

  if (r.reference_count > 1)
    {
      // grab a copy of r.data and set r.reference_count to 1.
    }

  double *pr = r.data;
  double *pa = a.data;
  double *pb = b.data;

  for (int j = 0; j < n; j++)
    {
      for (int i = 0; i < m; i++)
        pr[i] = pa[i] + pb[i];

      pr += ldr;
      pa += lda;
      pb += ldb;
    }

(or better).  Using -fno-strength-reduce inhibits some of these
optimizations.  I'm not sure about moving the reference count check
outside the loop, but the other changes look like strength reduction
and elimination of iteration variables to me.  To know exactly what
this option inhibits on a particular machine, you really need to look
at the assembly code that gcc produces.
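
For anyone who wants to try this, here is a minimal, self-contained
sketch of the two loop forms above.  The Mat struct and the function
names add_indexed and add_reduced are made up for illustration; they
are not Octave's Matrix class, and the reference count check is left
out.  The first function indexes with ld*j+i on every access, and the
second is the hand-reduced form that steps pointers by the leading
dimension once per column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major matrix (not Octave's Matrix).
struct Mat
{
  std::size_t rows, cols, ld;   // ld is the leading dimension (here == rows)
  std::vector<double> data;

  Mat (std::size_t m, std::size_t n)
    : rows (m), cols (n), ld (m), data (m * n, 0.0) { }
};

// Indexed form: each access computes ld*j+i (one multiply per access
// unless the compiler strength-reduces it).
void add_indexed (Mat& r, const Mat& a, const Mat& b)
{
  assert (a.rows == r.rows && b.rows == r.rows);
  assert (a.cols == r.cols && b.cols == r.cols);

  for (std::size_t j = 0; j < r.cols; j++)
    for (std::size_t i = 0; i < r.rows; i++)
      r.data[r.ld*j+i] = a.data[a.ld*j+i] + b.data[b.ld*j+i];
}

// Hand-reduced form: the multiply is replaced by adding the leading
// dimension to each column pointer once per column.
void add_reduced (Mat& r, const Mat& a, const Mat& b)
{
  assert (a.rows == r.rows && b.rows == r.rows);
  assert (a.cols == r.cols && b.cols == r.cols);

  double *pr = r.data.data ();
  const double *pa = a.data.data ();
  const double *pb = b.data.data ();

  for (std::size_t j = 0; j < r.cols; j++)
    {
      for (std::size_t i = 0; i < r.rows; i++)
        pr[i] = pa[i] + pb[i];

      pr += r.ld;
      pa += a.ld;
      pb += b.ld;
    }
}
```

Both functions should produce identical results.  Compiling a file
like this with g++ -O2 -S, with and without the option, and comparing
the inner loops of add_indexed is one way to see what the compiler
does on a given machine.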

Normally, Octave uses -O2 with gcc/g++, so strength reduction is one
of the optimizations that is enabled by default.

jwe





