
Re: Very slow filter.cc


From: David Bateman
Subject: Re: Very slow filter.cc
Date: Fri, 21 Jan 2005 18:31:58 +0100
User-agent: Mozilla Thunderbird 0.8 (X11/20040923)

John W. Eaton wrote:

On 21-Jan-2005, Miroslaw Kwasniak <address@hidden> wrote:

| On Wed, Jan 19, 2005 at 09:45:37AM +0100, David Bateman wrote:
| >
| > t = cputime; y = filter(b,1,x); cputime - t
| >
| > rather than using tic/toc and see if there is better consistency between
| > 2.0.17 and the 2.1.x versions...
|
| $ sh run.sh
| OCTAVE_VERSION = 2.0.17
| ans = 2.0600
| OCTAVE_VERSION = 2.1.50
| ans = 15.380
| OCTAVE_VERSION = 2.1.57
| ans = 15.690
| OCTAVE_VERSION = 2.1.63
| ans = 14.060
|
| 2.1.63 is a little better because it's compiled for 686 - rest are debian
| 386 packages.

Please try the following patch.

| I tried with LD_PROFILE but octave has PROF signal disabled :(

What is LD_PROFILE?

| I would be surprised if 2.0 is compilable with current tools.

Only a few changes need to be made to filter.cc from 2.0.x so that it
will compile with current versions of GCC.

jwe


src/ChangeLog:

2005-01-21  John W. Eaton  <address@hidden>

        * DLD-FUNCTIONS/filter.cc (filter): Avoid slow Marray indexing ops.


Index: src/DLD-FUNCTIONS/filter.cc
===================================================================
RCS file: /usr/local/cvsroot/octave/src/DLD-FUNCTIONS/filter.cc,v
retrieving revision 1.16
diff -u -r1.16 filter.cc
--- src/DLD-FUNCTIONS/filter.cc 2 Nov 2004 02:42:25 -0000       1.16
+++ src/DLD-FUNCTIONS/filter.cc 21 Jan 2005 16:59:05 -0000
@@ -146,35 +146,51 @@

      if (a_len > 1)
        {
-         for (int i = 0; i < x_len; i++)
+         T *py = y.fortran_vec ();
+         T *psi = si.fortran_vec ();
+
+         const T *pa = a.data ();
+         const T *pb = b.data ();
+         const T *px = x.data ();
+
+         psi += si_offset;
+
+         for (int i = 0, idx = x_offset; i < x_len; i++, idx += x_stride)
            {
-             int idx = i * x_stride + x_offset;
-             y (idx) = si (si_offset) + b (0) * x (idx);
+             py[idx] = psi[0] + pb[0] * px[idx];

-             if (si_len > 1)
+             if (si_len > 0)
                {
                  for (int j = 0; j < si_len - 1; j++)
                    {
                      OCTAVE_QUIT;

-                     si (j + si_offset) = si (j + 1 + si_offset) -
-                       a (j+1) * y (idx) + b (j+1) * x (idx);
+                     psi[j] = psi[j+1] - pa[j+1] * py[idx] + pb[j+1] * px[idx];
                    }

-                 si (si_len - 1 + si_offset) = b (si_len) * x (idx)
-                   - a (si_len) * y (idx);
+                 psi[si_len-1] = pb[si_len] * px[idx] - pa[si_len] * py[idx];
                }
              else
-               si (si_offset) = b (si_len) * x (idx)
-                 - a (si_len) * y (idx);
+               {
+                 OCTAVE_QUIT;
+
+                 psi[0] = pb[si_len] * px[idx] - pa[si_len] * py[idx];
+               }
            }
        }
      else if (si_len > 0)
        {
-         for (int i = 0; i < x_len; i++)
+         T *py = y.fortran_vec ();
+         T *psi = si.fortran_vec ();
+
+         const T *pb = b.data ();
+         const T *px = x.data ();
+
+         psi += si_offset;
+
+         for (int i = 0, idx = x_offset; i < x_len; i++, idx += x_stride)
            {
-             int idx = i * x_stride + x_offset;
-             y (idx) = si (si_offset) + b (0) * x (idx);
+             py[idx] = psi[0] + pb[0] * px[idx];

              if (si_len > 1)
                {
@@ -182,14 +198,17 @@
                    {
                      OCTAVE_QUIT;

-                     si (j + si_offset) = si (j + 1 + si_offset) +
-                       b (j+1) * x (idx);
+                     psi[j] = psi[j+1] + pb[j+1] * px[idx];
                    }

-                 si (si_len - 1 + si_offset) = b (si_len) * x (idx);
+                 psi[si_len-1] = pb[si_len] * px[idx];
                }
              else
-               si (si_offset) = b (1) * x (idx);
+               {
+                 OCTAVE_QUIT;
+
+                 psi[0] = pb[1] * px[idx];
+               }
            }
        }
    }

Grrr, I was writing my e-mail with basically the same patch. What I wrote was:

<quote>
OK, I figured this one out. The problem is that the template filter functions don't declare their inputs as const, so operator () calls the non-const versions of Array<T>::elem. This forces make_unique to be called on every reference to every element of every input to the filter function.

Declaring a, b, and x as const fixes most of the problem, but then requires that the normalization and resizing be done in the main function. An additional problem is that the filter state in si also calls make_unique on every access. That call is needed once at the start of the filtering, but not after that. At the moment I have implemented a hack to call make_unique.
</quote>

Your solution is cleaner.
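
To illustrate the effect described in the quoted text, here is a minimal sketch of a copy-on-write array. It is not Octave's Array<T>: the class CowArray, its unique_checks counter, and the helpers scale_indexed and scale_pointers are all hypothetical names made up for this example. It only assumes that non-const element access has to check for sharing on every call, while const access and raw pointers do not.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a reference-counted, copy-on-write array.
// This is NOT Octave's Array<T>; it only mimics the behaviour discussed above.
template <typename T>
class CowArray
{
public:
  explicit CowArray (std::size_t n, T init = T ())
    : rep (std::make_shared<std::vector<T>> (n, init)) { }

  // Non-const access: has to check for sharing on every single call.
  T& operator () (std::size_t i)
  {
    make_unique ();
    return (*rep)[i];
  }

  // Const access: read-only, so no uniqueness check is needed.
  const T& operator () (std::size_t i) const { return (*rep)[i]; }

  // Raw pointers, similar in spirit to data () and fortran_vec ().
  const T *data () const { return rep->data (); }
  T *writable_data () { make_unique (); return rep->data (); }

  std::size_t size () const { return rep->size (); }
  long unique_checks () const { return checks; }

private:
  // Detach from any other owner before handing out writable storage.
  void make_unique ()
  {
    ++checks;
    if (rep.use_count () > 1)
      rep = std::make_shared<std::vector<T>> (*rep);
  }

  std::shared_ptr<std::vector<T>> rep;
  long checks = 0;   // counts make_unique calls, for illustration only
};

// Element access on non-const arrays: one check per element per array.
template <typename T>
void scale_indexed (CowArray<T>& x, CowArray<T>& y, T gain)
{
  assert (x.size () == y.size ());
  for (std::size_t i = 0; i < x.size (); i++)
    y (i) = gain * x (i);
}

// Pointers hoisted out of the loop: a single check, for the output only.
template <typename T>
void scale_pointers (const CowArray<T>& x, CowArray<T>& y, T gain)
{
  assert (x.size () == y.size ());
  const T *px = x.data ();
  T *py = y.writable_data ();
  for (std::size_t i = 0; i < x.size (); i++)
    py[i] = gain * px[i];
}
```

With 1000-element arrays, scale_indexed triggers 1000 uniqueness checks on each of its two arguments, whereas scale_pointers triggers exactly one check on the output and none on the const input. The patch above takes the second route by fetching the pointers once before the loop.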

Cheers
David

--
David Bateman                                address@hidden
Motorola CRM                                 +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax)
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as:

[x] General Business Information
[ ] Motorola Internal Use Only
[ ] Motorola Confidential Proprietary


