help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Very slow filter.cc


From: John W. Eaton
Subject: Re: Very slow filter.cc
Date: Fri, 21 Jan 2005 12:20:10 -0500

On 21-Jan-2005, Miroslaw Kwasniak <address@hidden> wrote:

| On Wed, Jan 19, 2005 at 09:45:37AM +0100, David Bateman wrote:
| > 
| > t = cputime; y = filter(b,1,x); cputime - t
| > 
| > rather than using tic/toc and see if there is better consistency between 
| > 2.0.17 and the 2.1.x versions...
| 
| $ sh run.sh
| OCTAVE_VERSION = 2.0.17
| ans = 2.0600
| OCTAVE_VERSION = 2.1.50
| ans = 15.380
| OCTAVE_VERSION = 2.1.57
| ans = 15.690
| OCTAVE_VERSION = 2.1.63
| ans = 14.060
| 
| 2.1.63 is a little better because it's compiled for 686 - rest are debian
| 386 packages.

Please try the following patch.

| I tried with LD_PROFILE but octave has PROF signal disabled :(

What is LD_PROFILE?

| I would supprised if 2.0 is compilable with current tools.

Only a few changes need to be made to filter.cc from 2.0.x so that it
will compile with current versions of GCC.

jwe


src/ChangeLog:

2005-01-21  John W. Eaton  <address@hidden>

        * DLD-FUNCTIONS/filter.cc (filter): Avoid slow Marray indexing ops.


Index: src/DLD-FUNCTIONS/filter.cc
===================================================================
RCS file: /usr/local/cvsroot/octave/src/DLD-FUNCTIONS/filter.cc,v
retrieving revision 1.16
diff -u -r1.16 filter.cc
--- src/DLD-FUNCTIONS/filter.cc 2 Nov 2004 02:42:25 -0000       1.16
+++ src/DLD-FUNCTIONS/filter.cc 21 Jan 2005 16:59:05 -0000
@@ -146,35 +146,51 @@
 
       if (a_len > 1)
        {
-         for (int i = 0; i < x_len; i++)
+         T *py = y.fortran_vec ();
+         T *psi = si.fortran_vec ();
+
+         const T *pa = a.data ();
+         const T *pb = b.data ();
+         const T *px = x.data ();
+
+         psi += si_offset;
+
+         for (int i = 0, idx = x_offset; i < x_len; i++, idx += x_stride)
            {
-             int idx = i * x_stride + x_offset; 
-             y (idx) = si (si_offset) + b (0) * x (idx);
+             py[idx] = psi[0] + pb[0] * px[idx];
 
-             if (si_len > 1)
+             if (si_len > 0)
                {
                  for (int j = 0; j < si_len - 1; j++)
                    {
                      OCTAVE_QUIT;
 
-                     si (j + si_offset) =  si (j + 1 + si_offset) - 
-                       a (j+1) * y (idx) + b (j+1) * x (idx);
+                     psi[j] = psi[j+1] - pa[j+1] * py[idx] + pb[j+1] * px[idx];
                    }
 
-                 si (si_len - 1 + si_offset) = b (si_len) * x (idx)
-                   - a (si_len) * y (idx);
+                 psi[si_len-1] = pb[si_len] * px[idx] - pa[si_len] * py[idx];
                }
              else
-               si (si_offset) = b (si_len) * x (idx)
-                 - a (si_len) * y (idx);
+               {
+                 OCTAVE_QUIT;
+
+                 psi[0] = pb[si_len] * px[idx] - pa[si_len] * py[idx];
+               }
            }
        }
       else if (si_len > 0)
        {
-         for (int i = 0; i < x_len; i++)
+         T *py = y.fortran_vec ();
+         T *psi = si.fortran_vec ();
+
+         const T *pb = b.data ();
+         const T *px = x.data ();
+
+         psi += si_offset;
+
+         for (int i = 0, idx = x_offset; i < x_len; i++, idx += x_stride)
            {
-             int idx = i * x_stride + x_offset; 
-             y (idx) = si (si_offset) + b (0) * x (idx);
+             py[idx] = psi[0] + pb[0] * px[idx];
 
              if (si_len > 1)
                {
@@ -182,14 +198,17 @@
                    {
                      OCTAVE_QUIT;
 
-                     si (j + si_offset) = si (j + 1 + si_offset) + 
-                       b (j+1) * x (idx);
+                     psi[j] = psi[j+1] + pb[j+1] * px[idx];
                    }
 
-                 si (si_len - 1 + si_offset) = b (si_len) * x (idx);
+                 psi[si_len-1] = pb[si_len] * px[idx];
                }
              else
-               si (si_offset) = b (1) * x (idx);
+               {
+                 OCTAVE_QUIT;
+
+                 psi[0] = pb[1] * px[idx];
+               }
            }
        }
     }



-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------



reply via email to

[Prev in Thread] Current Thread [Next in Thread]