octave-maintainers

Re: More extended performance profiling


From: Rik
Subject: Re: More extended performance profiling
Date: Mon, 12 Aug 2019 09:18:18 -0700

On 08/12/2019 08:00 AM, John W. Eaton wrote:
On 8/11/19 8:48 PM, Rik wrote:
I was able to get versions 3.2.4 and 4.0.3 to compile, so I now have a longer baseline of performance measurements.  Using bm_toeplitz.m, there has been a slowdown of more than 100% from 3.2.4 to current dev.

Version       3.2.4    3.4.3  3.6.4  3.8.2  4.0.3    4.2.1    4.4.1    5.1.0    dev (6.1.0)
runtime (s)   5.8968   -      -      -      10.055   10.544   13.052   13.481   13.291

(-: version not yet built)

Is there a bug report open where we could track this info?

I have opened a new report here https://savannah.gnu.org/bugs/index.php?56752.


Please post the changes you made to build 3.2.4 with current tools.  I tried but ran into numerous problems and if you have already done the work it would save others time in trying to duplicate the issues.


Going backwards is hard.  I still haven't got 3.4 - 3.8 to build.  I will post my configure scripts and patches for each version (eventually) to bug #56752.

The bm_toeplitz script includes an indexed assignment, a function call, and a binary arithmetic operation in the loop.  As a quick check to see whether just one of these might be the greatest contributing factor to the slowdown, could you try simple loops that have only one of these features at a time?  For example, try replacing the statement in the loop with

  b(k,j) = 13;   ## indexed assignment only, with a constant value

  abs(13);       ## function call only with a constant value (Octave is not smart enough to eliminate the call or move it out of the loop)

  abs(k);        ## function call only with one variable value lookup

  k+1;           ## simple binary arithmetic with one variable value lookup (Octave is dumb, so it will do this operation every time through the loop).
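
One way to run these variants side by side is sketched below. It is only an illustration: the loop size n = 1000, the variable names, and the two-level loop shape are assumptions borrowed from the empty-loop test in this thread, not the actual contents of bm_toeplitz.m. Each loop body is timed separately with tic/toc and reported relative to the empty loop.

```octave
## Sketch only: n and the loop shape are assumptions, not taken from bm_toeplitz.m.
n = 1000;
b = zeros (n, n);
labels = {"empty body", "b(k,j) = 13", "abs (13)", "abs (k)", "k+1"};
t = zeros (1, numel (labels));

## Baseline: nested loops with an empty body.
tic;
for k = 1:n
  for j = 1:n
  endfor
endfor
t(1) = toc;

## Indexed assignment only, with a constant value.
tic;
for k = 1:n
  for j = 1:n
    b(k,j) = 13;
  endfor
endfor
t(2) = toc;

## Function call only, with a constant argument.
tic;
for k = 1:n
  for j = 1:n
    abs (13);
  endfor
endfor
t(3) = toc;

## Function call only, with one variable lookup.
tic;
for k = 1:n
  for j = 1:n
    abs (k);
  endfor
endfor
t(4) = toc;

## Binary arithmetic only, with one variable lookup.
tic;
for k = 1:n
  for j = 1:n
    k+1;
  endfor
endfor
t(5) = toc;

## Report each timing and its difference from the empty loop.
for m = 1:numel (labels)
  printf ("%-12s %8.4f s  (%+.4f s vs. empty loop)\n", labels{m}, t(m), t(m) - t(1));
endfor
```

Subtracting the empty-loop time gives a rough per-feature cost. Single runs are noisy, so repeating each measurement a few times and taking the minimum would probably give steadier numbers.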

Also, does it matter whether there is more than one index in the indexed assignment expression or more than one nested loop?

What happens with an empty loop body?

This was my first thought.  I re-purposed the original for loop test with an empty body

a = 1; b = 1; tic; for i=1:1000; for j=1:1000;   ; end; end; toc

Results below do show a slowdown of ~100%.

Version           3.2.4      3.4.3  3.6.4  3.8.2  4.0.3      4.2.1      4.4.1      5.1.0    dev (6.1.0)
bm.empty_loop.m   0.053467   -      -      -      0.0782108  0.0723422  0.128019   0.1248   0.096776

(times in seconds; -: version not yet built)

However, the absolute numbers are very small: roughly 50 milliseconds of difference over 1 million loop iterations.  Applied to the toeplitz benchmark, that would account for only around 100 milliseconds, whereas I'm seeing about 7 seconds of difference between 3.2.4 and dev.
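
For reference, the per-iteration cost and the relative slowdown can be worked out directly from the empty-loop timings above. This is just a small arithmetic sketch: the variable names are illustrative, the timings are copied from the table, and the iteration count is the 1000 x 1000 loop from the test.

```octave
## Empty-loop timings (seconds) for the versions that were built.
versions = {"3.2.4", "4.0.3", "4.2.1", "4.4.1", "5.1.0", "dev"};
t_empty = [0.053467, 0.0782108, 0.0723422, 0.128019, 0.1248, 0.096776];

iters = 1000 * 1000;                              # total loop iterations
per_iter_ns = t_empty / iters * 1e9;              # nanoseconds per iteration
slowdown_pct = (t_empty / t_empty(1) - 1) * 100;  # percent slower than 3.2.4

for m = 1:numel (versions)
  printf ("%-6s %7.1f ns/iter  %+7.1f%%\n", versions{m}, per_iter_ns(m), slowdown_pct(m));
endfor
```

With these numbers the loop overhead is on the order of tens to low hundreds of nanoseconds per iteration, which is consistent with it being a small part of the toeplitz slowdown.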


Using the current sources, is there any significant difference when running with the GUI vs. octave-cli?

Not significant.  I ran the toeplitz benchmark with the following run-octave options (times in seconds):

-f --no-gui-libs : 13.472
-f --no-gui      : 13.715
-f --gui         : 13.800

--Rik

