help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: trying to optimize my octave program


From: Tim Rueth
Subject: RE: trying to optimize my octave program
Date: Thu, 18 Mar 2010 10:45:13 -0700

You guys crack me up.  I didn't know programmers had such a good sense of
humor ;-)  I said this was my first *Octave* program (I know 750 lines isn't
really that much, and I did "hello world" a long time ago in Fortran on
punched cards).

Okay, here's the essence of my top-level script:

global sig long short c_thd r_thd t_thd b_thd;    # these are the loop
iteration vars
short_min =     5; short_max = 10; short_st = 1;
long_min = 20; long_max = 40; long_st = 1;
sig_min = 5; sig_max = 12; sig_st = 1;
c_thd_min =     -3; c_thd_max = -5; c_thd_st = -1;
r_thd_min =     1; r_thd_max = 3; r_thd_st = 1;
t_thd_min =     1; t_thd_max = 3; t_thd_st = 1;
b_thd_min =     0; b_thd_max = -2; b_thd_st = -1;

# Schmoo variables, calling function "analyze" for each data point
for sig = sig_min : sig_st : sig_max
  for long = long_min : long_st : long_max
    for short = short_min : short_st : short_max
      if short <= long      # which it should be, but just checking
        for c_thd = c_thd_min : c_thd_st : c_thd_max
          for r_thd = r_thd_min : r_thd_st : r_thd_max
            for t_thd = t_thd_min : t_thd_st : t_thd_max
              for b_thd = b_thd_min : b_thd_st : b_thd_max
                # Call function analyze, which uses global iteration vars to
operate on several global vectors
                [c_cntr b_cntr s_cntr avg val actg ddown pk_ddown] =
analyze(outfile);
                if actg > mx_actg
                  mx_short = short;
                  mx_long = long;
                  mx_sig = sig;
                  mx_c_thd = c_thd;
                  mx_r_thd = r_thd;
                  mx_t_thd = t_thd;
                  mx_b_thd = b_thd;
                end;
              end;
            end;
          end;
        end;
      end;
    end;
  end;
end;

Even though it's aesthetically beautiful, this 7-nested for-loop (evil!)
executes custom function "analyze" 6*21*8*3*3*3*3 = 81,648 times in this
case.  The iteration vars of the for-loops are global, and therefore are not
passed to the function as arguments.  Inside "analyze" is yet another
for-loop that runs 4288 iterations on several global vectors that are each
4288 x 1 in size.  So, the code inside analyze's for-loop gets executed
81648*4288 = 350e6 times.  Using tic/toc, it takes about 4.5 sec for this
function to execute.  So, 4.5sec/4288 = only 0.001 sec for the code inside
the loop within "analyze" to run.  With these parameters, the whole script
will run for about 81648*4.5 sec = 4.25 days.

Judd had mentioned that perhaps I'm not doing enough "work" inside the
inner-most loop.  Good point, since it only runs for 0.001 sec.  However, I
don't see a way to get more work done there without moving the next-outer
for-loop into the function, which is pointless.  Even though the inner-most
loop runs for only 0.001 sec, maybe I just need to focus on trying to
optimize this down even further.  Any reduction * 350e6 could be huge.
Which brings me to my original question of whether porting "analyze" to C
and compiling it would help much.  Or, from a memory management perspective,
since I have 11 vectors, each having a 4288x1 size, can/should I convert
these to int8, single, etc., data types within the vectors if possible?  I
don't see much point here, since it really isn't that much memory savings
overall.  If anyone has any other ideas, let me know.  You guys have been
quite supportive, and I appreciate it.

--Tim

-----Original Message-----
From: Jaroslav Hajek [mailto:address@hidden 
Sent: Wednesday, March 17, 2010 11:26 PM
To: address@hidden
Cc: Jordi Gutiérrez Hermoso; address@hidden
Subject: Re: trying to optimize my octave program

2010/3/17 Tim Rueth <address@hidden>:
> Jordi--
> Well, I certainly could, but it's >750 lines (incl. comments and white 
> space), including several functions.

Oh my, 750 lines! And comments, too! I'm scared :) No, seriously, show us
the code. If you are a novice, chances are good that there's a lot to
optimize. You don't need to show everything, surely you can identify some
critical parts yourself. tic/toc may help you.

> I doubt anyone would want to slog
> through this.  Also, the algorithm contains intellectual property 
> which is hard to pull out (but I could sanitize with some effort).

What? You mean it's trade secret?
I hope you realize that you (or your employer) hold the copyright to your
code, so nobody is implicitly entitled to redistribute it or even use it
without your permission.
You can omit the list if it's really a critical material (which I doubt,
given that it's supposed to be your starter), so that it won't be publicly
readable. If you don't trust *us* to obey the law, you may have a better
luck elsewhere.

PS. did you really mean it that your first program was 750 lines? No "hello
world"? Or did you mean first "real" program?

regards


--
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU) Prague, Czech Republic
url: www.highegg.matfyz.cz
-----Original Message-----
From: Judd Storrs [mailto:address@hidden 
Sent: Thursday, March 18, 2010 6:42 AM
To: address@hidden
Cc: address@hidden
Subject: Re: trying to optimize my octave program

What you need to try and figure out is where the time is being spent.
Vectorization can work to get things fast but at a certain point the gains
can be offset by increased and less efficient use of memory. The more
tightly packed you can get your memory the better the hardware can work to
keep data flowing through the CPU. Purging for loops isn't always the best
solution. Generally speaking, the more work that is done with each pass
through the for loop, the less the execution time difference between
for-based and vectorized code. These links concern a different but somewhat
similar language called IDL:

Are FOR Loops Evil?
http://www.dfanning.com/tips/forloops.html

Are FOR Loops Really Evil?
http://www.dfanning.com/tips/forloops2.html

Not knowing even in general what sort of problems your code is facing, it's
difficult to suggest strategies. Now going out into wild speculation land
what I understand is that

* Your code takes over one day to complete
* Your code spends that time in a loop that takes possibly less than 0.01
sec

This implies that your loop is executed at least 24*60*60/0.01 = 8640000
times. If that isn't anywhere near how many times it is executed then your
cycles are being wasted elsewhere. In wild speculation land I would say your
inner loop isn't "doing enough work"
with each pass.

Again its difficult for us to guess about these things so if you can somehow
extract part of the algorithm or do some more detective work on where the
hangup is we may be able to help more specifically--as is all you've told us
is that you wish your code ran faster. Without more hints it's hard to help
you.


--judd




reply via email to

[Prev in Thread] Current Thread [Next in Thread]