Re: [OctDev] optimized lookup

octave-maintainers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [OctDev] optimized lookup

From:	John W. Eaton
Subject:	Re: [OctDev] optimized lookup
Date:	Tue, 19 Feb 2008 04:19:24 -0500

On 19-Feb-2008, Jaroslav Hajek wrote:

| OK, I've removed the function from Octave-Forge. What do you think
| about including
| this function into Octave (i.e. replacing the current m-file version?)
| 
| My primary motivation for optimizing this was the TODO comment in the
| current lookup.m,
| that actually suggests writing a such an implementation, pointing out
| the worse asymptotic
| complexity and real-life performance. It even seems there had been a
| binary search m-file implementation before, but was later rejected for
| being too slow.

As David said, we also have to consider whether the speedup is worth
the extra maintenence cost (our experience shows that many fewer
people who use Octave can effectively contribute fixes to C++ code).

In any case, if it is to be included a number of style issues need to
be addressed so that it is more consistent with the rest of Octave
(see below for some comments).  Although these changes may seem
trivial, I think it is important to keep the style consistent so that
the whole of Octave is easier to read.

| Copyright (C) 2008 VZLU Prague a.s., Czech Republic

Why not Copyright Jaroslav Hajek?  Does your employer require the
above statement?  Do they claim ownership of the code you write?  If
so, do you have permission from them to release the code under the
terms of the GPL?

| #include <oct.h>

In code that is part of Octave, you need to conditionally include
<config.h> and then only the subset of header files that are really
needed.  We don't want every file depending on (nearly) all the header
files in the Octave source tree.  The typical organization of header
files is

  <config.h>

  standard C/C++ headers

  liboctave headers

  octave headers

In each grouping, the headers are typically listed alphabetically.

| // use this to select an appropriate output index type
| 
| #define RETURN_REAL_INDICES
| #ifdef RETURN_REAL_INDICES
| typedef double ret_idx_type;
| typedef NDArray ret_idx_array;
| #elif USE_64_BIT_IDX_T
| typedef octave_uint64 ret_idx_type;
| typedef uint64NDArray ret_idx_array;
| #else 
| typedef octave_uint32 ret_idx_type;
| typedef uint32NDArray ret_idx_array;
| #endif

I'm not sure whether much is gained by returning ints, and in any
case, you should just use octave_idx_type so you don't have to check
bit sizes.

| // simple function templates & specialization is used
| // to avoid writing two versions of the algorithm
| 
| // generic comparison
| template<bool rev>
| inline bool dcomp(const double& a,const double& b);
| template<bool rev>
| inline bool dcompe(const double& a,const double& b);

Please use a space before the "(" that begins a function parameter
list and after commas.  Also, we usually separate function definitions
and declarations by a blank line so it is easier to see where one ends
and the next begins.

|   if (!nvals) return;

Please write this as

  if (! nvals)
    return;

|   const double *first, *last;
|   size_t cur;
|   // the trivial case of empty array
|   if (begin == end) 
|     {
|       for(;nvals;--nvals) *(idx++) = 0;
|       return;
|     }
|   // initialize 
|   last = end;
|   first = begin;

Please declare varibles with their first use if possible.  It might
also be good to provide an initial value for cur above.  So the code
above would be

  size_t cur = 0;
  // the trivial case of empty array
  if (begin == end) 
    {
      for (; nvals; --nvals)
        *(idx++) = 0;
      return;
    }

  // initialize 
  const double *last = end;
  const double *first = begin;


| bin_search:    
| store_cur:
|           if (dcompe<rev>(*last,*vals)) goto search_forwards;
|         if (dcomp<rev>(*vals,*first)) goto search_backwards;
|           if (dcompe<rev>(*last,*vals)) goto search_forwards;
|           if (dcomp<rev>(*vals,*first)) goto search_backwards;
| search_forwards:
| search_backwards:
|     if (!(--dist))
|       goto store_cur; // a shortcut
|     else
|       goto bin_search;

This algorithm seems quite complex with the labels and backward
branching.  Is this the cleanest way it could be written?  I imagine
some difficulty for someone later who might need to modify it or fix a
bug.

| DEFUN_DLD(lookup,args,nargout,"\
| -*- texinfo -*- \n\
| @deftypefn {Function File} address@hidden =} lookup (@var{table}, @var{y}) \n\

Please write

DEFUN_DLD (lookup, args, nargout,
  "-*- texinfo -*-\n\
@deftypefn {Function File} address@hidden =} lookup (@var{table}, @var{y})\n\

There is no need for spaces before the newline characters.

| If complex values are supplied, their magnitudes are used. \n\
| @end deftypefn \n")

The last newline character here doesn't do any harm, but it is not
necessary, so we don't usually include it.

|   int nargin = args.length();
| 
|   if (nargin != 2 || nargout > 1) 
|     {
|       print_usage();
|       return octave_value_list();
|     }

In most cases, we try to have a single return from functions, so the
preferred style for argument checking is

  if (arguments_are_ok)
    {
      // do the real work.
    }
  else
    print_usage ();

|   if (atable.ndims() > 2 || 
|       (atable.rows() > 1 && atable.columns() > 1)) 

When logical expressions span multiple lines, please write 

  if (atable.ndims() > 2
      || (atable.rows() > 1 && atable.columns() > 1))

|       error("table should be a vector");

Please begin error messages with the name of the function.  For
example,

  error ("lookup: table should be a vector");

|       return octave_value_list();
|     }
|   NDArray ay;
|   if (y.is_real_type()) 
|     ay = y.array_value();
|   else if (y.is_complex_type())
|     ay = y.complex_array_value().abs();
|   else 
|     {
|       error("y is not numeric type");
|     }
|
|   ret_idx_array idx(y.dims());

You shouldn't continue after calling error because it doesn't throw an
exception.

|   seq_lookup(atable.data(),(size_t)atable.numel(),ay.data(),
|       (size_t)ay.numel(),idx.fortran_vec());

Casts are discouraged, but if you must use them, please use C++ style
casts since they are easier to find.

|   return octave_value_list(octave_value(idx));

There is a constructor which will automatically convert an
octave_value object to an octave_value_list object, so the above can
be written as

  return octave_value (idx);

jwe

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [OctDev] optimized lookup, David Bateman, 2008/02/18
- Re: [OctDev] optimized lookup, Jaroslav Hajek, 2008/02/18
  - Re: [OctDev] optimized lookup, David Bateman, 2008/02/18
    - Re: [OctDev] optimized lookup, Jaroslav Hajek, 2008/02/18
    - Re: [OctDev] optimized lookup, David Bateman, 2008/02/18
    - Re: [OctDev] optimized lookup, Jaroslav Hajek, 2008/02/18
  - Re: [OctDev] optimized lookup, John W. Eaton, 2008/02/19
    - Re: [OctDev] optimized lookup, Jaroslav Hajek, 2008/02/19
    - Re: [OctDev] optimized lookup, John W. Eaton <=
    - Re: [OctDev] optimized lookup, Jaroslav Hajek, 2008/02/19
    - Re: [OctDev] optimized lookup, John W. Eaton, 2008/02/19
    - Re: [OctDev] optimized lookup, John W. Eaton, 2008/02/19
    - Re: [OctDev] optimized lookup, Jaroslav Hajek, 2008/02/20

Prev by Date: Re: goals for 3.1
Next by Date: Re: [OctDev] optimized lookup
Previous by thread: Re: [OctDev] optimized lookup
Next by thread: Re: [OctDev] optimized lookup
Index(es):
- Date
- Thread