Re: handling NaN

From: Paul Kienzle
Subject: Re: handling NaN
Date: Thu, 8 Aug 2002 12:08:42 -0400

On Wed, Aug 07, 2002 at 11:28:54PM +0200, Schloegl Alois wrote:
> On Wed, 31 Jul 2002, John W. Eaton wrote:
> > On 31-Jul-2002, Paul Kienzle <address@hidden> wrote:
> > 
> > | Or you can use nanmax as defined in octave-forge (,
> > | which does the same thing:
> > 
> > I've also noticed that Matlab now removes NaN automatically in min and
> > max (but apparently not yet in std, mean, etc.).  Although I would
> > prefer to distinguish between missing and NaN, maybe Octave should
> > ignore NaNs in min and max for compatibility?
> I guess this is one of those "should octave do bug-for-bug compatibility?"
> type questions.
> <rant>
> My personal opinion is that min or max _should_ by default
> retain NaN values. NaN normally means that something significant
> happened. That significant event should be retained by default.
> The choice to ignore NaN should be semantically explicit,
> such as by using nanfunction.
> </rant>
> If the octave-forge implementation of nanmax.m is too slow,
> then we can always rewrite it as a *DLD function.
> ----------------------------------------------------
> Andy,
> from the viewpoint of statistics and stochastic signal processing, I cannot
> agree with your opinion.
> 1) If NaNs are important, they can always be tested for with ISNAN.
> 2) There is also no reason to have two different functions doing basically
> the same thing (like FUN and NANFUN). Most of the time it is more difficult
> to decide whether FUN or NANFUN is better than to do an explicit check with
> ISNAN.
> 3) The code is more readable if you use an explicit check for NaNs rather
> than an implicit check (by using FUN or NANFUN).
> The consequences are that NaNs should be omitted by default.

The consequences are that MISSING VALUES should be omitted by default.  NaNs are
not missing values in many (most) computations, but instead are things that
tell you that your computation has gone awry.  If your code does not explicitly
deal with these problems itself (e.g., by using nanfun), then the invalid value
should be propagated back to somewhere that the user can see it.
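
To make the distinction concrete, here is a minimal sketch in C of the two
behaviours being argued over.  The names max_propagate and max_skip_nan are
hypothetical, chosen only for illustration; this is not the octave-forge
nanmax code, just the same idea written out.

```c
#include <math.h>
#include <stddef.h>

/* Propagating maximum (hypothetical name): any NaN in the input makes
   the result NaN, so a computation that went awry stays visible.
   Returns -INFINITY for an empty input. */
double max_propagate(const double *x, size_t n)
{
    double m = -INFINITY;
    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i]))
            return x[i];          /* hand the NaN back to the caller */
        if (x[i] > m)
            m = x[i];
    }
    return m;
}

/* NaN-skipping maximum (hypothetical name), in the spirit of nanmax:
   NaNs are treated as missing and ignored.  The result is NaN only if
   there is no non-NaN element at all. */
double max_skip_nan(const double *x, size_t n)
{
    double m = NAN;
    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i]))
            continue;             /* treat NaN as a missing value */
        if (isnan(m) || x[i] > m)
            m = x[i];
    }
    return m;
}
```

With the input {1, NaN, 3}, the first function returns NaN and the second
returns 3.  The question in this thread is which of the two the plain max
should be.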

One suggestion is to define a constant NA to represent missing values.  These
could then be ignored in all the stats functions.  I believe this would work
well enough, except for the people coming from Matlab who want to use NaN to
represent missing values.

It might work better to do the following:  Use NaN for missing values and
treat it appropriately in the stats functions as Alois is doing, but make
use of the ieeefp invalid operation sticky bit.  Every time Octave returns
to the command line it could test whether that sticky bit was set, report it
if so, and clear it before the next command.  That way the user will always
be informed of an unexpected NaN, and expected NaNs are ignored.  Note that
we will also need functions to test and clear the sticky bit within our
scripts so that the user is not notified about NaNs we have already
accounted for.

Paul Kienzle
