Re: package nan warnings

help-octave


From:	Alois Schloegl
Subject:	Re: package nan warnings
Date:	Sat, 04 Aug 2012 02:10:01 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.16) Gecko/20120613 Icedove/3.0.11

On 2012-08-03 19:21, Max Brister wrote:

On Thu, Aug 2, 2012 at 4:27 PM, Alois Schloegl<address@hidden>  wrote:

On 2012-08-02 22:43, Jordi Gutiérrez Hermoso wrote:


On 2 August 2012 16:40, Alois Schloegl<address@hidden>   wrote:

3) after installing the NaN-toolbox,  sum([1 NaN 2]) will still result in
NaN. But with the NaN-toolbox you have an additional function
sumskipnan([1,NaN,2]) which gives 3.
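
The difference described in point 3 can be sketched with core functions only. The helper name sum_skip_nan below is hypothetical: it merely imitates what sumskipnan is described as doing and is not the toolbox's implementation.

```octave
x = [1 NaN 2];

% Core sum() propagates NaN: any NaN in the input makes the result NaN.
s_core = sum (x);

% Hypothetical stand-in for a NaN-skipping sum (not the toolbox code):
% drop the NaN entries first, then sum what is left.
sum_skip_nan = @(v) sum (v(! isnan (v)));
s_skip = sum_skip_nan (x);

printf ("sum: %g, NaN-skipping sum: %g\n", s_core, s_skip);
```

Here s_core is NaN while s_skip is 3, matching the two results quoted above.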



Why don't you name all of your functions this way and not shadow core
functions, then? For example, why do you overwrite sumsq?

- Jordi G. H.




Ok, sumsq() is a borderline case because you might argue that is not
necessarily a statistical function.

But for the other functions, why should one need to think about whether to
use var() or nanvar(), mean() or nanmean(), std() or nanstd() ? There is no
need for the NaN-propagating version, you always should use the nan-skipping
version.


This is not always true. For example, let's say I want to write a
quick, simple test to see if rand is working. I might write something
like

assert (mean (rand (10000)(:)), .5, .1); # the mean value of rand should be around .5

I expect this case to fail if rand produces a NaN.
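
A minimal sketch of that expectation, assuming the core mean() is in effect (that is, the NaN package is not loaded and not shadowing it). The sample size is reduced and the NaN is injected by hand purely for illustration; rand itself is not expected to produce one.

```octave
x = rand (1000, 1);        % smaller sample than in the test above
x(10) = NaN;               % simulate a generator that emits a NaN

% Core mean() propagates the NaN, so the tolerance check raises an error.
failed = false;
try
  assert (mean (x), .5, .1);
catch
  failed = true;
end_try_catch
printf ("check failed: %d\n", failed);
```

With a NaN-skipping mean() the same assert would pass silently, which is the behaviour change being discussed.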



Hi Max,


thanks for your interest and your attempt to find a solution.

rand() never produces NaN, so it's not a good example. But let's assume there is some myrand() function that can produce NaN; I'd expect that NaN is an encoding for missing values. In that case, mean() should ignore the NaNs.

If you need to test for NaNs, do it explicitly using any(isnan(x(:))). That's much cleaner, and others will know that your code is testing for NaNs. The problem with implicit NaN propagation is that it is very difficult to know whether the NaN handling was a conscious decision or is just an arbitrary side effect.
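
As a rough illustration of the explicit style, the sketch below uses only core functions. The data values and variable names are made up for the example.

```octave
x = [0.2 NaN 0.7; 0.4 0.5 0.9];   % example data with one missing value

% Explicit check: states the intent and does not depend on
% how mean() happens to treat NaN.
has_nan = any (isnan (x(:)));

% Mean over the valid entries only, written with core functions.
valid = x(! isnan (x));
m = mean (valid);

printf ("has NaN: %d, mean of valid entries: %g\n", has_nan, m);
```

Written this way, the result is the same whether or not a NaN-skipping mean() is installed, because the NaNs are removed before mean() is called.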

When one tries to solve a challenging problem, why should one need to think
about whether to use var(), nanvar(), or some_other_varfunction() ? There is
just no need for such a proliferation of function names - all doing basically
the same.


As far as the user is concerned, I agree with you. If a user installs
the NaN package, then when they call 'var' they want the NaN-skipping version. I
do not think we should be spitting out a bunch of warnings as what the
user wants is unambiguous.

On the other hand, this creates an issue for scripts in core. Your
functions are doing basically, but not quite, the same thing. When
writing scripts in core I expect NaNs to be propagated. It leads to a
maintenance nightmare if you can not be sure of exactly how a function
behaves (see gnulib/autotools).

The functions in core and the NaN-tb are doing the same, except for the NaN-propagation thing. Even the core functions do not mention in the documentation that NaNs are propagated (see help mean, help var). So the NaN handling is really not strictly defined. Applications that rely on NaN propagation depend on undocumented behaviour. If you need to test for NaNs, you should do it in an explicit way, e.g. using any(isnan(x(:))). That avoids any ambiguity about NaN handling in your code.

Concerning your suggestion "to partition the namespaces (classes)": to me
this sounds like 2nd class citizens. But perhaps it's just me not being
familiar with this technique. In that case, it would be best if someone else
transformed the NaN-tb into a more compatible mode. I'm open to
suggestions.


A more practical solution would be to use a package [1]. The main
problem here is that Octave does not support packages (yet). What do
you think about having NaN inside of a package?

[1] http://www.mathworks.com/help/techdoc/matlab_oop/brfynt_-1.html

I do not know - the concept of "package" must be quite new, and I've never used it. It seems to me that it is another way to move the issue to some other namespace/class/package.

These "solutions" have one thing in common: they are just a bad compromise that sidesteps the real issue, namely what kind of NaN handling should be the default for statistical functions.

However, if you believe that there is some need for a compromise solution, a solution based on packages might be a good idea. In that case, just do it.

Alois


Max Brister
