help-octave

Re: kolmogorov smirnov test. Normal distribution


From: Vitaly Repin
Subject: Re: kolmogorov smirnov test. Normal distribution
Date: Tue, 22 Apr 2008 14:23:00 +0300

Hello!

Thanks a lot for your responses.  But I am still a little confused.

Let me show why.  The data I used:

[ 7.11, 6.73, 6.95, 7.25, 7.25, 7.03, 7.10, 7.15, 6.78, 7.09, 7.37, 7.22, 6.82, 6.72, 6.95 ]

I used the Minitab statistical software to run a normality test on this data using the Kolmogorov-Smirnov criterion.

The visual result is presented here: http://www.flickr.com/photos/vitalyrepin/2433059563/

The numerical results: 

Mean    7.035
StDev    0.2041
N          15
KS        0.140
P-Value    >0.150

And now I use Octave to try to obtain the same results.  Let me show the session and comment on it.


> X=[7.11, 6.73, 6.95, 7.25, 7.25, 7.03, 7.10, 7.15, 6.78, 7.09, 7.37, 7.22, 6.82, 6.72, 6.95 ]


> mean(X)
> ans = 7.0347

OK.  This matches the result I got from Minitab.

> std(X)
> ans = 0.2040

OK.  This also matches the result I got from Minitab.

> [pval, ks] = kolmogorov_smirnov_test(X, 'normal', mean(X), var(X), "<>");
> pval
> pval = 0.92970
> ks
> ks = 0.54297

In Minitab (see above) I got KS = 0.140, which is very different from the value Octave reports (ks = 0.54297).
I still can't see why.  Do I have crappy statistical software (Minitab)?  Or am I using Octave incorrectly?
Or is it something else?
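
One possible explanation, offered as a sketch and not as a confirmed reading of either program: the two numbers may differ only by a factor of sqrt(N), since 0.54297 / sqrt(15) is about 0.140. The classical Kolmogorov-Smirnov statistic D is the largest gap between the empirical CDF and the hypothesized CDF. The Octave code below computes D by hand using only core functions (sort, erfc). The variable names are illustrative, and the code assumes the normal CDF is parameterized by the sample mean and the sample standard deviation.

```octave
% Data from the message above.
X = [7.11, 6.73, 6.95, 7.25, 7.25, 7.03, 7.10, 7.15, 6.78, 7.09, 7.37, 7.22, 6.82, 6.72, 6.95];
n = numel(X);
xs = sort(X);

% Normal CDF with the sample mean and standard deviation,
% written with erfc so that no extra package is needed.
z = (xs - mean(X)) / std(X);
F = 0.5 * erfc(-z / sqrt(2));

% Largest gap between the empirical CDF steps and F,
% checked just after and just before each jump.
D_plus  = max((1:n) / n - F);
D_minus = max(F - (0:n-1) / n);
D = max(D_plus, D_minus);

printf("D         = %.4f\n", D);
printf("sqrt(n)*D = %.4f\n", sqrt(n) * D);
```

If this reading is right, D should come out near Minitab's 0.140 and sqrt(n)*D near the 0.54297 that Octave printed, which would mean the two programs agree and only scale the statistic differently.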

Thank you in advance.

Good bye!


On Mon, Mar 17, 2008 at 9:11 PM, Thomas Shores <address@hidden> wrote:
Vitaly Repin wrote:
Hello!

I am really confused by kolmogorov_smirnov_test now.

I have prepared the X vector of normally distributed values using
proprietary statistical software.

I then tried to check the distribution for normality using the
kolmogorov_smirnov_test function.  Here is my Octave session:


X=[7.11, 6.73, 6.95, 7.25, 7.25, 7.03, 7.10, 7.15, 6.78, 7.09, 7.37, 7.22, 6.82, 6.72, 6.95];
kolmogorov_smirnov_test(X, "normal");
pval: 1.87184e-13

kolmogorov_smirnov_test(X, "uniform");
pval: 1.87184e-13

So, the pval values are identical for uniform and normal distributions. What does it mean?

Am I using this function correctly?

Thank you in advance.


 
This sample is very small, so I wouldn't place a whole lot of faith in distribution hypothesis testing.  Just for the record, if you try the following distribution, you get

kolmogorov_smirnov_test(X,'stdnormal')
pval: 1.87184e-13

The p-value is the probability, assuming the hypothesized distribution is the true one, of seeing a sample that deviates from it at least as much as this one does. So under all three distributions this sample is essentially impossible.  I haven't examined the code, but it looks like you're getting a lower bound on some computed cdf whose real lower bound is, of course, zero.
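
To see why the calls above can give the same tiny p-value, the sketch below evaluates the default CDFs by hand. It assumes that omitting the parameters means a standard normal (mean 0, variance 1) and a uniform distribution on [0, 1], and it uses the leading term of the asymptotic Kolmogorov series, 2*exp(-2*n*D^2), as a stand-in for whatever the function computes internally. Both are assumptions, not taken from the Octave source, and the variable names are illustrative.

```octave
X = [7.11, 6.73, 6.95, 7.25, 7.25, 7.03, 7.10, 7.15, 6.78, 7.09, 7.37, 7.22, 6.82, 6.72, 6.95];
n = numel(X);
xs = sort(X);

% Hypothesized CDFs with default parameters (assumed defaults).
F_norm = 0.5 * erfc(-xs / sqrt(2));   % standard normal: mean 0, variance 1
F_unif = min(max(xs, 0), 1);          % uniform on [0, 1]

% Kolmogorov-Smirnov distance for each hypothesis: compare the CDF
% with the empirical CDF just before and just after each jump.
step_lo = (0:n-1) / n;
step_hi = (1:n) / n;
D_norm = max([abs(F_norm - step_lo), abs(F_norm - step_hi)]);
D_unif = max([abs(F_unif - step_lo), abs(F_unif - step_hi)]);

% Leading term of the asymptotic p-value series.
p_norm = 2 * exp(-2 * n * D_norm^2);
p_unif = 2 * exp(-2 * n * D_unif^2);

printf("D_norm = %.6f, p ~ %.3e\n", D_norm, p_norm);
printf("D_unif = %.6f, p ~ %.3e\n", D_unif, p_unif);
```

Every data point lies far to the right of both default distributions, so both CDFs are about 1 across the whole sample and D is about 1 in each case. With n = 15 the leading term is 2*exp(-30), roughly 1.87e-13, which is close to the pval printed above.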

However, you are also applying the test incorrectly, because you need to include parameters that shape the distribution in question.  Try this:

> kolmogorov_smirnov_test(X,'normal',mean(X),var(X));
pval: 0.929705
> kolmogorov_smirnov_test(X,'uniform',min(X),max(X));
pval: 0.985147
> kolmogorov_smirnov_test(studentize(X),'stdnormal');
pval: 0.929682

Now you have the flip side of the coin: with p-values this high, the test cannot rule out either distribution for this sample.  You should try a larger sample.

Thomas Shores






--
WBR & WBW, Vitaly