help-octave

Kolmogorov-Smirnov test 2


From: tmac017
Subject: Kolmogorov-Smirnov test 2
Date: Fri, 21 Jun 2019 18:15:10 -0500 (CDT)

I was trying to use kolmogorov_smirnov_test_2 and got this warning:

warning: kolmogorov_smirnov_test_2: cannot compute correct p-values with ties
warning: called from
    kolmogorov_smirnov_test_2 at line 79 column 5

I saw there was another thread about this, but it didn't answer the question
and that thread is closed.  Since I spent some time looking at the code, I'm
re-posting.

The warning means the data contain ties: some values in the two sets are
exactly the same. This is a problem because the code sorts the values from
both sets into one series, and equal values cannot occupy distinct places in
that ordering. To avoid an error caused by the sorting, the function discards
the D value at those points.  I don't think this should cause any problems,
but it still prints the warning.
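
To make the tie handling concrete, here is a minimal sketch of how D can be
computed from the pooled, sorted data. It is written from my reading of the
approach, not copied from the function, and the samples x and y are made up
for illustration.

```octave
% Hypothetical samples; the value 2 occurs in both, so the pooled data has a tie.
x = [1 2];
y = [2 3];
n_x = numel (x);
n_y = numel (y);

% Sort the pooled values and remember which sample each one came from.
[s, idx] = sort ([x(:); y(:)]);

% Step +1/n_x for every x value and -1/n_y for every y value, so the
% running sum is F_x - F_y just after each sorted element.
steps = zeros (n_x + n_y, 1);
steps(idx <= n_x) = 1 / n_x;
steps(idx > n_x) = -1 / n_y;
z = cumsum (steps);

% Taking the maximum over every element also picks up the intermediate
% sums inside a run of equal values, which depend on the sort order.
d_naive = max (abs (z));

% Keep the running sum only at the last element of each run of equal values.
last_of_run = [diff(s) ~= 0; true];
d = max (abs (z(last_of_run)));

printf ("d_naive = %g, d = %g\n", d_naive, d);
```

With these samples the naive maximum is 1, while the tie-aware value is 0.5,
which matches the largest gap between the two empirical CDFs.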

I got this warning because I was using the function empirical_cdf to
generate a CDF for each data set over the same range, since the help text
said the function requires CDF inputs.  Based on the code, the function
actually takes the two raw data sets, not CDFs. Passing CDFs changes the size
of each set, which distorts the results.
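
As a rough illustration of why CDF inputs go wrong, the sketch below
evaluates an empirical CDF by hand on a common grid. The sample x and the
grid t are hypothetical, and the CDF is computed directly rather than with
empirical_cdf so the snippet has no package dependency.

```octave
x = [0.13 0.42 0.42 0.95 1.31];   % hypothetical raw sample
t = linspace (0, 2, 21);          % hypothetical common grid

% Empirical CDF of x at each grid point: fraction of values <= t.
Fx = arrayfun (@(v) mean (x <= v), t);

n_raw = numel (x);                 % size of the raw sample
n_cdf = numel (Fx);                % size of what would be passed instead
n_distinct = numel (unique (Fx));  % a step function repeats its values

printf ("raw: %d values, CDF: %d values, %d distinct\n", ...
        n_raw, n_cdf, n_distinct);
```

The CDF vector has the length of the grid rather than of the sample, and
because it is a step function most of its entries are repeats, which is
exactly the tie situation the warning complains about.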

Note: in the other thread, Hamish was having a hard time using the KS test
for
a = randn(2000,1); 
b = randn(2000,1); 
p = kolmogorov_smirnov_test_2(a,b) 

She got the same warning and the results weren't consistent. Ironically,
this is BECAUSE of the large set size.  The test statistic is
sqrt (n_x * n_y / (n_x + n_y)) * d.  Since the samples were randomly
generated, some deviation between them was expected. The large sample size
made the test more sensitive to that deviation, and increasing the sample
size further just makes it more sensitive still.
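
To show how the sample size enters, here is a small sketch that scales a
fixed d by the factor above and turns it into a two-sided p-value with the
standard asymptotic Kolmogorov series, 2 * sum ((-1)^(k-1) * exp (-2 k^2 ks^2)).
The value d = 0.05 is made up for illustration, the series is truncated at
100 terms, and I have not checked that the function evaluates its p-value in
exactly this way.

```octave
d = 0.05;     % hypothetical maximum distance between the two empirical CDFs

% Scaled test statistic for a given pair of sample sizes.
ks_stat = @(n_x, n_y) sqrt (n_x * n_y / (n_x + n_y)) * d;

% Asymptotic two-sided p-value from the Kolmogorov series (truncated).
k = 1:100;
ks_pvalue = @(ks) 2 * sum ((-1) .^ (k - 1) .* exp (-2 * k .^ 2 * ks ^ 2));

ks_small = ks_stat (20, 20);
ks_large = ks_stat (2000, 2000);
p_small = ks_pvalue (ks_small);
p_large = ks_pvalue (ks_large);

printf ("n = 20:   ks = %.4f, p = %.4f\n", ks_small, p_small);
printf ("n = 2000: ks = %.4f, p = %.4f\n", ks_large, p_large);
```

For the same d, the scaled statistic grows with the square root of the
sample size, so a distance that is unremarkable for 20 points per set gives
a small p-value for 2000 points per set.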



--
Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html


