help-octave

Re: Problem with pdist


From: Jaroslav Hajek
Subject: Re: Problem with pdist
Date: Mon, 19 Jul 2010 13:48:05 +0200

On Thu, Jul 15, 2010 at 2:29 PM, Jaroslav Hajek <address@hidden> wrote:
> On Wed, Jul 14, 2010 at 8:09 PM, Esteban Cervetto
> <address@hidden> wrote:
>> Hello:
>>
>> I am having problems with the pdist function. It does not seem to be robust
>> enough to handle the following problem:
>>
>>
>> octave:22> pdist(X,"mahalanobis")
>>
>> error: memory exhausted or requested size too large for range of Octave's
>> index
>>
>> type -- trying to return to prompt
>>
>> octave:22> size(X)
>>
>> ans =
>>
>> 2 8993
>>
>> octave:23>
>>
>>
>>
>>
>>
>> However, Mahalanobis distance is commonly used in data mining studies, and a
>> sample of 8993 records and two variables is small.
>>
>>
>>
>> Is there a way to dramatically improve the efficiency of Octave
>> for functions related to data mining or pattern classification?
>>
>>
>>
>> Links to similar topics are welcome.
>>
>>
>>
>
> It seems the bottleneck here is Octave's nchoosek, which apparently
> sucks for nchoosek(1:N, 2) where N is several thousand.
> Hmmm. Perhaps something can be done there...
>

OK, I reimplemented nchoosek to loop by k rather than n and to move
less memory around. It is now feasible to compute
nchoosek (1:9000, 2). Note that the result alone takes about 650 MB of
memory (in doubles), so it really isn't something small.
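
The 650 MB figure can be checked with a little arithmetic. This is only a
back-of-the-envelope sketch, assuming the result is stored as a full matrix
of 8-byte doubles with one row per combination and two columns:

```octave
% Size of the result of nchoosek (1:N, 2) when stored as doubles.
N = 9000;
rows = N * (N - 1) / 2;      % number of 2-element combinations of 1:N
bytes = rows * 2 * 8;        % 2 columns per row, 8 bytes per double
printf ("%d rows, about %.0f MB\n", rows, bytes / 1e6);
```

The row count grows quadratically with N, so doubling N roughly quadruples
the memory needed.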

before:

octave:1> tic; nchoosek (1:1000, 2); toc
Elapsed time is 0.499599 seconds.
octave:2> tic; nchoosek (1:5000, 2); toc
Elapsed time is 71.9505 seconds.

now:

octave:1> tic; nchoosek (1:1000, 2); toc
Elapsed time is 0.014123 seconds.
octave:2> tic; nchoosek (1:5000, 2); toc
Elapsed time is 0.329267 seconds.
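
For the k = 2 case that pdist needs, the index pairs can also be built
directly into a preallocated matrix. The following is only an illustrative
sketch of that idea, not the actual nchoosek code; the variable names (P, r,
m) are made up for the example, and it handles k = 2 only:

```octave
% All 2-element combinations of 1:N, in the same row order as
% nchoosek (1:N, 2), written into a matrix that is allocated once.
N = 1000;
P = zeros (N * (N - 1) / 2, 2);  % preallocate the full result
r = 0;                           % number of rows filled so far
for i = 1:N-1
  m = N - i;                     % pairs whose first element is i
  P(r+1:r+m, 1) = i;             % first column: i repeated m times
  P(r+1:r+m, 2) = (i+1:N).';     % second column: i+1, ..., N
  r += m;
endfor
```

Because each block of rows is written exactly once, no intermediate results
are copied or concatenated along the way.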

best regards

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


