Shuffling elements in a dataset (fwd)

help-octave

From:	Ted Harding
Subject:	Shuffling elements in a dataset (fwd)
Date:	Fri, 21 Feb 1997 23:54:21 +0000 (GMT)

( Re Message From: address@hidden )
> 
> I need to randomize the order of (shuffle) very large datasets.
> The way I devised, randomly sampling with elimination, is not
> very efficient. Is there a better way, using Octave's matrix
> manipulation?
> 
> My way:
> 
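>     num = rows(data);
>     new_data = zeros(size(data));      # preallocate the result
>
>     for i = 1:num
>         rn = ceil(rand * rows(data));  # pick one of the remaining rows
>         new_data(i,:) = data(rn,:);
>         data(rn,:) = [];               # eliminate the chosen row
>     endfor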
> 
> Better way: perhaps creating a vector of unique indexes? But how to
> do this?
> 
>     idx = 1:rows(data);
>     # ... now shuffle idx ...
>     new_data = data(idx,:);
> 
> Of course, this is the same problem in one dimension...

I find that something like

   [dummy,ix] = sort(rand(1,rows(x)));  new_x = x(ix,:);

seems pretty fast: 0.04 secs for 10000 rows and 0.05 secs for 100000 or
1000000 rows on a 386-DX/25MHz; 0.003 secs for 10000 rows and 0.004 secs
for 100000 or 1000000 rows on a Pentium-120, i.e. almost independent of
the number of rows. For 10000000 rows, however, it starts swapping and
takes a while (48 MB RAM). These timings are for 1 column only; reduce
the sizes pro rata for extra columns (RAM limit).
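
The idea behind the one-liner is that sorting a vector of random keys
returns, as its second output, the indices 1..n in random order, and
indexing with that vector reorders whole rows at once. A minimal sketch
follows; the small matrix x is only hypothetical example data, and the
last line assumes an Octave that provides randperm:

```octave
% Hypothetical 10-by-2 dataset, used only for illustration.
x = [(1:10)', (101:110)'];
n = rows(x);

% Sort n random keys; ix is then a random permutation of 1:n.
[dummy, ix] = sort(rand(1, n));
new_x = x(ix, :);                 % reorder whole rows in one indexing step

% Sanity checks: every row index appears exactly once, and the
% shuffled matrix holds the same rows as the original.
assert(sort(ix), 1:n);
assert(sortrows(new_x), sortrows(x));

% Alternative, assuming randperm is available in your Octave:
new_x2 = x(randperm(n), :);
```

Wrapping the sort line in tic ... toc is one way to reproduce timings
like those above on your own machine.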

Ted.                                    (address@hidden)
