help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: counting runs of a certain length in binary data


From: Paul Kienzle
Subject: Re: counting runs of a certain length in binary data
Date: Sat, 4 Mar 2000 01:19:27 +0000 (GMT)

>On  3-Mar-2000, Mike Miller <address@hidden> wrote:
>
>| Here's a problem:  I want to count the number of times in a sequence of
>| binary digits that I observe a run of N or more digits that are the same.  
>| I think there must be an easy and efficient way to get Octave to pump out
>| an answer.
>| 
>| For example, if this were my vector of binary digits
>| 
>| [0 1 1 1 1 0 0 0 1 0 1 0 0 0 1 1 1 1 0 1 0 1]'
>|    ^     # ^   #       ^   # ^     #
>| 
>| I would count four runs of length 2 or greater.  (I've marked the
>| beginning of each run with '^' and the end with '#'.)
>| 
>| I have been able to get Octave to tell me how many times I have a run of
>| length 2 or more using the code given in the third line below:
>| 
>| octave:1> m=100; n=10;
>| octave:2> X=(rand(m,n)<.5);
>| octave:3> sum(diff([ones(1,n);abs(diff(X))])==-1)
>| ans =
>| 
>|   24  25  27  24  25  31  23  26  25  26
>| 
>| The first line tells the size of the matrix, the second line generates a
>| matrix of binary scores and the third line counts the number of runs of
>| length two or more for every column of the input matrix.
>| 
>| Does anyone have any neat ways of counting runs of length greater than
>| two?  I assume 'diff' would be used repeatedly somehow.
>
>Would the following work?
>
>  y = [0; X; 0];
>  find (diff (y) == -1) - find (diff (y) == 1)
>
>X is the vector of binary digits.  The idea is that prepending and
>appending zeros ensures that the first nonzero element of diff(y) will
>be 1 and that the last nonzero element will be -1, and that there will
>be an equal number of 1 and -1 elements.  Then the results of the two
>find commands should always have the same length, and their difference
>should be the run-lengths for the ones in the original vector.
>
>jwe

More generally, the following will work with runs of arbitrary values:

  n = length(x);
  idx = find(x(1:n-1) != x(2:n); # index where the run changes
  runs = diff([0 idx n]);        # convert indexes into run lengths,
                                 # not forgetting the first or last runs
  sum(runs>=N);                  # count the runs of length >= N

This can be expressed more compactly as:

  n = length(x);
  sum(diff([0 find(x(1:n-1) != x(2:n)) n])>=N)

Paul Kienzle
address@hidden



-----------------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.che.wisc.edu/octave/octave.html
How to fund new projects:  http://www.che.wisc.edu/octave/funding.html
Subscription information:  http://www.che.wisc.edu/octave/archive.html
-----------------------------------------------------------------------



reply via email to

[Prev in Thread] Current Thread [Next in Thread]