[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: counting runs of a certain length in binary data
From: |
Paul Kienzle |
Subject: |
Re: counting runs of a certain length in binary data |
Date: |
Sat, 4 Mar 2000 01:19:27 +0000 (GMT) |
>On 3-Mar-2000, Mike Miller <address@hidden> wrote:
>
>| Here's a problem: I want to count the number of times in a sequence of
>| binary digits that I observe a run of N or more digits that are the same.
>| I think there must be an easy and efficient way to get Octave to pump out
>| an answer.
>|
>| For example, if this were my vector of binary digits
>|
>| [0 1 1 1 1 0 0 0 1 0 1 0 0 0 1 1 1 1 0 1 0 1]'
>| ^ # ^ # ^ # ^ #
>|
>| I would count four runs of length 2 or greater. (I've marked the
>| beginning of each run with '^' and the end with '#'.)
>|
>| I have been able to get Octave to tell me how many times I have a run of
>| length 2 or more using the code given in the third line below:
>|
>| octave:1> m=100; n=10;
>| octave:2> X=(rand(m,n)<.5);
>| octave:3> sum(diff([ones(1,n);abs(diff(X))])==-1)
>| ans =
>|
>| 24 25 27 24 25 31 23 26 25 26
>|
>| The first line tells the size of the matrix, the second line generates a
>| matrix of binary scores and the third line counts the number of runs of
>| length two or more for every column of the input matrix.
>|
>| Does anyone have any neat ways of counting runs of length greater than
>| two? I assume 'diff' would be used repeatedly somehow.
>
>Would the following work?
>
> y = [0; X; 0];
> find (diff (y) == -1) - find (diff (y) == 1)
>
>X is the vector of binary digits. The idea is that prepending and
>appending zeros ensures that the first nonzero element of diff(y) will
>be 1 and that the last nonzero element will be -1, and that there will
>be an equal number of 1 and -1 elements. Then the results of the two
>find commands should always have the same length, and their difference
>should be the run-lengths for the ones in the original vector.
>
>jwe
More generally, the following will work with runs of arbitrary values:
n = length(x);
idx = find(x(1:n-1) != x(2:n); # index where the run changes
runs = diff([0 idx n]); # convert indexes into run lengths,
# not forgetting the first or last runs
sum(runs>=N); # count the runs of length >= N
This can be expressed more compactly as:
n = length(x);
sum(diff([0 find(x(1:n-1) != x(2:n)) n])>=N)
Paul Kienzle
address@hidden
-----------------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.
Octave's home on the web: http://www.che.wisc.edu/octave/octave.html
How to fund new projects: http://www.che.wisc.edu/octave/funding.html
Subscription information: http://www.che.wisc.edu/octave/archive.html
-----------------------------------------------------------------------