Re: Overlapping Regexps
From: Kim Hansen
Subject: Re: Overlapping Regexps
Date: Mon, 31 Mar 2008 16:19:23 +0200
On Sun, Mar 30, 2008 at 6:26 PM, Bill Denney <address@hidden> wrote:
> When running the following,
>
> frag = {"MGTGGR" "R" "GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR" "NNLTR" \
> "LHELENCSVIEGHLQILLMFK" "TRPEDFR" "DLSFPK" "LIMITDYLLLFR" \
> "VYGLESLK" "DLFPNLTVIR"};
> seq = strcat (frag{:});
> cuts = regexp (seq, '[KR][^P]');
>
> the result is
> cuts = [6 41 46 67 74 80 92 100],
> but I expect for cuts to also find 7. In other words, I expected
> cuts = [6 7 41 46 67 74 80 92 100].
>
> On a related note, if there is overlap in matches, is there a way to
> make regexp return the overlapping matches? For example:
>
> a = "ababababab"
> b = regexp (a, "aba")
>
> returns b = [1 5] when I would like for it to return b = [1 3 5 7].
>
> Is this a bug in my understanding of regexp or in regexp?
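It is not a bug: regexp returns non-overlapping matches, because each
match consumes the characters it covers and the search resumes after
them. In your first example the match at 6 covers the R at 7 as well.
What you need is a "zero-width positive look-ahead assertion", which
checks the text that follows without consuming it. It is documented
for Perl in "man perlre". I have just tested it in Octave and it works
there too (Octave uses libpcre for regexps).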
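Your first regexp should be: "[KR](?=[^P])" or "[KR](?!P)". The two
differ only at the end of the string: "(?=[^P])" requires a following
character, while "(?!P)" also matches a K or R that is the last
character (here the R at position 110).
The second: "a(?=ba)"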
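
To make the difference concrete, here is a small sketch that runs the
original patterns next to the look-ahead versions on the data from the
quoted message. The variable names (cuts_consuming, cuts_lookahead,
cuts_negative, b_consuming, b_lookahead) are only illustrative, and it
assumes an Octave build whose regexp uses PCRE, as described above.

```octave
% Sequence taken from the quoted message.
frag = {"MGTGGR" "R" "GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR" "NNLTR" ...
        "LHELENCSVIEGHLQILLMFK" "TRPEDFR" "DLSFPK" "LIMITDYLLLFR" ...
        "VYGLESLK" "DLFPNLTVIR"};
seq = strcat (frag{:});

% Original pattern: [^P] consumes the character after K/R, so the
% R at position 7 is swallowed by the match that starts at 6.
cuts_consuming = regexp (seq, '[KR][^P]');

% Positive look-ahead: the next character must exist and not be P,
% but it is not consumed, so adjacent K/R are both reported.
cuts_lookahead = regexp (seq, '[KR](?=[^P])');

% Negative look-ahead: the next character must not be P; this also
% succeeds when there is no next character (end of string).
cuts_negative = regexp (seq, '[KR](?!P)');

% Second example from the quoted message.
a = "ababababab";
b_consuming = regexp (a, 'aba');       % matches cannot overlap
b_lookahead = regexp (a, 'a(?=ba)');   % only the "a" is consumed
```

With these inputs, cuts_lookahead should be [6 7 41 46 67 74 80 92 100],
cuts_negative should additionally contain 110, and b_lookahead should
be [1 3 5 7]. Note that each look-ahead match is then a single
character, so if you also need the matched text you would have to
take it from the string yourself using the start indices.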
--
Kim Hansen
Vadgårdsvej 3, 2.tv
2860 Søborg
Fastnet: 3956 2437 -- Mobil: 3091 2437