Re: RE for any text, including white space
From: ken
Subject: Re: RE for any text, including white space
Date: Wed, 16 Mar 2011 19:43:22 -0400
User-agent: Thunderbird 2.0.0.24 (X11/20101213)
On 03/16/2011 06:05 PM PJ Weisberg wrote:
> On Wed, Mar 16, 2011 at 2:53 PM, ken <gebser@mousecar.com> wrote:
>> On 03/16/2011 03:40 PM PJ Weisberg wrote:
>>> On 3/16/11, ken <gebser@mousecar.com> wrote:
>>>> What's the RE for any text, white space included? I also want to grab
>>>> (for match-string...) this text. The text is bounded by known
>>>> characters. E.g.,
>>>>
>>>> <h3>Any Text-- <a name="thisname">
>>>> Hot Stuff</h3>
>>>> In the above, how to grab the text of the title, i.e., everything
>>>> between <h3> and </h3>? Conceivably this title text might contain
>>>> *anything* except "</[Hh][1-9]".
>>>>
>>> If A and B are your start and end points, then you want:
>>>
>>> "A\\(.\\|\n\\)*?B"
>> That's almost it, but not quite. It grabs only the last character
>> before the "B"; in my example above it grabs just "f". I need to grab:
>>
>> "Any Text-- <a name="thisname">
>> Hot Stuff"
>>
>> -- without the quotes, of course.
>
> Well, it *matches* the whole thing; it's just that the parentheses
> only grab the last character. Put in another set of parentheses
> around the part you want to capture, and you're golden.
>
> "<h3>\\(\\(.\\|\n\\)*?\\)</h3"
>
> -PJ
Cool. That worked!! PJ, you're /The Man/.
Somewhere in the many docs on REs I read that you couldn't nest the
grouping syntax, \\(...\\), so I never tried what you did. Trying a lot
of different \\([...]*\\) variations didn't work at all (even with more
'\'s). So this was kind of a big learn.
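
For anyone finding this in the archive, here is a minimal sketch of the difference between the two regexps, assuming the sample heading from earlier in the thread sits in a temporary buffer. The names my-h3-sample and my-h3-title are made up for illustration, and the third call uses a shy group, \\(?: ... \\), which was not part of PJ's answer.

```elisp
;; Illustrative only: `my-h3-sample' and `my-h3-title' are made-up names.
(defvar my-h3-sample
  "<h3>Any Text-- <a name=\"thisname\">\nHot Stuff</h3>\n"
  "The sample heading from earlier in the thread.")

(defun my-h3-title (regexp group)
  "Search `my-h3-sample' for REGEXP and return the text of GROUP, or nil."
  (with-temp-buffer
    (insert my-h3-sample)
    (goto-char (point-min))
    (let ((case-fold-search t))         ; accept <H3> as well as <h3>
      (when (re-search-forward regexp nil t)
        (match-string group)))))

;; One group: it is repeated by *?, so it keeps only its last iteration.
(my-h3-title "<h3>\\(.\\|\n\\)*?</h3" 1)
;; => "f"

;; Outer group added: groups are numbered by their opening paren, so
;; group 1 is the whole title and group 2 is the repeated inner group.
(my-h3-title "<h3>\\(\\(.\\|\n\\)*?\\)</h3" 1)
;; => "Any Text-- <a name=\"thisname\">\nHot Stuff"

;; Shy inner group: it groups without capturing, so only one numbered
;; group exists and it holds the title.
(my-h3-title "<h3>\\(\\(?:.\\|\n\\)*?\\)</h3" 1)
;; => "Any Text-- <a name=\"thisname\">\nHot Stuff"
```

The whole match (group 0) is the same in all three cases; only what the numbered groups hold changes.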
Thanks much,
Ken