Re: Backreferences in character classes?

From: Assaf Gordon
Subject: Re: Backreferences in character classes?
Date: Sat, 20 Jan 2018 16:25:55 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0


On 2018-01-19 02:25 PM, Jack Bates wrote:
> Does sed support backreferences in character classes? The following doesn't work for me:
>
> echo "'foo'\"'\"'bar'" | sed "s/\([\"']\)\([^\1]*\)\1/\2/g"
> Expected: foo'bar
> Actual: foo'"'"'bar

No, back-references do not work inside character classes.
This is not specific to sed; the same pattern gives the same result in perl:

  $ echo "'foo'\"'\"'bar'" | perl -npe "s/([\"'])([^\1]*)\1/\2/g"
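To see what sed does with \1 inside brackets, here is a small sketch. It assumes POSIX bracket-expression rules, under which a backslash inside [...] is an ordinary character, so [^\1] means "anything except a backslash or the digit 1". The sample input is made up for illustration.

```shell
# Inside a bracket expression "\1" is not a back-reference; it is read as
# two literal characters, a backslash and "1". Deleting everything that
# matches [^\1] therefore leaves only backslashes and 1s behind.
result=$(printf '%s\n' 'a1b\c' | sed 's/[^\1]//g')
printf '%s\n' "$result"
```

If sed treated \1 as a back-reference here, the expression would not even be valid, since no group has been opened before it.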

I would suggest the following:

"sed -E" enables extended regular expressions, so there is no
need to escape the parentheses. This should be supported on all modern seds (including non-GNU). I will use -E in the examples below.
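As a quick illustration of the difference, the two commands below are meant to be equivalent; the input "abc" and the bracketed replacement are arbitrary choices for this sketch.

```shell
# Basic regular expressions: grouping parentheses must be escaped.
bre=$(echo abc | sed 's/\(b\)/[\1]/')

# Extended regular expressions (-E): plain parentheses group.
ere=$(echo abc | sed -E 's/(b)/[\1]/')

# Both forms should print the same line.
printf '%s\n%s\n' "$bre" "$ere"
```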

Since your character class contains only two characters (a single quote and a double quote), it is rather easy to break it down into a regular expression with alternation:

$ echo "'foo'\"'\"'bar'" | sed -E "s/(\"([^\"]*)\")|('([^']*)')/\2\4/g"

The "trick" is to replace with two back-references (\2 and \4). For each
match, exactly one branch of the alternation matched, so one of them holds the text between the quotes (e.g. foo or bar) and the other is guaranteed to be empty (because it belongs to the branch that didn't match).
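The same expression can be wrapped in a shell function and tried on more than one input. This is only a sketch: strip_quotes is a hypothetical helper name, and the second input line is made up for illustration.

```shell
# strip_quotes (hypothetical name): remove matching pairs of single or
# double quotes from each line of standard input, keeping their contents.
strip_quotes() {
  # \2 is the text between double quotes, \4 the text between single
  # quotes; whichever branch did not match contributes an empty string.
  sed -E "s/(\"([^\"]*)\")|('([^']*)')/\2\4/g"
}

# The input from the original question.
printf '%s\n' "'foo'\"'\"'bar'" | strip_quotes

# A line mixing both quote styles.
printf '%s\n' '"a" and '"'b'" | strip_quotes
```

Each additional quote character would need its own branch and its own back-reference in the replacement, so this approach suits small character classes best.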

Hope this helps,
 - assaf
