unexpected match of s command regexp at ^
From: Christoph Anton Mitterer
Subject: unexpected match of s command regexp at ^
Date: Tue, 28 Sep 2021 03:20:51 +0200
User-agent: Evolution 3.38.3-1
Hey.
I don't quite understand why the following behaves as it does:
The general idea was that I have a string where multiple key=value
pairs or singleOptions are separated by "," and any number of
consecutive "," are allowed before/after such words.
What I wanted to do was, check for unknown options, e.g. by doing
something like this in an s/// command:
for readability:
\(
\(^\|,\+\)
\(
\(foo\|bar\|baz\)=[^,]*
\|
single
\)
\)*
A: s/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*//
If that leaves just zero or more ",", only valid options have been
used.
I thought a better version of (A) would be:
B: s/^\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*,*$//
It's anchored at the very beginning (before the outer \( ... \)*) and
already removes any trailing "," up to the anchor at the end.
Thinking about (B), I experimented a bit more with (A) and noticed the
following, which I cannot explain (and hope someone here knows why):
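As a rough sketch of how (B) could serve as a validity check: the line should become empty when only known options are present and stay unchanged otherwise. This assumes GNU sed, since \| and \+ are GNU extensions in basic regular expressions; the helper name check_opts and both sample strings are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints whatever is left of the option string
# after applying (B). Assumes GNU sed (\| and \+ in a basic regexp).
check_opts() {
    printf '%s\n' "$1" |
        sed 's/^\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*,*$//'
}

# Only known options: the whole line matches and is removed.
valid_left=$(check_opts ',,single,,,bar=value,foo=,,,,')

# "unknown=1" is not a known option: (B) cannot match, the line is kept.
invalid_left=$(check_opts 'single,unknown=1,foo=x')

printf 'valid:   [%s]\n' "$valid_left"
printf 'invalid: [%s]\n' "$invalid_left"
```

An empty result then means "only valid options", and anything else is the untouched input.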
printf '%s\n' 't,,,,,,,,single,,,bar=value,foo=,,,,' | \
sed 's/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*//'
(that's (A))
I'd have expected this to give me:
t,,,,
That is: the "t" in the beginning and the final ",,,,", however it
doesn't, instead it gives:
t,,,,,,,,single,,,bar=value,foo=,,,,
It does though when I use:
C: s/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)//g
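For reference, (C) applied to the same input could be run like this (a sketch assuming GNU sed; the variable name c_out is only there for illustration):

```shell
#!/bin/sh
# (C): no outer "*", but the g flag removes every ",...,option" run.
c_out=$(printf '%s\n' 't,,,,,,,,single,,,bar=value,foo=,,,,' |
    sed 's/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)//g')

# Prints the leading "t" followed by the four trailing commas.
printf '%s\n' "$c_out"
```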
Eventually I tried:
A_: s/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*/_/
(that's (A), but replacing with "_")
printf '%s\n' 't,,,,,,,,single,,,bar=value,foo=,,,,' | \
sed 's/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*/_/'
and I saw that this yields:
_t,,,,,,,,single,,,bar=value,foo=,,,,
That kind of explains why ",,,,,,,,single,,,bar=value,foo=" isn't
removed: the expression matches at the beginning and then the "t"
interrupts the * operator (which is where s///g behaves differently, I
assume).
But why on earth does it (A / A_) match in the beginning?
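A possible explanation, offered as a sketch rather than a confirmed answer: a group followed by * may repeat zero times, so the whole expression can match the empty string, and s/// without g replaces only the leftmost match. Before the "t", zero repetitions already succeed, so the empty match at position 0 is taken. The same effect shows up with a much smaller expression (GNU sed assumed for the second command; the inputs and variable names are made up for illustration):

```shell
#!/bin/sh
# x* can match zero "x", so it matches the empty string before "a".
empty_out=$(printf '%s\n' 'abc' | sed 's/x*/_/')
printf '%s\n' "$empty_out"

# Same effect with (A_): zero repetitions of the group match before "t",
# and nothing after the "t" is touched.
a_out=$(printf '%s\n' 't,,single' |
    sed 's/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*/_/')
printf '%s\n' "$a_out"
```

Under that reading, any variant that cannot match the empty string at position 0 (such as dropping the * and using g, or requiring $ at the end) would skip past the "t" before matching.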
Interestingly, it "works" again when anchoring it at the end:
D: s/\(\(^\|,\+\)\(\(foo\|bar\|baz\)=[^,]*\|single\)\)*,*$//
which gives:
t
Thanks,
Chris.