From a805d57e1f6427b556476d33c959d3eb55286c7c Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Fri, 24 Feb 2017 00:22:51 -0500 Subject: [PATCH 1/3] doc: mention escape-sequence precedence Unescaping takes place before passing the pattern to the regex engine: $ echo 'a^c' | sed -e 's/\x5e/b/' ba^c From: https://bugs.debian.org/605142 * doc/sed.texi (Escaping Precedence): Add new subsection under 'escape sequences' with examples. --- doc/sed.texi | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+) diff --git a/doc/sed.texi b/doc/sed.texi index b3f208b..a6846b0 100644 --- a/doc/sed.texi +++ b/doc/sed.texi @@ -3238,6 +3238,71 @@ Produces or matches a character whose hexadecimal @sc{ascii} value is @var{xx}. @samp{\b} (backspace) was omitted because of the conflict with the existing ``word boundary'' meaning. address@hidden Escaping Precedence + address@hidden processes escape sequences @emph{before} passing +the text on to the regular-expression matching of the @command{s///} command +and Address matching. Thus the following two commands are equivalent +(@samp{0x5e} is the hexadecimal @sc{ascii} value of the character @samp{^}): + address@hidden on address@hidden on address@hidden address@hidden +$ echo 'a^c' | sed 's/^/b/' +ba^c + +$ echo 'a^c' | sed 's/\x5e/b/' +ba^c address@hidden group address@hidden example address@hidden off address@hidden off + +As are the following (@samp{0x5b},@samp{0x5d} are the hexadecimal address@hidden values of @samp{[},@samp{]}, respectively): + address@hidden on address@hidden on address@hidden address@hidden +$ echo abc | sed 's/[a]/x/' +xbc +$ echo abc | sed 's/\x5ba\x5d/x/' +xbc address@hidden group address@hidden example address@hidden off address@hidden off + +However, it is recommended to avoid such special characters +due to unexpected edge-cases. 
For example, the following +are not equivalent: + address@hidden on address@hidden on address@hidden address@hidden +$ echo 'a^c' | sed 's/\^/b/' +abc + +$ echo 'a^c' | sed 's/\\\x5e/b/' +a^c address@hidden group address@hidden example address@hidden off address@hidden off + address@hidden also: this fails in different places: address@hidden $ sed 's/[//' address@hidden sed: -e expression #1, char 5: unterminated `s' command address@hidden $ sed 's/\x5b//' address@hidden sed: -e expression #1, char 8: Invalid regular expression address@hidden address@hidden which is OK but confusing to explain why (the first address@hidden fails in compile.c:snarf_char_class while the second address@hidden is passed to the regex engine and then fails). + @node Locale Considerations @section Locale Considerations -- 2.10.2 From de6b6ccd7400b6483ecee5eebc7d48666b497680 Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Fri, 24 Feb 2017 00:49:20 -0500 Subject: [PATCH 2/3] doc: elaborate about regex matching on pattern space Regex addresses work on the current pattern space, not on the original input lines. From https://bugs.debian.org/284646 * doc/sed.texi (Regexp Addresses): Add a paragraph and an example. * doc/sed.x (Addresses): Add a sentence about regexp. --- doc/sed.texi | 27 +++++++++++++++++++++++++++ doc/sed.x | 2 ++ 2 files changed, 29 insertions(+) diff --git a/doc/sed.texi b/doc/sed.texi index a6846b0..6b272ee 100644 --- a/doc/sed.texi +++ b/doc/sed.texi @@ -2265,6 +2265,33 @@ the period character does not match a new-line character in multi-line mode. @end table + address@hidden regex addresses and pattern space address@hidden regex addresses and input lines +Regex addresses operate on the content of the current +pattern space. If the pattern space is changed (for example with the @code{s///} +command) the regular expression matching will operate on the changed text. 
+ +In the following example, automatic printing is disabled with address@hidden The @code{s/2/X/} command changes lines containing address@hidden to @samp{X}. The command @code{/[0-9]/p} matches +lines with digits and prints them. +Because the second line is changed before the @code{/[0-9]/} regex is applied, +it will not match and will not be printed: + address@hidden on address@hidden on address@hidden address@hidden +$ seq 3 | sed -n 's/2/X/ ; /[0-9]/p' +1 +3 address@hidden group address@hidden example address@hidden off address@hidden off + + @node Range Addresses @section Range Addresses diff --git a/doc/sed.x b/doc/sed.x index 401bd88..b2e0beb 100644 --- a/doc/sed.x +++ b/doc/sed.x @@ -257,6 +257,8 @@ Match the last line. .RI / regexp / Match lines matching the regular expression .IR regexp . +Matching is performed on the current pattern space, which +can be modified with commands such as ``s///''. .TP .BI \fR\e\fPc regexp c Match lines matching the regular expression -- 2.10.2 From a36e8abccc5db38e4d2f8ea2bbb3e78dfddacd78 Mon Sep 17 00:00:00 2001 From: Assaf Gordon Date: Fri, 24 Feb 2017 01:18:28 -0500 Subject: [PATCH 3/3] doc: warn against misuse of -i with other options 'sed -iE' is not the same as 'sed -Ei'. 'sed -ni' is dangerous. From https://bugs.debian.org/832088 * doc/sed.texi (Command-Line Options): Explain and add examples to '-i/--in-place' item. --- doc/sed.texi | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/doc/sed.texi b/doc/sed.texi index 6b272ee..92bff01 100644 --- a/doc/sed.texi +++ b/doc/sed.texi @@ -307,6 +307,31 @@ directory (provided the directory already exists). If no extension is supplied, the original file is overwritten without making a backup. +Because @option{-i} takes an optional argument, it should +not be followed by other short options: address@hidden @code address@hidden sed -Ei '...' FILE +Same as @option{-E -i} with no backup suffix - @file{FILE} will be +edited in-place without creating a backup. 
+ address@hidden sed -iE '...' FILE +This is equivalent to @option{--in-place=E}, creating @file{FILEE} as a backup +of @file{FILE} address@hidden table + +Be cautious when using @option{-n} with @option{-i}: the former disables +automatic printing of lines and the latter changes the file in-place +without a backup. When the two are combined carelessly (and without an explicit @code{p} command), +the output file will be empty: address@hidden on address@hidden on address@hidden +# WRONG USAGE: 'FILE' will be truncated. +sed -ni 's/foo/bar/' FILE address@hidden example address@hidden off address@hidden off + @item -l @var{N} @itemx address@hidden @opindex -l -- 2.10.2
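For anyone who wants to check the behaviour described in patches 2/3 and 3/3 before reviewing the wording, the following is a minimal sketch, not part of the patches. It assumes GNU sed (where the suffix of -i is optional) and seq from coreutils. The file names a.txt, b.txt and c.txt and the variable names are hypothetical and exist only inside a temporary directory.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed and coreutils seq; all file names are made up.

# Patch 2/3: a regex address is matched against the current pattern space.
after_subst=$(seq 3 | sed -n 's/2/X/ ; /[0-9]/p')   # line 2 is already "X" here
before_subst=$(seq 3 | sed -n '/[0-9]/p ; s/2/X/')  # address is tested first

# Patch 3/3: experiment on throwaway files only.
workdir=$(mktemp -d)
printf 'foo\n' > "$workdir/a.txt"
printf 'foo\n' > "$workdir/b.txt"
printf 'foo\nbaz\n' > "$workdir/c.txt"

# -iE is parsed as --in-place=E: a.txt is edited, a.txtE keeps the original.
sed -iE 's/foo/bar/' "$workdir/a.txt"
edited=$(cat "$workdir/a.txt")
backup=$(cat "$workdir/a.txtE")

# -ni without a 'p' command writes nothing back, so b.txt ends up empty.
sed -ni 's/foo/bar/' "$workdir/b.txt"
truncated_size=$(wc -c < "$workdir/b.txt" | tr -d ' ')

# With an explicit 'p' flag, only the substituted lines are written back.
sed -n -i 's/foo/bar/p' "$workdir/c.txt"
kept=$(cat "$workdir/c.txt")

printf '%s\n' "$after_subst" "$before_subst" "$edited" "$backup" "$truncated_size" "$kept"
rm -rf "$workdir"
```

Keeping -n and -i as separate options, as in the last sed call, avoids any ambiguity about where the optional backup suffix begins.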