bug#51462: sed bug: ASCII NUL not handled in simple pattern


From: Assaf Gordon
Subject: bug#51462: sed bug: ASCII NUL not handled in simple pattern
Date: Sat, 30 Oct 2021 01:11:35 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0

(Adding Eric Blake for POSIX opinion)

Hello,

On 2021-10-28 11:32 a.m., Davide Brini wrote:
On Thu, 28 Oct 2021 15:25:42 +0000, Frances Wingerter <fw@immunant.com>
wrote:

Compare the output of these two sed invocations:
```
$ echo -e 'a\nb\n\0\nc\n' | sed -e '/\0/,$d'

$ echo -ne 'a\nb\n\0\nc\n' | sed -e '/\d000/,$d'
```

(\o000, \x00 also work). All documented here:
https://www.gnu.org/software/sed/manual/sed.html#Escapes

Whether sed maintainers want to also allow the \0 syntax, up to them of
course.
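
A minimal sketch of the three escapes mentioned above, assuming GNU sed. The helper name count_kept is made up for this illustration, and printf is used instead of echo -e only because its handling of \0 varies less between shells:

```shell
# Hypothetical helper: count the lines sed keeps when it deletes everything
# from the first line matching the given regex through the end of input.
count_kept() {
    printf 'a\nb\n\0\nc\n' | sed -e "/$1/,\$d" | wc -l | tr -d ' '
}

count_kept '\d000'   # decimal escape for NUL
count_kept '\o000'   # octal escape for NUL
count_kept '\x00'    # hexadecimal escape for NUL
```

If the escapes behave as described, each call should report that only the lines before the NUL line (a and b) survive.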

Thanks Davide for the reply.

In GNU sed, "\0" in the replacement part acts identically to "&" - referencing the whole matched portion.

This is the implemented behavior (though undocumented?) since GNU sed
version 3, released in December 1995 - so not likely to be changed.
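
To illustrate the replacement-side behavior described above, a small sketch assuming GNU sed (the variable names are only for this example):

```shell
# Assumes GNU sed: "\0" in the replacement stands for the whole match,
# the same as "&". Other sed implementations may behave differently.
out_zero=$(printf 'abc\n' | sed 's/b/[\0]/')
out_amp=$(printf 'abc\n' | sed 's/b/[&]/')

printf '%s\n' "$out_zero"
printf '%s\n' "$out_amp"
```

With GNU sed both commands should print the same line, with the matched "b" wrapped in brackets.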

For comparison, in BSDs "\0" acts as literal zero (ASCII 48).

Interestingly, POSIX defines a "BACKREF" as:

   [...] The character string consisting of a <backslash> character
   followed by a single-digit numeral, '1' to '9'.
( from: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_05 )
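
For contrast, a sketch of a back-reference that falls inside the '1' to '9' range quoted above, and so should be portable across sed implementations (the variable name is only for this example):

```shell
# "\1" refers back to the first \(...\) group, so the regex matches "abab".
out_backref=$(printf 'abab-cd\n' | sed 's/\(ab\)\1/X/')
printf '%s\n' "$out_backref"
```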

And so one could argue that this is a GNU extension that should be
disabled when used with "sed --posix".

I think we should keep "\0" undocumented to prevent proliferation of
this non-standard behavior.

regards,
 - assaf






