bug-gnu-utils

Re: Null character in replacement


From: Bob Proulx
Subject: Re: Null character in replacement
Date: Thu, 23 Oct 2014 12:32:59 -0600
User-agent: Mutt/1.5.23 (2014-03-12)

Zachary Vance wrote:
> sed -e 's/^$/\x00/' replaces empty lines with a line containing the
> null byte.

$ echo "" | sed 's/^$/\x00/' | od -tx1 -c
0000000  00  0a
         \0  \n

> sed -e "s/^$/\0/" (in bash, an s// expression containing a literal zero
> byte) performs no replacements on the stream.

It does here.

bash$ echo "" | sed "s/^$/\0/" | od -tx1 -c
0000000  0a
         \n

> I am not sure if this is simply user error (including error understanding
> how bash parses strings before it reaches sed), but I thought I'd report it
> in case it was a parsing bug.

The "\0" in bash does not create a literal zero byte.

bash$ echo "s/^$/\0/" | od -tx1 -c
0000000  73  2f  5e  24  2f  5c  30  2f  0a
          s   /   ^   $   /   \   0   /  \n
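For comparison, here is a minimal sketch of the difference between a
double-quoted "\0", which stays as the two characters backslash and
zero, and a printf format string, where \0 is an octal escape that
really does emit a zero byte.  The variable names are only
illustrative, and tr is used just to strip od's spacing.

```shell
# "\0" in double quotes: bash keeps both characters, backslash and zero.
quoted_hex=$(printf '%s' "\0" | od -An -tx1 | tr -d ' \n')

# \0 in a printf format string is an octal escape, so a real NUL is written.
format_hex=$(printf 'a\0b' | od -An -tx1 | tr -d ' \n')

printf '%s\n' "$quoted_hex" "$format_hex"
```

The first value is the hex for backslash and zero, the second has 00
between the hex for a and b.  The zero byte only exists in the pipe.
It cannot be stored in a shell variable or passed as a command line
argument.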

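Additionally, you should escape the "$" inside double quotes for
safety.  Here "$/" happens to survive because bash does not treat it
as an expansion, so the command works, but relying on that isn't safe.
Using single quotes to avoid $ expansion is better, as you know from
your other example.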

Additionally, \0 in the replacement is a back reference to the whole
match.

Your substitution works for me, in the sense that it does what I
expect it to do.

bash$ echo "" | sed "s/^\$/\0/" | od -tx1 -c
0000000  0a
         \n

bash$ echo "" | sed 's/^$/\0/' | od -tx1 -c
0000000  0a
         \n

The ^$ matches the empty line.  The \0 back reference stands for the
entire match, which here is the empty string, so the replacement is
empty too and the line comes out unchanged.  Perhaps these examples
will illustrate back references well enough.

$ echo abc | sed 's/\(b\)/\1\1\1\1/'
abbbbc

$ echo abc | sed 's/.*\(b\).*/\1\1\1/'
bbb

$ echo abc | sed 's/.*\(b\).*/\0\0\0/'
abcabcabc
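A small sketch of the same idea using &, which is the portable way to
refer to the whole match in the replacement.  I am assuming GNU sed
for the \0 spelling.  Other sed implementations may treat \0
differently.  The variable names are only for illustration.

```shell
# & copies the entire matched text into the replacement.
amp=$(echo abc | sed 's/b/[&]/')

# \0 as a synonym for & is assumed to be GNU sed behavior.
zero=$(echo abc | sed 's/b/[\0]/')

# On an empty line the whole match is the empty string,
# so only the brackets end up in the output.
empty=$(echo "" | sed 's/^$/[&]/')

printf '%s\n' "$amp" "$zero" "$empty"
```

With GNU sed the first two lines of output should be identical, and
the third shows that nothing was copied between the brackets.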

If you want to replace it with a literal backslash zero then you would
need to escape the backslash.

bash$ echo "" | sed 's/^$/\\0/' | od -tx1 -c
0000000  5c  30  0a
          \   0  \n

If that is done inside double quotes then there needs to be one set of
escaping for the shell expansion inside the double quotes and then
another set for sed's interpretation.

bash$ echo "" | sed "s/^\$/\\\\0/" | od -tx1 -c
0000000  5c  30  0a
          \   0  \n
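One way to check the quoting before sed ever sees it is to print the
argument with printf '%s\n'.  This is only a sketch, and the variable
names are made up for the illustration, but it shows that both
spellings hand sed the same script.

```shell
# Single quotes: the shell passes every character through unchanged.
single='s/^$/\\0/'

# Double quotes: the shell turns \$ into $ and each \\ into one backslash.
double="s/^\$/\\\\0/"

# Print exactly what sed would receive in each case.
printf '%s\n' "$single" "$double"
```

Both lines print the same script, with two backslashes before the
zero, which sed then reduces to a single literal backslash in the
replacement.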

Hope that helps,
Bob


