Re: Null character in replacement
From: Bob Proulx
Subject: Re: Null character in replacement
Date: Thu, 23 Oct 2014 12:32:59 -0600
User-agent: Mutt/1.5.23 (2014-03-12)
Zachary Vance wrote:
> sed -e 's/^$/\x00/' replaces empty lines with a line containing the
> null byte.
$ echo "" | sed 's/^$/\x00/' | od -tx1 -c
0000000 00 0a
\0 \n
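A quick way to confirm the null byte without reading a hex dump is to count the bytes. This is only a sketch: it assumes GNU sed, since the \x00 escape is an extension there, and the variable name nul_bytes is just for illustration.

```shell
# Assumption: GNU sed, which understands \xHH escapes in the replacement.
# An empty line is one newline byte; after the substitution the line
# holds one NUL byte plus the newline, so two bytes in total.
nul_bytes=$(echo "" | sed 's/^$/\x00/' | wc -c)
echo "$nul_bytes"
```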
> sed -e "s/^$/\0/" (in bash, an s// expression containing a literal zero
> byte) performs no replacements on the stream.
It does here.
bash$ echo "" | sed "s/^$/\0/" | od -tx1 -c
0000000 0a
\n
> I am not sure if this is simply user error (including error understanding
> how bash parses strings before it reaches sed), but I thought I'd report it
> in case it was a parsing bug.
The "\0" in bash does not create a literal zero byte.
bash$ echo "s/^$/\0/" | od -tx1 -c
0000000 73 2f 5e 24 2f 5c 30 2f 0a
s / ^ $ / \ 0 / \n
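To see that the quoting style makes no difference to what sed receives here, the two forms can be compared directly. A small sketch, with dq and sq as made-up variable names:

```shell
# Inside double quotes the shell keeps a backslash that is not followed
# by $, `, ", \ or a newline, so \0 stays as two characters.
dq="s/^$/\0/"
# Inside single quotes nothing is interpreted at all.
sq='s/^$/\0/'
# Both strings are the same 8 characters, matching the od output above.
printf '%s\n' "${#dq}" "${#sq}"
[ "$dq" = "$sq" ] && echo same
```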
Additionally, you should escape the "$" inside double quotes for safety.
Leaving it unescaped works here but isn't safe. Using single quotes to
avoid $ expansion is better, as you know from your other example.
Additionally, \0 in the replacement is a regular expression back
reference.
Your substitution works for me, with the definition of "works" being
that it does what I expect it to do.
bash$ echo "" | sed "s/^\$/\0/" | od -tx1 -c
0000000 0a
\n
bash$ echo "" | sed 's/^$/\0/' | od -tx1 -c
0000000 0a
\n
The ^$ matches the empty line. The \0 backreference refers to the
whole match, which is empty here, and therefore an empty string is
used in the replacement. Perhaps these examples will illustrate
backreferences well enough.
$ echo abc | sed 's/\(b\)/\1\1\1\1/'
abbbbc
$ echo abc | sed 's/.*\(b\).*/\1\1\1/'
bbb
$ echo abc | sed 's/.*\(b\).*/\0\0\0/'
abcabcabc
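The last example shows \0 standing for the whole match. The portable spelling for that is &, and a side-by-side comparison may make it clearer. This is a sketch assuming GNU sed for the \0 form; the variable names are only for illustration.

```shell
# & is the POSIX way to reference the whole match in the replacement.
whole_amp=$(echo abc | sed 's/.*/&&&/')
# Assumption: GNU sed, where \0 behaves like &.
whole_zero=$(echo abc | sed 's/.*/\0\0\0/')
# Both lines print abcabcabc under that assumption.
printf '%s\n' "$whole_amp" "$whole_zero"
```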
If you want to replace it with a literal backslash zero then you would
need to escape the backslash.
bash$ echo "" | sed 's/^$/\\0/' | od -tx1 -c
0000000 5c 30 0a
\ 0 \n
If that is done inside double quotes then there needs to be one set of
escaping for the shell expansion inside the double quotes and then
another set for sed's interpretation.
bash$ echo "" | sed "s/^\$/\\\\0/" | od -tx1 -c
0000000 5c 30 0a
\ 0 \n
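The two quoting styles above can be checked against each other by capturing their output. A minimal sketch, with single and double as made-up variable names:

```shell
# Single quotes: only sed's own escaping is needed (\\ becomes \).
single=$(echo "" | sed 's/^$/\\0/')
# Double quotes: the shell first reduces \\\\ to \\ and \$ to $,
# then sed reduces \\ to \.
double=$(echo "" | sed "s/^\$/\\\\0/")
# Both hold a literal backslash followed by 0.
printf '%s\n' "$single" "$double"
```

Keeping the sed script in single quotes avoids the second layer of escaping altogether.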
Hope that helps,
Bob