coreutils

Re: grep/sed and some strange patterns/inputs


From: Eric Blake
Subject: Re: grep/sed and some strange patterns/inputs
Date: Wed, 27 Jul 2016 09:10:03 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 07/26/2016 09:09 PM, Christoph Anton Mitterer wrote:
> Hey.
> 
> I've always had the impression that ^ and $ were the end/begin anchor
> of the current pattern, and since e.g. grep/sed work normally in terms
> of lines the start/end of lines.
> 
> What I found a bit strange is that e.g.:
> printf '' | sed 's/^/foo/'
> printf '' | sed 's/$/foo/'
> printf '' | sed 's/^$/foo/'
> doesn't produce foo and that e.g.
> printf '' | grep '^'
> printf '' | grep '$'
> printf '' | grep '^$'
> printf '' | grep '*'
> don't match.

Line-oriented programs, when presented with 0 lines on input, have 0
lines to check for and therefore nothing to output.
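
A quick way to see this is to count lines rather than look for output.
The sketch below uses only standard grep and sed options; the variable
names are mine, chosen for illustration.  With 0 lines of input there is
nothing to match or substitute, while a lone newline is one (empty) line
that both anchors match.

```shell
# grep -c prints the number of matching lines.  grep exits with status 1
# when nothing matches, hence the "|| true" to keep the sketch safe under
# "set -e".
empty_count=$(printf '' | grep -c '^' || true)     # zero lines of input
one_line_count=$(printf '\n' | grep -c '^$')       # one line, itself empty

# sed only runs its script once per input line.
sed_empty=$(printf '' | sed 's/^/foo/')            # no line, no substitution
sed_one_line=$(printf '\n' | sed 's/^/foo/')       # the empty line becomes "foo"

echo "grep -c '^' on empty input: $empty_count"       # 0
echo "grep -c '^\$' on one empty line: $one_line_count" # 1
echo "sed on empty input: [$sed_empty]"               # []
echo "sed on one empty line: [$sed_one_line]"         # [foo]
```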

> 
> Why? Or better said, which part of POSIX mandates this? Or is it simply
> "no stdin, nothing happens"?

There IS stdin, just that stdin had 0 lines (it reached end-of-file
immediately).
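
To illustrate the distinction, here is a minimal sketch (variable names
are hypothetical) showing that the pipe is a perfectly valid stdin that
simply delivers 0 bytes: wc can report on it, and the very first attempt
to read a line hits end-of-file.

```shell
# The pipe exists and can be read; it just contains nothing.
# The $(( )) wrapper strips any padding some wc implementations emit.
byte_count=$(( $(printf '' | wc -c) ))

# "read" returns non-zero when it reaches end-of-file before a newline.
if printf '' | IFS= read -r line; then
    first_read=got_line
else
    first_read=eof
fi

echo "bytes on stdin: $byte_count"   # 0
echo "first read: $first_read"       # eof
```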

> OTOH, other tools do operate on that:
> $ printf ''  | wc
>       0       0       0
> 
> 
> What looks IMO also a bit strange:
> $ printf 'f' | sed "s/$/foo/"
> foof

POSIX leaves this unspecified.  It requires sed to operate only on text
files, and a text file larger than 0 bytes is REQUIRED to end in a
newline.  Your input doesn't end in a newline, so it is not a text file,
and the output of sed is therefore unspecified.  It may be a
quality-of-implementation issue in the sed you are using, which could
arguably do better by outputting 'ffoo' instead of 'foof', but that would
be a bug report for the sed list, not coreutils.
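
If you want to know whether a given input meets that trailing-newline
requirement before handing it to sed, something along these lines
should work.  This is only a sketch: ends_with_newline is a hypothetical
helper name, and it checks just the final-newline condition, not the
other text file requirements (no NUL bytes, line length limits).

```shell
# Hypothetical helper: succeed if FILE is empty or its last byte is a newline.
ends_with_newline() {
    # A 0-byte file is a valid text file with 0 lines.
    [ ! -s "$1" ] && return 0
    # wc -l counts newline characters; the last byte is either one or not.
    [ $(tail -c 1 "$1" | wc -l) -eq 1 ]
}

tmp=$(mktemp)

printf 'f' > "$tmp"
if ends_with_newline "$tmp"; then no_newline=text; else no_newline=not_text; fi

printf 'f\n' > "$tmp"
if ends_with_newline "$tmp"; then with_newline=text; else with_newline=not_text; fi

: > "$tmp"
if ends_with_newline "$tmp"; then empty_file=text; else empty_file=not_text; fi

rm -f "$tmp"

echo "printf 'f':    $no_newline"     # not_text
echo "printf 'f\\n':  $with_newline"   # text
echo "printf '':     $empty_file"     # text
```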

> $ printf 'f' | sed "s/$/foo/"
> ffoo
> (with no newlines)

I'm not sure I follow your example here; you have two identical command
lines with different results.  Did you copy-and-paste one incorrectly
from what you were trying to ask?

> So they do match, even though there is no \n character.
> OTOH
> $ printf 'f\n' | sed "s/$/foo/"
> foof
> $ printf 'f\n' | sed "s/$/foo/"
> ffoo

Again, your example is not clear on what you were intending to report.
But once there is a newline, then the behavior is no longer unspecified,
and you correctly get:

$ printf 'f\n' | sed "s/$/foo/"
ffoo
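
For comparison, here are both anchors applied to the same
newline-terminated input, as a small sketch with hypothetical variable
names.  I've used single quotes so the shell never gets a chance to
interpret the $; the double-quoted form happens to work because $/ is
not a parameter expansion.

```shell
# One newline-terminated line: both results are well defined.
dollar_result=$(printf 'f\n' | sed 's/$/foo/')   # $ anchors at end of line
caret_result=$(printf 'f\n' | sed 's/^/foo/')    # ^ anchors at start of line

echo "s/\$/foo/ gives: $dollar_result"   # ffoo
echo "s/^/foo/ gives: $caret_result"     # foof
```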

> give the same, so while there clearly is a \n now (and thus I'd assume a
> new line) it's not matched.
> However:
> $ printf 'f\n\n' | sed "s/^/foo/"
> foof
> foo
> here it matches,... so it seems it would just ignore trailing
> non-\n-ended and empty lines, but NOT trailing (or only) non-\n-ended but
> non-empty lines.

I'm not sure what you're trying to ask here, but the point is that sed
tries to match all lines, and with two \n characters in the input, you
have two lines where it can match.
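
Counting the lines makes this concrete.  In the sketch below (variable
names are hypothetical), wc -l reports one line per \n, and sed produces
one substituted line for each of them: "f" and then an empty line.

```shell
# Two newline characters means two lines.
# The $(( )) wrapper strips any padding some wc implementations emit.
line_count=$(( $(printf 'f\n\n' | wc -l) ))

# sed applies the substitution once per line, including the empty one.
# Command substitution drops the final newline of the output.
sed_output=$(printf 'f\n\n' | sed 's/^/foo/')

echo "lines: $line_count"   # 2
echo "$sed_output"          # foof, then foo on the next line
```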

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
