bug#21251: sed: POSIX and the z command
From: Stephane Chazelas
Subject: bug#21251: sed: POSIX and the z command
Date: Thu, 13 Aug 2015 15:55:20 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
Last one for today ;)
The GNU sed documentation has:
`z'
This command empties the content of pattern space. It is usually
the same as `s/.*//', but is more efficient and works in the
presence of invalid multibyte sequences in the input stream.
POSIX mandates that such sequences are _not_ matched by `.', so
that there is no portable way to clear `sed''s buffers in the
middle of the script in most multibyte locales (including UTF-8
locales).
The part about the POSIX requirement is not true. The behaviour
of sed on non-text input is unspecified, so it doesn't require
that . not match a byte that is not part of a valid character.
GNU sed's (or grep's for that matter) . (or [^[:alnum:]]...)
could just as well match every byte that doesn't otherwise form
part of a valid character (which would be a much better
behaviour IMO) and still be POSIX compliant.
That POSIX requirement is true for regexec() but not for text
utilities.
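For anyone who wants to check this on their own system, here is a rough
sketch. It assumes GNU sed (z is a GNU extension) and a locale named
en_US.UTF-8, which may not be installed everywhere; if it is missing, sed
falls back to the C locale. The helper names sample and bytes_left are
made up for illustration.

```shell
# One input line holding byte 0xE9, which is not valid UTF-8 on its own.
sample() { printf 'a\351b\n'; }

# Count the bytes a sed script leaves on the line (newline excluded).
# $1 = locale to run sed under (assumed to exist), $2 = sed script
bytes_left() {
    sample | LC_ALL=$1 sed "$2" | LC_ALL=C tr -d '\n' | wc -c | tr -d ' '
}

# C locale: every byte is a character, so both forms clear the line.
z_c=$(bytes_left C 'z')
s_c=$(bytes_left C 's/.*//')

# UTF-8 locale: z still clears the line; what s/.*// leaves behind
# depends on how the implementation treats the invalid byte.
z_utf8=$(bytes_left en_US.UTF-8 'z')
s_utf8=$(bytes_left en_US.UTF-8 's/.*//')

echo "C locale, z:          $z_c byte(s) left"
echo "C locale, s/.*//:     $s_c byte(s) left"
echo "UTF-8 locale, z:      $z_utf8 byte(s) left"
echo "UTF-8 locale, s/.*//: $s_utf8 byte(s) left"
```

If the last count is non-zero, that is the implementation's choice about
how . treats the stray byte, not something POSIX requires of sed.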
See that discussion on the Austin Group mailing list:
http://thread.gmane.org/gmane.comp.standards.posix.austin.general/11059/focus=11098
--
Stephane