
bug#40242: n as delimiter alias


From: Eric Blake
Subject: bug#40242: n as delimiter alias
Date: Tue, 31 Mar 2020 08:26:01 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0

On 3/31/20 2:00 AM, Oğuz wrote:
> Thanks for the reply. This might not be a bug though; I sent a similar mail
> (https://www.mail-archive.com/address@hidden/msg05881.html)
> to the Austin Group mailing list asking what the expected behavior is in
> this case, and I was told
> (https://www.mail-archive.com/address@hidden/msg05891.html)
> that both behaviors (yielding n or an empty line) are correct, and that the
> standard should *probably* be amended to explicitly state that this is
> unspecified. Apparently
> (https://www.mail-archive.com/address@hidden/msg05893.html)
> some other UNIXes adopted the same practice as GNU sed (or vice versa, I
> don't know which one is older).

The POSIX folks will probably declare that use of a \X sequence inside a
regex delimited by X is unspecified behavior, for arbitrary X ('n', 't',
'1', and probably others all fit this category). But that still doesn't
stop us from fixing GNU sed to at least be consistent. We should pick one
of two rules.

Either \X represents the special meaning of X whenever such a meaning
exists, regardless of X also being the regex delimiter. That is our
current \n behavior, and it leaves no way to represent the delimiter as a
literal match.

Or use of X as a delimiter renders the special meaning of \X unavailable
for that regex. That is our \t behavior, and it leaves no way to represent
the special meaning as part of the match.

My personal preference is making things consistent with our \t behavior.

> In the code, the "match_slash" function [1] is used to find
> the delimiters of the "s" command (typically "slashes").
> Special handling happens if a slash is found [2],
> and in lines 557-8 there's this conditional:
>
>                else if (ch == 'n' && regex)
>                  ch = '\n';
>
> which forces any "\n" to be a new-line, regardless of whether the
> delimiter itself was an "n".
>
> Interestingly, removing these two lines does not cause
> any test failures, so this might be easy to fix without causing
> any regressions.
>
> For now I'm leaving this item open until we decide how to deal with it.

I'm thus in favor of removing that special-case of 'n'.
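
To make the difference concrete, here is a minimal standalone sketch of
the two choices. It is not sed's match_slash: the names scan_regex,
N_ALWAYS_NEWLINE and N_RESPECTS_DELIMITER are hypothetical, and it only
models the 'n' special case discussed above, ignoring multibyte input and
everything else the real function deals with.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch; neither name exists in GNU sed. */
enum n_handling {
    N_ALWAYS_NEWLINE,     /* keep the special case: \n is always a newline */
    N_RESPECTS_DELIMITER  /* drop it: \n is a literal 'n' when 'n' delimits */
};

/*
 * Copy the regex starting at `in` into `out`, stopping at the first
 * unescaped delimiter or at the end of the string. Returns the number
 * of bytes written, not counting the terminating NUL. `out` must have
 * room for strlen(in) + 1 bytes.
 */
size_t scan_regex(const char *in, char delim, enum n_handling mode, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    for (; *in != '\0' && *in != delim; in++) {
        char ch = *in;

        if (ch == '\\' && in[1] != '\0') {
            ch = *++in;
            if (ch == delim && mode == N_RESPECTS_DELIMITER) {
                /* Escaped delimiter: keep it as a literal character. */
            } else if (ch == 'n') {
                ch = '\n';        /* the special case under discussion */
            } else if (ch != delim) {
                out[n++] = '\\';  /* leave other escapes for the regex compiler */
            }
        }
        out[n++] = ch;
    }
    out[n] = '\0';
    return n;
}
```

Under these assumptions, scanning the text a\nbnXn with 'n' as the
delimiter yields "a", a newline, "b" in N_ALWAYS_NEWLINE mode and the
literal string "anb" in N_RESPECTS_DELIMITER mode. With any delimiter
other than 'n', both modes turn \n into a newline, and with 't' as the
delimiter \t comes out as a literal 't' in both modes, which is the
asymmetry I'd like to get rid of.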

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





