bug#50701: cannot append or insert to empty file or stream


From: Eric Blake
Subject: bug#50701: cannot append or insert to empty file or stream
Date: Thu, 23 Sep 2021 13:59:36 -0500
User-agent: NeoMutt/20210205-773-8890a5

On Mon, Sep 20, 2021 at 01:28:42PM -0600, Assaf Gordon wrote:

> First, let's clarify what the files are:
> 
> >     $ echo -n >file line-1
> >     $ touch empty
> 
> The file 'file' is not empty, it has one line.

That's one way of viewing it.  But according to POSIX, it doesn't even
have that: the POSIX definition of a "line" is zero or more non-newline
characters followed by a terminating newline, and since you omitted the
newline, it is not a line.
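
To make that concrete, here is a small sketch that recreates the two
files and counts their lines.  It assumes a scratch directory from
mktemp -d (common, but not required by POSIX) and uses printf instead
of 'echo -n', since the latter is not portable; the variable names are
only for illustration.

```shell
# Recreate the two files from the report in a scratch directory.
dir=$(mktemp -d)
printf 'line-1' > "$dir/file"    # six bytes, no trailing newline
: > "$dir/empty"                 # zero bytes

# wc -l counts newline characters, so both files report zero lines.
# tr strips the padding some wc implementations add.
file_lines=$(wc -l < "$dir/file" | tr -d ' ')
file_bytes=$(wc -c < "$dir/file" | tr -d ' ')
empty_lines=$(wc -l < "$dir/empty" | tr -d ' ')
echo "file: $file_lines lines, $file_bytes bytes; empty: $empty_lines lines"
```

By that count, 'file' holds six bytes but zero complete lines, the
same line count as 'empty'.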

> This line just happens to be empty (i.e. no characters in the line before
> the newline character).
> 
> The file 'empty' is empty, it contains NO lines.
>

...

> 
> > this is an extremely surprising behavior that limits the utility of sed
> > when one cannot predict the contents of the file in question.
> 
> I hope the explanation above makes this behavior less surprising.

However, note that POSIX also says that 'sed' only has specified
behavior for "text files".  And the POSIX definition of a text file
specifically excludes files that do not end in a newline, other than
the empty file [1].  Thus, attempting to use sed on the file 'file'
which does not end in a newline, and is therefore not a text file, is
undefined behavior, and ANYTHING can happen.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_403
(Other ways to have a file that is not a text file: have a NUL
character, have a line longer than LINE_MAX bytes including its
newline, or have an encoding error in the current multibyte locale - but
those are not relevant to this conversation.)
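
For anyone who wants to test a file before handing it to sed, the
following is a rough sketch of such a check.  The function name
looks_like_text_file is made up for this example, and it only covers
two of the conditions above (trailing newline and NUL bytes); it does
not check LINE_MAX or encoding errors.

```shell
# Hypothetical helper: succeeds if "$1" is empty, or if it ends in a
# newline and contains no NUL bytes.  LINE_MAX and multibyte encoding
# checks are deliberately left out of this sketch.
looks_like_text_file() {
    [ -s "$1" ] || return 0          # the empty file qualifies

    # Dump the last byte as hex; a newline is 0a.
    last=$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$last" = 0a ] || return 1     # non-empty files must end in a newline

    # Delete everything except NUL, then count what is left.
    nuls=$(tr -cd '\000' < "$1" | wc -c | tr -d ' ')
    [ "$nuls" -eq 0 ]
}
```

With this, a file created by "printf 'line-1'" would be rejected, while
the same content followed by a newline, or a zero-byte file, would be
accepted.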

But just because POSIX doesn't say what to do does not mean that we
can't pick something useful.  GNU sed tries hard to have sane behavior
for "mostly-text" files, such as the case where you forgot a trailing
newline.  It does so by pretending that there was a newline after
those last characters, after all.  But not all sed implementations
behave the same on your example.
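
One way to observe this is to ask sed how many lines it read.  The
sketch below uses printf to build the inputs; the variable names are
just for illustration, and the result noted afterwards is what I would
expect from GNU sed, not something POSIX guarantees.

```shell
# '$=' prints the number of the last line read; -n suppresses the
# normal output, so only that number (if any) appears.
unterminated=$(printf 'line-1' | sed -n '$=')
empty=$(printf '' | sed -n '$=')
echo "unterminated: '$unterminated', empty: '$empty'"
```

With GNU sed the unterminated input is counted as line 1, while the
empty input produces no output at all, because there is no last line
for '$' to match.  Any command addressed to a line, including append
and insert, has nothing to act on in the empty case.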

> 
> As such, I'm marking this as "not a bug", but discussion can continue
> by replying to this thread.

At any rate, I agree that it's not a bug.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
