sed-devel

Re: Why isn't "sed -n p" identical to "cat"?


From: Assaf Gordon
Subject: Re: Why isn't "sed -n p" identical to "cat"?
Date: Wed, 9 Jan 2019 15:40:43 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0

Hello,

Interesting case!
Not sure if it's a bug or an unspecified edge-case (where
implementations can do different things).

Adding Eric Blake for his input (and POSIX pov).

On 2019-01-09 3:12 a.m., Michael Green wrote:
[...] it raises a question of where the newline is coming from in the
following command:

  printf a | sed -n 'x;p'

The hold buffer should be empty, after printing, sed doesn't output more
to the same output stream so my understanding is that no newline should
be added by sed

For completeness, the same command on other implementations:

 OpenBSD 6.2, FreeBSD 11.1, Solaris 11, toybox: print nothing.

 NetBSD 7.1, BusyBox, AIX 7.2, GNU sed (since v3.02): print a newline.

---

In GNU sed's implementation the reason is:
1.
The pattern buffer and the hold buffer each have a boolean variable
(chomped) indicating whether a newline was removed from the input
(execute.c:struct line).

2.
Initially, 'chomped' is set to TRUE for both buffers
(execute.c:line_init).
I believe this is mostly arbitrary.
In fact, setting "hold.chomped" to FALSE does not cause
any test failures, and will likely not cause any regression.

3.
After reading the first input line, and before executing 'x',
the hold buffer has empty text but chomped = TRUE (because
of default initialization), and the pattern buffer has the text "a"
with chomped = FALSE (because the input did not have a newline).

Then, the 'x' command swaps the buffers, and also swaps the
value of 'chomped' (execute.c:exchange_lines).

Now the pattern buffer has "chomped = true".

4.
The 'p' command prints the current pattern buffer (now empty), and
because 'chomped' is true, it appends a newline.

---

In other implementations the reason could differ even though
the result is the same.
For example, in NetBSD 7.1 it seems sed always adds a newline,
e.g.:

  $ printf a | sed -n 'p' | od -tc
  0000000    a  \n
  0000002

So on NetBSD the newline has little to do with the 'x' command.

---

If this is deemed a bug, it is easily fixed with this patch:

diff --git a/sed/execute.c b/sed/execute.c
index 76bba4f..a97a9d1 100644
--- a/sed/execute.c
+++ b/sed/execute.c
@@ -1653,6 +1653,7 @@ process_files (struct vector *the_program, char **argv)

   line_init (&line, NULL, INITIAL_BUFFER_SIZE);
   line_init (&hold, NULL, 0);
+  hold.chomped = false;
   line_init (&buffer, NULL, 0);

   input.reset_at_next_file = true;


One note: "pattern.chomped" should start as TRUE, as some tests
depend on it, so I'm not changing the default in 'line_init'.

regards,
 - assaf











