
bug#13498: "cut -f" lags a line


From: Scott Lamb
Subject: bug#13498: "cut -f" lags a line
Date: Sat, 19 Jan 2013 00:35:18 -0800

"cut -f" has an apparently long-standing behavior that I'd consider a
bug: it does not fully send line N to stdout until the first character
of line N+1 has been read on stdin. This is confusing when stdin comes
from "tail -f" or the like. The exact behavior varies slightly. If
stdin is a tty, all but the trailing newline will be flushed
immediately and then the trailing newline will be flushed when the
next character shows up. If stdin is not a tty, there's no flush at
all until the next character shows up.

For example, if I type the following into a shell on Ubuntu 12.04.1,
meaning cut from coreutils 8.13 and glibc package version
2.15-0ubuntu10.3:

    cut -f1-
    foo
    bar
    baz
    ^D

I will see the following:

    $ cut -f1-
    foo
    foobar

    barbaz

    baz
    $

and if I instead use "cat | cut -f1-" in the first line, I will see
the following:

    $ cat | cut -f1-
    foo
    bar
    foo
    baz
    bar
    baz
    $

(coreutils's cut -c does not have the same laggy behavior. Neither
does BSD cut on my OS X machine in either -c or -f mode.)

This code in cut_fields (still found in trunk tip) is responsible for
delaying the newline; it runs between the newline being read and being
written:

      if (c == '\n')
        {
          c = getc (stream);
          if (c != EOF)
            {
              ungetc (c, stream);
              c = '\n';
            }
        }

I believe that code is there to avoid turning one newline at EOF into
two, but that goal could be accomplished in another way.
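
One such way, as a rough sketch only: remember the last byte written and
decide at EOF whether a newline is still missing, so no lookahead read is
needed. The function name copy_lines_no_lookahead is hypothetical and this
is not the coreutils code; the real cut_fields also handles delimiters and
field selection, which are left out here. Flushing after every newline is
my assumption about what an interactive user would want, and it may cost
throughput on large inputs.

```c
#include <stdio.h>

/* Hypothetical helper, not coreutils code: copy IN to OUT one byte at a
   time, flushing after every newline, and make sure non-empty output ends
   with a newline.  Nothing is read ahead: we only remember the last byte
   written and check at EOF whether a newline is missing.  */
int
copy_lines_no_lookahead (FILE *in, FILE *out)
{
  int c;
  int last = '\n';   /* Treat empty input as already newline-terminated.  */

  while ((c = getc (in)) != EOF)
    {
      if (putc (c, out) == EOF)
        return -1;
      /* Line N is fully visible before line N+1 is requested.  */
      if (c == '\n' && fflush (out) == EOF)
        return -1;
      last = c;
    }

  /* Add the final newline only if the input lacked one.  */
  if (last != '\n' && putc ('\n', out) == EOF)
    return -1;

  return (ferror (in) || fflush (out) == EOF) ? -1 : 0;
}
```

Called as copy_lines_no_lookahead (stdin, stdout), this should pass input
ending in a newline through unchanged and append a newline only when the
last line lacks one.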

I don't know exactly why the behavior differs based on stdin being a
tty or not. My best guess is that glibc might have some logic that, if
stdin is a tty, automatically flushes stdout any time the program
blocks on stdin. glibc's stdio internals are a bit hard for me to
follow, so I haven't found the code in question. Apparently this
behavior is loosely standardized; I see a Stack Overflow post quoting
the following:

"""
The input and output dynamics of interactive devices shall take place
as specified in 7.19.3. The intent of these requirements is that
unbuffered or line-buffered output appear as soon as possible, to
ensure that prompting messages actually appear prior to a program
waiting for input.

(ISO/IEC 9899:TC2 Committee Draft -- May 6, 2005, page 14).
"""

--
Scott Lamb <http://www.slamb.org/>




