bug-coreutils



From: Anoop Sharma
Subject: bug#11631: Head command does not position file pointer correctly for negative line count
Date: Tue, 5 Jun 2012 15:56:18 +0530

The head command behaves differently with seekable and unseekable input
sources.
Pipes are unseekable, so head's behavior when its input comes from a pipe
differs from its behavior when its input comes from a regular file (which
is seekable).

Therefore, the defect that was raised for regular files cannot be
evaluated using examples with pipes.

Here's why head behaves differently depending on whether the file allows
lseek():
-------------------------------------------------------------------------------------------------
1. head first reads data into a buffer using the read() system call and
then operates on that buffer.
2. The buffer size head passes to read() is 8192 bytes on my machine.
3. It is not an error if read() returns fewer bytes than requested; this
may happen, for example, because fewer bytes are actually available when
read() accesses the pipe, or because read() was interrupted by a signal.
4. Therefore, only the upper bound on the amount of data read into the
buffer is fixed, not the lower bound.
5. Since read() tries to read as much data as it can (up to the buffer
size), in most cases it reads more data into the buffer than head's
algorithm actually needs.
6. When head discovers that it does not need all the data in the buffer,
it tries to give the extra data back by moving the file offset backwards
with lseek().
7. However, the offset cannot be moved back on unseekable files. Therefore,
head has to discard the extra data for unseekable files. This creates
situations that look as if head has eaten part of the data.

Head's problem with unseekable files: commands that run after head will
never get the extra data that head has already read.
The bigger problem: how much data is lost is not fixed, because how much
data read() actually reads is not fixed (see point 4 above). It is also
possible that no data is lost!
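
A rough way to observe the difference is to count how many lines are still
available to the next command after head has finished. This is only a
sketch: it assumes GNU head and seq, and the variable names (tmp,
from_file, from_pipe) are made up for the illustration.

```shell
# Create a regular (seekable) file with 10 lines.
tmp=$(mktemp)
seq 10 > "$tmp"

# Regular file: head can seek back, so cat should still see lines 3..10.
from_file=$( ( head -n 2 > /dev/null ; cat ) < "$tmp" | wc -l )

# Pipe: head cannot seek back, so whatever read() consumed is gone.
# The result is often 0 here, but it is not guaranteed.
from_pipe=$( seq 10 | ( head -n 2 > /dev/null ; cat ) | wc -l )

echo "lines left after head (regular file): $from_file"
echo "lines left after head (pipe):         $from_pipe"

rm -f "$tmp"
```

With a seekable file the first count should be 8; the second count depends
on how much data head's read() happened to pick up from the pipe.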

I tried the following example with a regular file and it worked as expected:
$ seq 10 >p
$ ( head -n 2 ; echo xxx ; cat )<p
1
2
xxx
3
4
5
6
7
8
9
10
$


On Tue, Jun 5, 2012 at 3:33 PM, Voelker, Bernhard <
address@hidden> wrote:

> Anoop Sharma wrote:
>
> > Head command does not position file pointer correctly for negative line
> > count. Here is a demonstration of the problem.
>
> The problem doesn't seem to be limited to negative
> line counts. I replaced the 10 ABC lines by a number
> sequence to demonstrate this issue clearer.
>
>  $ seq 10 | ( head -n -2 ; echo xxx ; cat )
>  1
>  2
>  3
>  4
>  5
>  6
>  7
>  8
>  xxx
>  $ seq 10 | ( head -n 2 ; echo xxx ; cat )
>  1
>  2
>  xxx
>
> So head eats all of the input. The info page is silent
> about this.
>
> Have a nice day,
> Berny
>

