bug-textutils

File position after fclose on a buffered input stream


From: Neal H Walfield
Subject: File position after fclose on a buffered input stream
Date: 05 Feb 2002 20:02:36 -0500
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1

After an fclose on an input stream, the file position is left at the
end of the buffered data and not where the caller thinks it left off.
Normally this does not matter.  However, as suggested by David Korn on
the austin group mailing list, the behavior becomes important when we
consider programs that share a file descriptor, as can be done when
using the shell.  Consider the following chunk of code, which
logically should print 4999 (the subshell being used to skip the
first line):

        address@hidden:~/src/textutils-2.0.20 (0)$ uname -a
        Linux desdemona 2.2.17 #4 SMP Sat Sep 16 21:51:08 EST 2000 i686 unknown
        address@hidden:~/src/textutils-2.0.20 (0)$ perl -e \
        > 'print "test\n" x 5000' | (src/head -n1 >/dev/null; cat) | wc -l
            4181

Instead of 4999, we get a seemingly arbitrary number.  What has
happened is that head, because of its input buffering, consumed far
more than the first line, so cat starts reading well past our
intended position.
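
The read-ahead can be observed directly with a small C sketch.  It is
not taken from head itself: the helper names are hypothetical, and a
temporary regular file stands in for the pipe so that the descriptor's
offset can be queried with lseek (a pipe is not seekable).  It reads a
single five-byte line through a default, fully buffered stdio stream
and reports where the underlying descriptor ended up:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temporary file holding
 * `lines` copies of "test\n" and return a descriptor positioned at
 * offset 0, or -1 on failure. */
static int make_test_fd(int lines)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    for (int i = 0; i < lines; i++)
        fputs("test\n", tmp);
    if (fflush(tmp) != 0) {
        fclose(tmp);
        return -1;
    }
    int fd = dup(fileno(tmp));  /* keeps the file alive after fclose */
    fclose(tmp);
    if (fd < 0)
        return -1;
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read one line through a default (fully buffered) stdio stream and
 * return the descriptor's file offset afterwards, or -1 on failure. */
static off_t fd_offset_after_buffered_line(void)
{
    int fd = make_test_fd(5000);
    if (fd < 0)
        return -1;
    FILE *in = fdopen(fd, "r");
    if (in == NULL) {
        close(fd);
        return -1;
    }
    char line[64];
    off_t pos = -1;
    if (fgets(line, sizeof line, in) != NULL)
        pos = lseek(fd, 0, SEEK_CUR);  /* where the next reader would start */
    fclose(in);
    return pos;
}
```

The caller consumed only the five bytes of "test\n", yet the returned
offset is expected to be considerably larger (typically one stdio
buffer's worth), which is where a second process sharing the
descriptor would begin reading.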

This problem is not restricted to glibc or the GNU text utilities.
Similar behavior can be seen on Tru64 (using their implementation of
head):

        address@hidden:~$ uname -a
        OSF1 saturn.cs.uml.edu V5.0 1094 alpha
        address@hidden:~$ perl -e 'print "test\n" x 5000' | \
        > (head -n1 >/dev/null; cat) | wc -l
            3362

And SunOS:

        address@hidden:~$ uname -a
        SunOS force1 4.1.3_U1 2 sun4c
        address@hidden:~$ perl -e 'print "test\n" x 5000' | \
        > (/usr/ucb/head -1 >/dev/null; cat) | wc -l
            4181


The standards do not have much to say about this behavior.

For instance, the third version of the Single Unix Specification in
its description of fclose says nothing about how the file position is
to be left:

        The fclose() function shall cause the stream pointed to by
        stream to be flushed and the associated file to be closed. Any
        unwritten buffered data for the stream shall be written to the
        file; any unread buffered data shall be discarded. Whether or
        not the call succeeds, the stream shall be disassociated from
        the file and any buffer set by the setbuf() or setvbuf()
        function shall be disassociated from the stream. If the
        associated buffer was automatically allocated, it shall be
        deallocated.

And, according to the same standard, flushing an input stream (using
fflush) is undefined:

        If stream points to an output stream or an update stream in
        which the most recent operation was not input, fflush() shall
        cause any unwritten data for that stream to be written to the
        file, [CX] [[Option Start]] and the st_ctime and st_mtime
        fields of the underlying file shall be marked for
        update. [[Option End]]

However, in the rationale section for fflush, this case is described:

        Data buffered by the system may make determining the validity
        of the position of the current file descriptor
        impractical. Thus, enforcing the repositioning of the file
        descriptor after fflush() on streams open for read() is not
        mandated by IEEE Std 1003.1-2001.


This means that glibc does not have to do anything about this.
However, after a glance at the libio code, it seems to me that it
would be possible to reposition the file position in _IO_new_fclose.

Yet even this change will not make head react correctly in all
situations -- as already mentioned, at least SunOS and Tru64 leave the
file position of an input stream at the end of the buffered data.
Therefore, to be completely robust, textutils would need to be
changed.  My impression is that a call to fsetpos would not help;
however, using unbuffered input, e.g. calling setvbuf before starting
to read from the stream, would, although it is slower.  If this change
is desirable, I would be happy to discuss it a bit more and then
implement it.
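
As a rough illustration of the unbuffered approach (a sketch only, not
a patch against textutils; the helper names are hypothetical and a
temporary regular file again stands in for the shared descriptor so
that its offset can be inspected), the stream can be switched to
_IONBF with setvbuf before the first read:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temporary file holding
 * `lines` copies of "test\n" and return a descriptor positioned at
 * offset 0, or -1 on failure. */
static int make_lines_fd(int lines)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    for (int i = 0; i < lines; i++)
        fputs("test\n", tmp);
    if (fflush(tmp) != 0) {
        fclose(tmp);
        return -1;
    }
    int fd = dup(fileno(tmp));  /* keeps the file alive after fclose */
    fclose(tmp);
    if (fd < 0)
        return -1;
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read one line through an unbuffered stdio stream and return the
 * descriptor's file offset afterwards, or -1 on failure. */
static off_t fd_offset_after_unbuffered_line(void)
{
    int fd = make_lines_fd(5000);
    if (fd < 0)
        return -1;
    FILE *in = fdopen(fd, "r");
    if (in == NULL) {
        close(fd);
        return -1;
    }
    /* Must be called before any other operation on the stream. */
    if (setvbuf(in, NULL, _IONBF, 0) != 0) {
        fclose(in);
        return -1;
    }
    char line[64];
    off_t pos = -1;
    if (fgets(line, sizeof line, in) != NULL)
        pos = lseek(fd, 0, SEEK_CUR);  /* no read-ahead expected */
    fclose(in);
    return pos;
}
```

With no stdio buffer to fill, the library has to fetch the input a
byte at a time, so the offset stops right after the first newline and
the next reader of the descriptor starts at the second line.  The
price is roughly one read call per byte, which is the slowdown
mentioned above.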




