bug-gnu-utils
Re: [gawk-stable] bug: fatal error when getline from directory


From: Eric Blake
Subject: Re: [gawk-stable] bug: fatal error when getline from directory
Date: Sun, 04 Jan 2009 07:58:13 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.19) Gecko/20081209 Thunderbird/2.0.0.19 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Paolo on 1/4/2009 3:11 AM:
> I disagree, the definition of 'text file' is rather broad/vague [1]

That definition carries over into POSIX 2008, pretty much unchanged:

http://www.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html

A file that contains characters organized into zero or more lines. The
lines do not contain NUL characters and none can exceed {LINE_MAX} bytes
in length, including the <newline> character. Although POSIX.1-2008 does
not distinguish between text files and binary files (see the ISO C
standard), many utilities only produce predictable or meaningful output
when operating on text files. The standard utilities that have such
restrictions always specify "text files" in their STDIN or INPUT FILES
sections.

But there is no ambiguity: the definition excludes directories, in part
because directory entries contain NUL bytes (guaranteed by the wording in
http://www.opengroup.org/onlinepubs/9699919799/basedefs/dirent.h.html#tag_13_08,
which states that d_name includes a terminating NUL), and in part because
the last directory entry does not necessarily end in a newline.
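
To make the quoted definition concrete, here is a minimal sketch of a check
for its three conditions: no NUL bytes, no line longer than a given limit
(newline included), and every line terminated by a newline. The name
looks_like_text and the line_max parameter are hypothetical, chosen only for
illustration; the sketch counts bytes rather than characters and is not how
any standard utility is known to validate its input.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of POSIX or gawk): returns true if the
 * len bytes at buf satisfy the quoted text-file definition, with
 * line_max standing in for {LINE_MAX}. */
bool looks_like_text(const char *buf, size_t len, size_t line_max)
{
    size_t line_len = 0;          /* bytes seen so far in the current line */

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            return false;         /* lines must not contain NUL */
        line_len++;
        if (line_len > line_max)
            return false;         /* line too long, counting the newline */
        if (buf[i] == '\n')
            line_len = 0;         /* line is complete */
    }
    /* Leftover bytes with no trailing newline do not form a line. */
    return line_len == 0;
}
```

Under this sketch an empty buffer passes (zero lines), while input with an
embedded NUL or an unterminated final line fails, which are the two
properties of directory contents mentioned above.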

> 
> $ echo -e -n '1\n1\x002\x003\x004\x00'| gawk 'BEGIN{RS="\0"}{ print "-"$0"-"}'

The fact that gawk supports NUL as a record separator is an extension; it
is not required by POSIX.  You did not give gawk a text file in this
example, but gawk went ahead and used the final unterminated line as though
a newline had been present in the original.  (By the way, printf is more
portable than echo, since the handling of -e, -n and backslash escapes by
echo varies between implementations.)

> *awk operates on stdin as well, whose type is undefined

Have you ever heard of fstat?  It is easy to determine if stdin is a
directory, in which case an error can be produced.

> 
> if we don't allow for a
> dir to be treated like a line-oriented file, then getline should return -1
> like for any other error condition

You just restated my argument - the only sane way to handle directories is
to consistently make them cause an error.
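
As an illustration of "consistently an error", here is a sketch of an open
wrapper that fails for directories the same way it fails for any other
problem: it returns -1 with errno set. The name open_for_reading is
hypothetical, and this is not gawk's actual code, only one way the policy
could look on a POSIX system.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical wrapper: opens path read-only and returns the descriptor.
 * Returns -1 with errno set on any failure; a directory is treated as a
 * failure and reported as EISDIR. */
int open_for_reading(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;                /* errno already set by open */

    if (fstat(fd, &st) != 0) {
        int saved = errno;        /* close may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    if (S_ISDIR(st.st_mode)) {
        close(fd);
        errno = EISDIR;           /* same error path as any other failure */
        return -1;
    }

    return fd;
}
```

With a wrapper like this, a caller such as a getline implementation needs
only one error branch, and a directory never reaches the record-splitting
code.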

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklgzoUACgkQ84KuGfSFAYBphgCguKm4hDIoeDxdxNKWD8cqD9Ex
ZcAAoMySOt8wtCZGE3qA2LeylLMjzjUS
=/Qpj
-----END PGP SIGNATURE-----



