bug-coreutils

bug#12339: Gnu rm, changed only recently (4-5 years), and didn't follow


From: Eric Blake
Subject: bug#12339: Gnu rm, changed only recently (4-5 years), and didn't follow letter of posix...(statement follows)
Date: Wed, 12 Sep 2012 18:28:34 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120828 Thunderbird/15.0

On 09/12/2012 04:51 PM, Linda Walsh wrote:
> I hope to prove the subject convincingly in the following sections.  If
> you can, reading this in the original HTML might be useful, as I don't
> know how it will end up when converted to text.

HTML mail is forbidden on this list; the mail engine stripped it to
plain text before it ever hit the list or the online bug report:
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=12339#239

> I tried to format it for
> readability .. so if the text format isn't...(still tried to limit
> margins and use monospace font)...

So please bear with me if my inability to see your intended markup means
that I misinterpret your intent.

> Before, "rm -r bbb/" was not valid syntax --

Sorry, but 'rm -r bbb/' has ALWAYS been valid syntax in POSIX, and has
always meant 'remove the directory found by resolving bbb', even if
'bbb' is a symlink to a directory.  The fact that the Linux kernel's
rmdir("bbb/") has not always followed POSIX in this regard makes it
harder to use as a use case, because then we are torn between honoring
the kernel's decision of how it should behave and paying the penalty of
a slower wrapper function that does it the way POSIX says it should
behave.  The best we can do is document which way we have chosen to go.

> II.  Basenames vs. dirnames.  What are they?  Basenames are the final
> part of a name that has been chosen to name the entry located in "some
> dir".

They are also well-defined terms in the POSIX standard.
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_40
http://pubs.opengroup.org/onlinepubs/9699919799/functions/basename.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/dirname.html

> Let's look at some different "Cases" of sample pathnames (skipping
> the easy ones, given the audience):
> 
> Case A:   "/"     What is the dirname and what is the basename?
> 
> Answer: <null> and <null>.

Unfortunately, your interpretation is not consistent with the POSIX
definitions.  The POSIX-mandated answer is a dirname of '/' and a
basename of '/'.  Additionally, this particular file name is always a
directory.

>  Case B: "ENTRY/"     What is the dirname and what is the basename?
> 
> Answer: Dirname="ENTRY", basename=<null>.  (Explanation seems unnecessary).

Unfortunately, your interpretation is not consistent with the POSIX
definitions.  The POSIX-mandated answer is a dirname of '.' and a
basename of 'ENTRY'.  Additionally, this particular file name must
always resolve to a directory (that is, either 'ENTRY' is a directory,
or 'ENTRY' is a symlink whose eventual target is a directory).
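
To make the trailing-slash rule concrete, here is a minimal Python sketch
(an illustration only, not something taken from POSIX or coreutils).  The
file names "regular", "target_dir" and "ENTRY" are made up for the
example, and it assumes a platform whose lstat() follows POSIX for
trailing slashes, which is not guaranteed everywhere:

```python
import errno
import os
import stat
import tempfile


def trailing_slash_demo() -> dict:
    """Show how a trailing slash changes what a name resolves to."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        regular = os.path.join(tmp, "regular")    # hypothetical regular file
        target = os.path.join(tmp, "target_dir")  # hypothetical directory
        link = os.path.join(tmp, "ENTRY")         # symlink to that directory
        open(regular, "w").close()
        os.mkdir(target)
        os.symlink(target, link)

        # Without a trailing slash, lstat() describes the symlink itself.
        results["link_is_symlink"] = stat.S_ISLNK(os.lstat(link).st_mode)

        # With a trailing slash, resolution has to end at a directory,
        # so the symlink is followed even by lstat().
        results["link_slash_is_dir"] = stat.S_ISDIR(
            os.lstat(link + "/").st_mode)

        # A regular file followed by a slash cannot resolve at all.
        try:
            os.lstat(regular + "/")
            results["regular_slash_enotdir"] = False
        except OSError as exc:
            results["regular_slash_enotdir"] = exc.errno == errno.ENOTDIR
    return results


if __name__ == "__main__":
    for key, value in trailing_slash_demo().items():
        print(key, value)
```

All three values should come out true on a conforming system; a platform
that deviates from POSIX here would show up as a false entry.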

>  Case C: "ENTRY"    Dir and Basenames?
> 
> Answer:  it depends on context and what it really is.

Unfortunately, your interpretation is not consistent with the POSIX
definitions.  The POSIX-mandated answer is a dirname of '.' and a
basename of 'ENTRY'.  But you are correct that it may or may not be a
directory.

> 
> Whether or not 'ENTRY' is a basename or a dirname depends on whether or
> not 'ENTRY' is a directory.

Rather, the POSIX definition states that 'ENTRY' is a basename because
it is the last non-slash component of the overall string, after
stripping any trailing slashes.  Remember, a directory is ALSO a file,
just like a symlink or a socket or a character device is also a file.
The term basename applies to any file, not just non-directory files.
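
The string rules above can be sketched in a few lines of Python.  This is
a simplified illustration of the POSIX basename()/dirname() algorithms,
not the libc implementation; the function names posix_basename and
posix_dirname are invented for this example, and the handling of a
leading "//" (which POSIX leaves implementation-defined) is ignored:

```python
def posix_basename(path: str) -> str:
    """Last non-slash component, after stripping trailing slashes."""
    if path == "":
        return "."          # POSIX allows "." for the empty string
    stripped = path.rstrip("/")
    if stripped == "":
        return "/"          # the string was made up entirely of slashes
    return stripped.rsplit("/", 1)[-1]


def posix_dirname(path: str) -> str:
    """Everything before the last component, or "." if there is none."""
    if path == "":
        return "."
    stripped = path.rstrip("/")
    if stripped == "":
        return "/"          # the string was made up entirely of slashes
    if "/" not in stripped:
        return "."          # implied current directory
    head = stripped.rsplit("/", 1)[0].rstrip("/")
    return head if head else "/"


if __name__ == "__main__":
    for name in ["/", "ENTRY/", "ENTRY", "/usr/lib"]:
        print(repr(name), posix_dirname(name), posix_basename(name))
```

Note that this is purely string manipulation: no file system is
consulted, so the answer cannot depend on whether 'ENTRY' is a
directory.  (Python's own os.path.basename("ENTRY/") returns an empty
string, which is a Python convention and not the POSIX definition.)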

> 
> Example.  In its default mode rm removes only files.

Correction - in its default mode, rm removes regular files, symlinks,
fifos, block devices, character devices, socket files, and any other
implementation extension file types, but has special case treatment of
directory files.  This special case treatment is mandated by POSIX, to
match historical practice.

> "rm a b c ENTRY d" -- rm expects all entries to be simple basenames.  If
> it encounters a dirname, it issues an error and refuses to operate on it.

Correcting your terminology:
If it encounters a *directory*, it issues an error and refuses to
operate on it.
Remember, _every single file name_ has both a dirname and a basename,
even if the dirname is the implied '.', so all five of the arguments in
your examples include a basename as part of the name.  But that does not
tell us whether the file is a directory, a regular file, or some other
type of file.
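
A small Python sketch of that distinction, under the assumption that
'ENTRY' happens to be a directory and the other four names are regular
files (the helper names classify and demo are invented for this example).
The basename comes from the string alone; the file type has to be asked
of the file system with lstat():

```python
import os
import stat
import tempfile


def classify(path: str) -> str:
    """Report the file type of PATH without following a final symlink."""
    mode = os.lstat(path).st_mode
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISFIFO(mode):
        return "fifo"
    return "other"


def demo() -> dict:
    """Map each argument to (basename, file type) in a scratch directory."""
    names = ["a", "b", "c", "ENTRY", "d"]
    with tempfile.TemporaryDirectory() as tmp:
        for name in names:
            path = os.path.join(tmp, name)
            if name == "ENTRY":
                os.mkdir(path)            # assumed to be a directory
            else:
                open(path, "w").close()   # assumed to be regular files
        return {name: (os.path.basename(name),
                       classify(os.path.join(tmp, name)))
                for name in names}


if __name__ == "__main__":
    for name, (base, kind) in demo().items():
        print(name, base, kind)
```

Every argument has a basename equal to itself here, yet only one of them
is a directory.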

> 
> In its recursive mode, rm will accept basenames and dirnames.  It will
> inspect each entry to determine if it is a file (basename) or a
> dir (dirname).

Again correcting your terminology:
rm will accept directories in addition to other file types.

> As POSIX states:
> 
> If, on the other hand, you specify the "-r" switch to rm, it enables it
> to remove directories, but it doesn't treat them the same as other files
> (because they still are not 'basenames'  - they are directories).

Wrong terminology, but you are correct that directories are handled
differently.  That's because unlink() and rmdir() are different
functions, so directories MUST be handled differently.

> Only after the contents have been removed is it no longer a dirpath --

This statement makes no sense in light of POSIX terminology.  Maybe you
meant:
Only after the contents have been removed is it an empty directory that
can be acted on by rmdir().
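
As a rough sketch of that ordering in Python (this is NOT how GNU rm is
implemented, which has to deal with races, permissions and very deep
trees; remove_tree is a name invented for this example):

```python
import os
import stat


def remove_tree(path: str) -> None:
    """Depth-first removal: unlink() non-directories, rmdir() directories."""
    mode = os.lstat(path).st_mode       # lstat: never follow a symlink
    if stat.S_ISDIR(mode):
        # A directory must be emptied before rmdir() can succeed.
        for name in os.listdir(path):
            remove_tree(os.path.join(path, name))
        os.rmdir(path)
    else:
        # Regular files, symlinks, fifos, devices and sockets all go
        # through unlink(); a symlink is removed, not its target.
        os.unlink(path)
```

Calling os.rmdir() on a directory that still has entries fails with an
OSError, which is why the contents are removed first.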

> There are 2 entries not usually covered by security policies, as
> they are not discretionary entries -- they are mandatory components of
> the OS, namely "." and "..".

Actually, believe it or not, '.' and '..' are NOT mandatory directory
entries in POSIX.  It is mandatory that they be handled in file name
resolution, but it is possible to have a file system where readdir()
never returns '.' or '..'.
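
The difference between listing and resolution can be seen from Python,
with one caveat: os.listdir() is documented to leave out '.' and '..'
even when the underlying readdir() returns them, so this sketch only
shows what an application would observe on such a file system, not what
any particular file system does.  The function name dot_entries_demo is
invented for this example:

```python
import os
import tempfile


def dot_entries_demo():
    """Compare what a listing returns with what name resolution accepts."""
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "file"), "w").close()

        # The listing contains only the entry we created.
        listed = os.listdir(tmp)

        # Resolution still understands "." and ".." as path components.
        dot_is_self = os.path.samefile(tmp, os.path.join(tmp, "."))
        dotdot_is_parent = os.path.samefile(
            os.path.dirname(tmp), os.path.join(tmp, ".."))
    return listed, dot_is_self, dotdot_is_parent


if __name__ == "__main__":
    print(dot_entries_demo())
```

A program that walks a directory therefore must not rely on seeing '.'
or '..' in the listing, even though paths that use them resolve.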

> As POSIX is a computer portability standard, one would imagine that they
> know the difference between dirnames and basenames

Yes, POSIX has a self-consistent definition of those terms.

> and that a dirname
> can only be treated as a basename when it has been emptied of files.

No, that is not a consistent statement in light of the POSIX definition
of the terms.

-- 
Eric Blake   address@hidden    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
