bug-fileutils


mv/cp problem on SMP machines.


From: Michael Gaughen
Subject: mv/cp problem on SMP machines.
Date: Tue, 08 Jan 2002 15:17:25 -0800
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

Hello,

  I may have found a somewhat obscure race when moving or
copying files on an SMP machine.

fileutils version 4.1 and fileutils 4.1.5 (latest: 1/06/2002)
Linux kernel version 2.4.7-10 (Redhat 7.2)
ext2 filesystem

 The problem occurs when more than one process is attempting
to mv/cp the same file.  For example:

 #mkdir test
 #cd test
 #touch a
 #ls -lia
 total 8
     57 drwxrwxr-x    2 mgaughen mgaughen     4096 Jan  8 14:04 .
 142739 drwxrwxr-x    7 mgaughen mgaughen     4096 Jan  8 14:04 ..
     59 -rw-rw-r--    1 mgaughen mgaughen        0 Jan  8 14:04 a
 #mv a b (Process 1)
 #mv a b (Process 2)

To execute the moves at the _same_ time on my SMP box, I am
using a program called hydra.  It lets me send the same
commands to multiple login sessions simultaneously.
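
For anyone without hydra, here is a minimal sketch of driving the two
competing moves from a single C program.  It is not taken from fileutils:
race_rename() and the pipe used as a start gate are assumptions made for
illustration.  Each child does the same stat(2)-then-rename(2) sequence
that mv performs, and both are released together when the parent closes
the pipe.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of fileutils): fork two children that each
   stat(2) SRC and then rename(2) it to DST.  Both children block on a pipe
   and are released together when the parent closes the write end.
   Returns the number of children whose rename succeeded, or -1 if the
   pipe or a fork could not be created. */
int race_rename(const char *src, const char *dst)
{
    int gate[2];
    if (pipe(gate) != 0)
        return -1;

    for (int i = 0; i < 2; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            close(gate[0]);
            close(gate[1]);
            return -1;
        }
        if (pid == 0) {
            char c;
            struct stat sb;
            close(gate[1]);
            /* Returns 0 (EOF) once every write end has been closed. */
            if (read(gate[0], &c, 1) < 0)
                _exit(3);
            /* Both children may pass this check before either renames. */
            if (stat(src, &sb) != 0)
                _exit(1);               /* the expected "cannot stat" case */
            if (rename(src, dst) == 0)
                _exit(0);               /* this child won */
            _exit(errno == ENOENT ? 2 : 3);  /* 2: lost inside the window */
        }
    }
    close(gate[0]);
    close(gate[1]);                     /* EOF releases both children */

    int winners = 0;
    for (int i = 0; i < 2; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            winners++;
    }
    return winners;
}
```

Called as race_rename("a", "b") inside the test directory, at most one
child can win, because rename(2) itself is atomic.  Whether the loser
fails at stat (exit 1) or at rename (exit 2) depends on timing, and the
second outcome is the window described in this report.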

In all cases, I would expect to see one of the processes perform the
mv successfully, while the other process fails with this error msg:

mv: cannot stat `a': No such file or directory

However, when the race occurs, this message is produced instead:

mv: `a' and `b' are the same file

Executing another 'ls' gives:

#ls -lia
 total 8
     57 drwxrwxr-x    2 mgaughen mgaughen     4096 Jan  8 14:04 .
 142739 drwxrwxr-x    7 mgaughen mgaughen     4096 Jan  8 14:04 ..

File "b" does not exist!

After looking through the mv/cp source code, I found that the
problem was in copy_internal().  When the race occurs, both
process 1 and 2 are able to stat(2) the src_path ("a").  Then,
process 1 is able to execute the rename(2) first.  Process 2
comes along and also attempts the rename.  However, "a" doesn't
exist anymore, so rename returns ENOENT.  copy_internal() assumes
that a cross-device 'mv' is being executed, and proceeds to
unlink(2) the dst_path ("b"), followed by a call to copy_reg().
However, the call to open(2), in copy_reg(), fails since "a"
still doesn't exist.  At that point the message:

 mv: `a' and `b' are the same file

is printed.  The message doesn't make sense in this case, because
"a" and "b" are _not_ the same file.

Part of the problem is the ugly race between stat (path lookups)
and rename under Linux (and other OSes?!?).  But it seems to me
that copy_internal() could be made a bit more robust.  If rename
returns ENOENT, instead of assuming that a cross-device 'mv' was
being attempted (which was not the case), copy_internal() could
print an error message and return.  Is there a reason why that
would be bad?
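
To make the suggestion concrete, here is a rough sketch of the errno
check, written as a standalone function rather than as a patch to
copy_internal().  The names try_rename() and enum move_result are
hypothetical, and the error text is only a placeholder.  The idea is that
EXDEV alone selects the copy-and-unlink fallback, while any other errno
(including ENOENT from a lost race) is reported and the destination is
left alone.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical outcome codes for this sketch; not fileutils identifiers. */
enum move_result { MOVE_DONE, MOVE_NEEDS_COPY, MOVE_FAILED };

/* Try rename(2) and classify the result.  Only EXDEV is treated as
   "cross-device, fall back to copy + unlink".  Every other failure is
   reported on stderr, and DST is never unlinked by this function. */
enum move_result try_rename(const char *src, const char *dst)
{
    if (rename(src, dst) == 0)
        return MOVE_DONE;

    if (errno == EXDEV)
        return MOVE_NEEDS_COPY;   /* caller would do the copy fallback */

    int saved = errno;            /* fprintf may change errno */
    fprintf(stderr, "mv: cannot move `%s' to `%s': %s\n",
            src, dst, strerror(saved));
    errno = saved;                /* let the caller inspect the cause */
    return MOVE_FAILED;
}
```

With this shape, process 2 in the example above would get MOVE_FAILED
with errno set to ENOENT, and "b" would still be there afterwards.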

I am not subscribed to bug-fileutils, so if you could CC me on
any replies, that would be great.

Comments? Flames?

Thanks,
-Mike Gaughen








