mv/cp problem on SMP machines.
From: Michael Gaughen
Subject: mv/cp problem on SMP machines.
Date: Tue, 08 Jan 2002 15:17:25 -0800
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120
Hello,
I may have found a somewhat obscure race when moving or
copying files on an SMP machine.
fileutils version 4.1 and fileutils 4.1.5 (latest: 1/06/2002)
Linux kernel version 2.4.7-10 (Redhat 7.2)
ext2 filesystem
The problem occurs when more than one process is attempting
to mv/cp the same file. For example:
#mkdir test
#cd test
#touch a
#ls -lia
total 8
57 drwxrwxr-x 2 mgaughen mgaughen 4096 Jan 8 14:04 .
142739 drwxrwxr-x 7 mgaughen mgaughen 4096 Jan 8 14:04 ..
59 -rw-rw-r-- 1 mgaughen mgaughen 0 Jan 8 14:04 a
#mv a b (Process 1)
#mv a b (Process 2)
To execute the moves at the _same_ time on my SMP box, I am
using a program called hydra. It basically allows me to send
commands to multiple login sessions at the same time.
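For readers without hydra, here is a minimal Python sketch of one way to
approximate a simultaneous start. It is an illustration only: it calls
os.rename() directly from threads released by a barrier, so it exercises
just the rename(2) step and not mv's own stat-then-rename sequence. The
name race_rename is hypothetical.

```python
import os
import tempfile
import threading


def race_rename(directory, workers=2):
    """Start `workers` threads that all rename a -> b at nearly the same time.

    Returns one entry per worker: None on success, otherwise the errno
    of the OSError that worker received.
    """
    src = os.path.join(directory, "a")
    dst = os.path.join(directory, "b")
    open(src, "w").close()  # equivalent of `touch a`

    barrier = threading.Barrier(workers)
    results = [None] * workers

    def worker(index):
        barrier.wait()  # hold every worker here, then release them together
        try:
            os.rename(src, dst)
        except OSError as exc:
            results[index] = exc.errno

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(race_rename(scratch), sorted(os.listdir(scratch)))
```

At this level exactly one worker should succeed and the other should get
ENOENT, which is the outcome expected from mv as well.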
In all cases, I would expect to see one of the processes perform the
mv successfully, while the other process fails with this error msg:
mv: cannot stat `a': No such file or directory
However, when the race occurs, this message is produced instead:
mv: `a' and `b' are the same file
Executing another 'ls' gives:
#ls -lia
total 8
57 drwxrwxr-x 2 mgaughen mgaughen 4096 Jan 8 14:04 .
142739 drwxrwxr-x 7 mgaughen mgaughen 4096 Jan 8 14:04 ..
File "b" does not exist!
After looking through the mv/cp source code, I found that the
problem was in copy_internal(). When the race occurs, both
process 1 and 2 are able to stat(2) the src_path ("a"). Then,
process 1 is able to execute the rename(2) first. Process 2
comes along and also attempts the rename. However, "a" doesn't
exist anymore, so rename returns ENOENT. copy_internal() assumes
that a cross-device 'mv' is being executed, and proceeds to
unlink(2) the dst_path ("b"), followed by a call to copy_reg().
However, the call to open(2), in copy_reg(), fails since "a"
still doesn't exist. At that point the message:
mv: `a' and `b' are the same file
is printed. The message doesn't make sense in this case, because
"a" and "b" are _not_ the same file.
Part of the problem is the ugly race between stat (path lookups)
and rename under Linux (and other OSes?!?). But it seems to me
that copy_internal() could be made a bit more robust. If rename
returns ENOENT, instead of assuming that a cross-device 'mv' was
being attempted (which was not the case), copy_internal() could
print an error message and return. Is there a reason why that
would be bad?
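A minimal sketch of that suggestion, written in Python rather than as a
patch to copy_internal(): fall back to copy-and-delete only when rename
reports EXDEV, and let every other error (ENOENT included) propagate
without touching the destination. The function name move and its rename
parameter are hypothetical; the parameter exists only so that the
cross-device case can be simulated.

```python
import errno
import os
import shutil


def move(src, dst, rename=os.rename):
    """Rename src to dst; copy and delete only for a cross-device move."""
    try:
        rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            # ENOENT, EACCES, ...: report the error and leave dst alone.
            raise
    # Only a genuine cross-device rename reaches this point.
    shutil.copy2(src, dst)
    os.unlink(src)
    return "copied"
```

With this structure, the losing process in the race gets an error about
the missing source, and the file already renamed to "b" by the winning
process is never unlinked.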
I am not subscribed to bugs-fileutils, so if you could CC me on
any replies, that would be great.
Comments? Flames?
Thanks,
-Mike Gaughen