bug-coreutils

Re: [PATCH] cp --update --preserve


From: Paolo Montrasio
Subject: Re: [PATCH] cp --update --preserve
Date: Thu, 15 Jan 2004 12:20:39 +0100

With this message I'm replying to both Jim and Paul at once,
because I don't want to split the thread.

The patch proposed by Jim seems equivalent to mine, but
1) I might be wrong and
2) his patch is coded more tightly, which is something I appreciate, so
I'm not complaining :-)

Jim wrote:
> 2) In any case, I'm not sure that such a change would be a good idea,
> since it does subvert the semantics of --update.  But I'm open
> to arguments either way.

I imagine you wrote that because the patch ignores the nanosecond
component of the timestamp, which could be bad if cp is running
on a file system that handles it correctly.
Paul's second suggestion probably solves this problem, but I don't
think that it can be implemented the way he suggests (or I'm missing
something).

Paul proposes to examine the actual time resolution of the file system
when copying the file. However, I think that we should do it when checking
whether the destination file needs updating, that is, before copying it.
The reason is that when I'm updating an existing file I don't want
to create a second one (not even an empty one) before realizing that the
old one is still up to date.

This leaves us with the problem of when we should perform
the check.

First of all, you should read this message from the FreeBSD mailing list:
http://www.freebsd.org/cgi/query-pr.cgi?pr=47168
The message is a reply to someone complaining that the nanoseconds
are set to 0 by most system calls. It includes some caveats:

> - some filesystems can't support nsec resolution.
> - some filesystems that could support it don't.
> - copying and archiving utilities can't support full nsec resolution, since
>   there is no syscall to set it.  utimes() sets times in usec.
> - some copying and archiving utilities that could support usec resolution
>   don't.

This also matches my experience :-) and the utime(2) manual page.
The results returned by
http://www.google.com/search?q=utime+nanoseconds are quite
interesting too.
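
To illustrate the caveat about utimes() quoted above, here is a small
Python sketch of what happens to a nanosecond timestamp when it is stored
through an interface that only accepts microseconds. It is not part of
either patch; the helper names and the sample timestamp are made up for
the example.

```python
NS_PER_SEC = 1_000_000_000


def to_timeval(timestamp_ns):
    """Split a nanosecond timestamp into (seconds, microseconds),
    the precision a utimes()-style call can store."""
    sec, nsec = divmod(timestamp_ns, NS_PER_SEC)
    return sec, nsec // 1000


def after_utimes(timestamp_ns):
    """Return the timestamp as it would read back after being set
    with microsecond precision (sub-microsecond digits are lost)."""
    sec, usec = to_timeval(timestamp_ns)
    return sec * NS_PER_SEC + usec * 1000


# Arbitrary sample value: some second plus 123456789 ns.
source_mtime = 1074165639 * NS_PER_SEC + 123_456_789
copied_mtime = after_utimes(source_mtime)
lost = source_mtime - copied_mtime

print(copied_mtime % NS_PER_SEC)    # sub-second part kept on the copy
print(lost)                         # nanoseconds dropped by the copy
print(source_mtime > copied_mtime)  # source now looks newer than the copy
```

Under these assumptions the copy ends up slightly older than its source,
so a strict "is the source newer?" test keeps answering yes.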

Given those considerations, I can think of two ways to solve our
problem: one at compile time and one at run time.

The compile-time solution is to embed a check for the time resolution
of the file system into configure and build cp accordingly. However,
this works only if you're using cp on one system and on one file system.
Furthermore, you can't build binary distributions that work reliably for
everybody. This is bad.

So, a proper check should be done at run time on the destination file
system before starting to copy the file. Paul's suggestion can be
modified in this way:

1) read the timestamp of the existing destination file (if it doesn't
exist yet we don't have any problem and we can resume normal
processing)
2) write the current time to the destination file's atime
3) find out how many significant digits there are in atime
4) restore the original timestamp
5) perform the check according to the number of significant digits
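
Steps 1 to 4 can be sketched in Python roughly as follows. This is only
an illustration of the idea, not code from cp: the function names are
hypothetical, and it assumes the process is allowed to change the file's
timestamps. Note that setting the times also updates the file's ctime,
which cannot be restored.

```python
import os
import time

NS_PER_SEC = 1_000_000_000


def inferred_resolution_ns(timestamp_ns):
    """Guess the resolution from the trailing zeros of the sub-second
    part: 1 means full nanoseconds, NS_PER_SEC means whole seconds."""
    nsec = timestamp_ns % NS_PER_SEC
    if nsec == 0:
        return NS_PER_SEC
    resolution = 1
    while nsec % 10 == 0:
        nsec //= 10
        resolution *= 10
    return resolution


def probe_atime_resolution(path):
    """Write the current time to path's atime, read it back, and
    return the resolution the file system appears to have kept."""
    original = os.stat(path)                                  # step 1
    try:
        now_ns = time.time_ns()
        os.utime(path, ns=(now_ns, original.st_mtime_ns))     # step 2
        stored_ns = os.stat(path).st_atime_ns
        return inferred_resolution_ns(stored_ns)              # step 3
    finally:
        # Step 4: put the original atime and mtime back, even on error.
        os.utime(path, ns=(original.st_atime_ns, original.st_mtime_ns))
```

The try/finally narrows the window in which an interrupted run leaves a
modified atime behind, but it cannot protect against the process being
killed outright.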

Example:

On my system the check in step 3 would find that there are no significant
digits, so cp wouldn't compare the nanoseconds at all.
On another system that supports a time resolution of multiples of
100 nanoseconds, the check wouldn't take into account the last two digits
of the timestamp.
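
The comparison in step 5 could look like the following Python sketch.
The names are invented for the example, and the resolution is assumed to
be given in nanoseconds (100 for the second system above, one full
second for mine).

```python
NS_PER_SEC = 1_000_000_000


def truncate(timestamp_ns, resolution_ns):
    """Drop the digits that lie below the given resolution."""
    return timestamp_ns - timestamp_ns % resolution_ns


def source_is_newer(source_mtime_ns, dest_mtime_ns, resolution_ns):
    """Compare two timestamps, ignoring digits below the resolution."""
    return (truncate(source_mtime_ns, resolution_ns)
            > truncate(dest_mtime_ns, resolution_ns))


src = 10 * NS_PER_SEC + 123_456_789
dst = 10 * NS_PER_SEC + 123_456_700

# 100 ns resolution: the last two digits are ignored, no update needed.
print(source_is_newer(src, dst, 100))
# Full nanosecond resolution: the source really is newer.
print(source_is_newer(src, dst, 1))
# One-second resolution: the nanoseconds are not compared at all.
print(source_is_newer(src, dst, NS_PER_SEC))
```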

There are a couple of problems with this approach:

1) If for some reason cp terminates between steps 2 and 4 we're left with
a destination file that has a changed atime. That's a nuisance, but
the probability that it happens is really slim (a kill sent to cp being the
most likely cause I can think of) and it doesn't leave the file in a totally
incorrect state (after all, we did access the file).
2) The finer the resolution, the higher the chances are that you get some
zeros in the least significant digits of the timestamp. For instance, if you
have full nanosecond resolution, the program will believe in 10% of cases
that you have a resolution of 10 nanoseconds or coarser. Luckily, you
probably don't perform two consecutive accesses to the same file within
those few nanoseconds.
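
The 10% figure can be checked with a little arithmetic. The Python
sketch below assumes that the nanosecond field of the probe timestamp is
uniformly distributed over 0..999999999; the function name is made up
for the example.

```python
from fractions import Fraction

NS_PER_SEC = 10 ** 9


def misdetection_probability(extra_zeros):
    """Probability that a uniformly random nanosecond value ends with
    at least extra_zeros zeros (0 <= extra_zeros <= 9), which makes a
    1 ns file system look 10**extra_zeros times coarser."""
    multiples = NS_PER_SEC // 10 ** extra_zeros
    return Fraction(multiples, NS_PER_SEC)


print(misdetection_probability(1))  # mistaken for 10 ns or coarser
print(misdetection_probability(2))  # mistaken for 100 ns or coarser
```

Each additional wrongly ignored digit is ten times less likely than the
previous one.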

So, maybe we have a way to fix the behaviour of cp and preserve its
semantics.

By the way, now I have a patched version of cp on my system.
I called it cp2 and I'm using it only for my backup script :-)

Paolo
--
Play Go! http://www.figg.org




