bug#16361: compile cache confused about file identity
From: Mark H Weaver
Subject: bug#16361: compile cache confused about file identity
Date: Wed, 01 Oct 2014 15:22:58 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
tags 16361 + notabug wontfix
close 16361
thanks
Zefram <address@hidden> writes:
> The automatic cache of compiled versions of scripts in guile-2.0.9
> identifies scripts mainly by name, and partially by mtime. This is not
> actually sufficient: it is easily misled by a pathname that refers to
> different files at different times. Test case:
>
> $ echo '(display "aaa\n")' >t13
> $ echo '(display "bbb\n")' >t14
> $ guile-2.0 t13
> ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
> ;;; or pass the --no-auto-compile argument to disable.
> ;;; compiling /home/zefram/usr/guile/t13
> ;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
> aaa
> $ mv t14 t13
> $ guile-2.0 t13
> aaa
>
> You can see that the mtime is not fully used here: the cache is misapplied
> even if there is a delay of seconds between the creations of the two
> script files. The cache's mtime check will only notice a mismatch if
> the script currently seen under the supplied name was modified later
> than when the previous script was *compiled*.
>
> Obviously, in this test case the cache could trivially distinguish the
> two script files by looking at the inode numbers. On its own the inode
> number isn't sufficient, but exact match on device, inode number, and
> mtime would be far superior to the current behaviour, only going wrong
> in the presence of deliberate timestamp manipulation. As a bonus, if
> the cache were actually *keyed* by inode number and device, rather than
> by pathname, it would retain the caching of compilation across renamings
> of the script.
>
> Or, even better, the cache could be keyed by a cryptographic hash of the
> file contents. This would be immune even to timestamp manipulation, and
> would preserve the cached compilation even across the script being copied
> to a fresh file or being edited and reverted. This would be a cache
> worthy of the name. The only downside is the expense of computing the
> hash, but I expect this is small compared to the expense of compilation.
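For illustration, the scenario in the report can be reproduced outside Guile. The following is a minimal Python sketch, not Guile's actual code: `is_fresh_by_mtime` models the check as the report describes it (the cached file is reused unless the source is newer than the cache), `identity` is a hypothetical (device, inode, mtime) key of the kind the report proposes, and `t13.go` in a temporary directory is only a stand-in for the real cache entry.

```python
import os
import tempfile


def is_fresh_by_mtime(source, compiled):
    """Freshness test as the report describes it: the cached file is
    reused unless the source was modified after the cache was written."""
    return os.stat(source).st_mtime_ns <= os.stat(compiled).st_mtime_ns


def identity(path):
    """Hypothetical stricter key: (device, inode, mtime) of the source."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_mtime_ns)


def demo():
    with tempfile.TemporaryDirectory() as d:
        t13 = os.path.join(d, "t13")
        t14 = os.path.join(d, "t14")
        cache = os.path.join(d, "t13.go")  # stand-in for the cached .go file

        for path, text in ((t13, '(display "aaa\\n")'),
                           (t14, '(display "bbb\\n")')):
            with open(path, "w") as f:
                f.write(text)

        # Fixed timestamps: both scripts are older than the "compilation".
        os.utime(t13, ns=(1_000_000_000, 1_000_000_000))
        os.utime(t14, ns=(2_000_000_000, 2_000_000_000))
        with open(cache, "w") as f:
            f.write("compiled from the original t13")
        os.utime(cache, ns=(3_000_000_000, 3_000_000_000))

        recorded = identity(t13)   # what a stricter cache would remember
        os.replace(t14, t13)       # same effect as `mv t14 t13`

        stale_hit = is_fresh_by_mtime(t13, cache)      # cache still accepted
        identity_matches = identity(t13) == recorded   # stricter key differs
        return stale_hit, identity_matches
```

Calling `demo()` shows the mtime comparison still accepting the cached file after the rename, while the (device, inode, mtime) key no longer matches, because a rename keeps the moved file's own inode and mtime.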
You could make the same complaint about 'make', 'rsync', or any number
of other programs.  It's true that a cryptographic hash would be more
robust, but it would also be considerably more expensive in the common
case where the .go file is already in the cache.

I don't think it's worth paying this cost every time a .go file is
loaded, to guard against the unlikely scenario you outlined above.

The mtime check is very widely used and is accepted practice.
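To make the cost difference concrete, here is a minimal Python sketch of the two kinds of lookup, under the assumption that a content-keyed cache would hash the source on every load. The names `content_key` and `mtime_key` are hypothetical and the choice of SHA-256 is arbitrary; none of this is taken from Guile.

```python
import hashlib
import os
import tempfile


def content_key(path, chunk_size=65536):
    """Hypothetical content-based key: SHA-256 of the file contents.
    Every lookup has to read the entire source file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def mtime_key(path):
    """Metadata-only check: one stat call, no file contents are read."""
    return os.stat(path).st_mtime_ns


def demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.scm")
        b = os.path.join(d, "b.scm")
        for path in (a, b):
            with open(path, "w") as f:
                f.write('(display "aaa\\n")')

        # Identical contents, deliberately different timestamps.
        os.utime(a, ns=(1_000_000_000, 1_000_000_000))
        os.utime(b, ns=(2_000_000_000, 2_000_000_000))

        return content_key(a) == content_key(b), mtime_key(a) == mtime_key(b)
```

The content key treats the two copies as the same script regardless of timestamps, but computing it means reading the whole source on each load, whereas the mtime comparison needs only file metadata.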
I'm closing this ticket.
Mark