Re: [bug-recutils] GSoC: Ideas for Recutils

From: Jose E. Marchesi
Subject: Re: [bug-recutils] GSoC: Ideas for Recutils
Date: Sat, 31 Mar 2012 11:12:44 +0200
    >     The ideas page mentions determining if the index is up to date.  I
    >     see no other practical solution than using filesystem metadata of the
    >     database file (checksumming the file contents should be much slower
    >     than doing a simple query using a tree index).
    > We could have a "checksum" comment at the end of the rec file, which
    > would be generated by recfix when creating the index file.  The problem
    > with this approach is that the creation of the index won't be completely
    > decoupled from the recfile itself, but that may not be really a problem.
    Another problem is that users editing recfiles with a text editor might
    forget to change the comment, leading to incorrect query results (while
    other solutions would give correct results slowly).

Hm, yes.  Using hashes in comments is not a good idea.

    Python and Mercurial use modification timestamps to avoid reading
    files (modules to compile into cached bytecode or versioned files to
    diff), I haven't observed any problems with reliability of this

What about using a heuristic based on several properties of the file,
such as its size, last modification date, etc.?  That could give a
success rate high enough to be a practical solution.
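
To make the idea concrete, here is a rough sketch in Python (just for
illustration, not how recutils does or would do it).  It assumes the
index stores the recfile's size and modification time at the moment it
is built; the names FileStamp, take_stamp and index_is_fresh are
hypothetical.

```python
import os
from typing import NamedTuple


class FileStamp(NamedTuple):
    """Hypothetical metadata saved alongside the index when it is built."""
    size: int      # file size in bytes
    mtime_ns: int  # last modification time, in nanoseconds


def take_stamp(path: str) -> FileStamp:
    """Read the properties the heuristic compares (one stat call, no file read)."""
    st = os.stat(path)
    return FileStamp(size=st.st_size, mtime_ns=st.st_mtime_ns)


def index_is_fresh(recfile_path: str, recorded: FileStamp) -> bool:
    """Return True if the recfile still matches the stamp stored with the index.

    This is only a heuristic: an edit that keeps the size unchanged and lands
    within the filesystem's timestamp granularity would go unnoticed.
    """
    try:
        return take_stamp(recfile_path) == recorded
    except FileNotFoundError:
        # The recfile is gone, so the index cannot be trusted.
        return False
```

When the check fails, the query would fall back to a plain scan of the
recfile (or rebuild the index), so a false "stale" answer only costs
time, while a false "fresh" answer is the case the extra properties are
meant to make unlikely.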

