Re: [Bug-gnupedia] Re: A Detailed Proposal - Mk I
From: Imran Ghory
Subject: Re: [Bug-gnupedia] Re: A Detailed Proposal - Mk I
Date: Sat, 20 Jan 2001 23:10:10 -0000
On 20 Jan 2001, at 9:33, Mike Warren wrote:
> "Imran Ghory" <address@hidden> writes:
>
> > I think that unique IDing should be done on the content of the
> > article not on data such as the server it was submitted to. After
> > all we want two articles which are the same to be fingerprinted the
> > same regardless of external facts such as which server it has been
> > submitted to.
> >
> > I think MD5 (rfc1321) fingerprinting of the article would be the
> > best way to do this.
>
> Even simpler, we could just sequentially assign 128 (or 256 or 512...)
> bit integers to the articles as they arrive at the ``central'' server,
> especially since the articles' content will change over time...
Do we really want a central point of failure?
The first version should be kept; fingerprinting the original would
encourage servers to keep older articles.
Alternatively, as I suggested before, updates to an article could be
stored as diffs. If the diffs get too large, it would be a good sign that
the article is hardly the same as the original and should be given a
new fingerprint to indicate this.
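A rough sketch of the diff-size check, again in Python. The threshold `MAX_DIFF_RATIO`, the function names, and the choice of counting changed lines are all hypothetical; any reasonable measure of "too large" would do.

```python
import difflib

# Hypothetical cut-off: if more than half as many lines changed as the
# original contains, treat the update as a new article.
MAX_DIFF_RATIO = 0.5


def changed_line_count(original: str, updated: str) -> int:
    """Count added and removed lines in a unified diff of the two texts."""
    diff = list(difflib.unified_diff(
        original.splitlines(), updated.splitlines(), lineterm=""))
    # The first two lines are the '---' / '+++' file headers; skip them.
    body = diff[2:]
    return sum(1 for line in body if line.startswith(("+", "-")))


def needs_new_fingerprint(original: str, updated: str) -> bool:
    """True if the diff is large relative to the original article."""
    total = max(len(original.splitlines()), 1)  # avoid dividing by zero
    return changed_line_count(original, updated) / total > MAX_DIFF_RATIO
```

A small correction to a long article stays under the original fingerprint, while a wholesale rewrite crosses the threshold and would get a fingerprint of its own.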
Imran Ghory