gnunet-developers

[GNUnet-developers] more efficient sharing of similar data


From: Martin Uecker
Subject: [GNUnet-developers] more efficient sharing of similar data
Date: Sat, 17 Aug 2002 13:25:07 +0200
User-agent: Mutt/1.4i

Hi all,

Christian suggested that I move the discussion here.

> > > Split files not at fixed boundaries but where a rolling checksum
> > > hits a certain value (see rsyncable gzip). This could greatly
> > > increase sharing of blocks from similar files, which could be
> > > very useful for some applications. I have some ideas how this idea
> > > could be modified to get pieces which are nearly 1k.
> >
> > Interesting idea. It would definitely make the encoding code *much* more
> > complicated, though. But otherwise workable. If you want to contribute code
> > I would suggest adding an additional pair of protocol numbers (and a
> > different root-node-type) to the code such that the existing (stable)
> > encoding can continue to exist while we play with this idea.
> 
> I am not sure this is necessary. Like ECC, this could also be implemented
> externally, but it would make sense to integrate it into the client to
> get it actually used.
> 
> | Well, I doubt that the current on-demand encoding (given a hash,
> | find 1k in a file, encrypt that and send it back) would work in that
> | case. 

In the preprocessing case, each place where the rolling checksum
hits the magic value would be moved (by padding) to a 1k boundary.
This scheme would work without modifying a single line of code in
GNUnet.
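The splitting idea itself can be sketched as follows. This is a minimal illustration, not GNUnet or rsync code: the window size, the additive checksum, and the 10-bit mask are stand-in choices I picked so that chunks average roughly 1k. The point it demonstrates is that an insertion in the middle of a file only disturbs the boundaries near the edit, so most blocks of two similar files come out identical and can be shared.

```python
import random

WINDOW = 48            # sliding-window size -- an illustrative choice
MASK = (1 << 10) - 1   # cut where the low 10 bits vanish -> ~1k average chunks

def chunks(data: bytes):
    """Split data wherever a rolling checksum over the last WINDOW
    bytes hits the magic value (here: all masked bits zero)."""
    out, s, start = [], 0, 0
    for i, b in enumerate(data):
        s += b                      # add the newest byte to the window sum
        if i >= WINDOW:
            s -= data[i - WINDOW]   # drop the byte that left the window
        if (s & MASK) == 0:         # checksum hit the magic value
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])    # trailing remainder
    return out

# Two similar files: the second has a few bytes inserted in the middle.
rng = random.Random(1)
original = bytes(rng.randrange(256) for _ in range(20000))
edited = original[:9000] + b"a few inserted bytes" + original[9000:]

# Boundaries resynchronize shortly after the edit, so many chunks match.
shared = set(chunks(original)) & set(chunks(edited))
```

With fixed 1k boundaries, every block after the insertion point would shift and nothing past the edit could be shared; here the boundary decisions depend only on local content, so the chunk streams realign within one window length.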

Integrating it into the client has the advantage that the user
can serve unmodified files (this is an argument for putting ECC
into the client too). But the index where the hashes are stored
must be extended with information about where each block starts
(10 bits of additional information for each 1k block).
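The extended index entry might look like this. The helper names and the right-alignment convention are my own invention for illustration, assuming each (non-empty) variable-size piece has already been capped at 1k; the one thing the sketch pins down is that the start offset within a 1k block fits in 10 bits.

```python
BLOCK = 1024  # fixed block size of the existing encoding

def pack(piece: bytes):
    """Pad a non-empty variable-size piece (at most BLOCK bytes) into a
    fixed 1k block. The returned start offset (0..1023, i.e. 10 bits)
    is the extra per-block information the index would have to store."""
    assert 0 < len(piece) <= BLOCK
    start = BLOCK - len(piece)           # right-align the real data
    return b"\x00" * start + piece, start

def unpack(block: bytes, start: int) -> bytes:
    """Recover the original piece using the 10-bit start offset."""
    return block[start:]

block, start = pack(b"some variable-size piece")
```

On-demand encoding then still works on uniform 1k blocks; only the lookup index grows by 10 bits per block.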

> | Also I'm not sure how tightly you can fit all blocks into the 1k
> | scheme (which would of course be best since you gain anonymity if all
> | blocks are truely uniform in size). 

Yes, this is critical for efficiency too.

I have to think about the idea some more but I will provide
patches to play with next month.


bye,
Martin





