guix-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug#52555] [RFC PATCH 0/3] Decentralized substitute distribution with E


From: pukkamustard
Subject: [bug#52555] [RFC PATCH 0/3] Decentralized substitute distribution with ERIS
Date: Thu, 23 Dec 2021 11:42:46 +0000

Hi Ludo,

Thanks for your comments!

Ludovic Courtès <ludo@gnu.org> writes:

>> StorePath: /gnu/store/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> URL: nar/gzip/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> Compression: gzip
>> FileSize: 67363
>> ERIS: 
>> urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE
>> URL: nar/zstd/81bdcd5x4v50i28h98bfkvvkx9cky63w-hello-2.10
>> Compression: zstd
>> FileSize: 64917
>> ERIS: 
>> urn:erisx2:BIBO7KS7SAWHDNC43DVILOSQ3F3SRRHEV6YPLDCSZ7MMD6LZVCHQMEQ6FUBTJAPSNFF7XR5XPTP4OQ72OPABNEO7UYBUN42O46ARKHBTGM
>
> Do we really need one URN per compression method?  Couldn’t we leave
> compression (of individual chunks, possibly) as a “detail” handled by
> the encoding or the transport layer?
>

I agree that it would be nice to leave this to the encoding layer as
that would allow certain optimizations (e.g. de-duplication).

Unfortunately, we haven't figured out yet what the most suitable
compression/format would be. Something like EROSFS seems good (as it
aligns data to fixed block sizes) [1]. But this seems a bit "clunky" for
just an archive format and there do not seem to be any libraries that we
could use to neatly integrate. It seems possible to block-align a Tar
archive, but that seems a bit hackey [2]. Other things to look into
might be Tarlz [3] and ZPAQ [4].

To get started I suggest just using one of the compressions/formats
already in Guix. zstd seems to be a reasonable choice (for the same
reasons why it makes sense to use zstd with `--discover` [5]).

Does that sound like a plan?

[1] https://inqlab.net/git/guile-eris.git/tree/examples/dedup-fs/Readme.org
[2] 
https://unix.stackexchange.com/questions/276908/make-tar-or-other-archive-with-data-block-aligned-like-in-original-files-for/279384#279384
[3] http://lzip.nongnu.org/tarlz.html
[4] http://mattmahoney.net/dc/zpaq.html
[5] https://guix.gnu.org/en/blog/2021/getting-bytes-to-disk-more-quickly/

>> If the `--ipfs` is used for `guix publish` then the encoded blocks are also
>> uploaded to the IPFS daemon. The nar could then be retrieved from anywhere 
>> like
>> this:
>>
>> (use-modules (eris)
>>           (eris blocks ipfs))
>>
>> (eris-decode->bytevector
>>  
>> "urn:erisx2:BIBC2LUTIQH43S2KRIAV7TBXNUUVPZTMV6KFA2M7AL5V6FNE77VNUDDVDAGJUEEAFATVO2QQT67SMOPTO3LGWCJFU7BZVCF5VXEQQW25BE"
>>  eris-blocks-ipfs-ref)
>>
>> These patches do not yet retrieve content from IPFS (TODO). But in principle,
>> anybody connected to IPFS can get the nar with the ERIS URN. This could be 
>> used
>> to reduce load on substitute server as they would only need to publish the 
>> ERIS
>> URN directly - substitutes could be delivered much more peer-to-peer.
>
> Nice.  So adjusting ‘guix substitute’ should be relatively easy?

Yes, relatively! :)

I meant to send in a V2 that does this before going on holidays, but I'm
afraid I won't make it. V2 will come in early January!

>> Other transports that I have been looking in to and am pretty sure will work
>> include: HTTP (with RFC 2169 [3]), GNUNet, OpenDHT. This is, imho, the
>> advantage of ERIS over IPFS directly or GNUNet directly. The encoding and
>> identifiers (URN) are abstracted away from specific transports (and also
>> applications). ERIS is almost exactly the same encoding as used in GNUNet
>> (ECRS).
>
> As a first step, ‘guix publish’ could implement RFC 2169, too.
>
> I gather implementing the HTTP and IPFS backends in ‘guix substitute’
> should be relatively easy, right?

Yes, those seem to be the two easiest backends to implement.

>> A tricky things is figuring out how to multiplex all these different
>> transports and storages...
>
> Yes.  We don’t know yet what performance and data availability will be
> like on IPFS, for instance, so it’s important for users to be able to
> set priorities.  It’s also important to gracefully fall back to direct
> HTTP downloads when fancier p2p methods fail, regardless of how they
> fail.

Agree.

Thanks,
-pukkamustard






reply via email to

[Prev in Thread] Current Thread [Next in Thread]