[Arx-users] Re: [Gnu-arch-users] RFC: arch protocol, smart server, and t


From: Aaron Bentley
Subject: [Arx-users] Re: [Gnu-arch-users] RFC: arch protocol, smart server, and tla implementation prototypes
Date: 30 Jan 2004 17:02:25 -0500

On Fri, 2004-01-30 at 15:46, Tom Lord wrote:
>     > From: Colin Walters <address@hidden>
> 
>     >>  (2) design your protocols so they can be streamed
>     >> (instead of a series of short operation-replies, allow a bunch
>     >> of operations to be sent as a batch, with well-defined behavior
>     >> in the case of mid-batch failure).
> 
>     > What commands specifically do you see being used in this
>     > context?  The only one that comes to mind at the moment is
>     > making 'abrowse' more efficient.  Maybe more will arise though
>     > when I get to committing and such.
> 
> 
> I'm not so sure that general purpose streaming is a particularly useful
> strategy for most arch operations.
> 
> In general, streaming is going to require the client-side application
> logic to generate lots of requests before consuming the data they
> return.  That's awkward, client-side, and there's only a few special
> cases where it would be worthwhile.  More importantly: I think that
> those special cases will most often have better solutions than
> (literal) streaming.

I disagree: solving the latency problem for archd is nice, but solving
it for all protocols is nicer.  And I think I'll get close to that
functionality in the backwards-builder anyhow.

Making build_revision work backwards looks relatively easy.  Just find
out how many revisions away you have a library or a cacherev, and build
in that direction.

Crossing tag boundaries will make the problem harder.  While tla
implicitly uses the call stack to build forwards, crossing tag
boundaries requires tla to map out several paths, and determining the
best one will require a cost assessment based on

1. aggregate download size
2. download cost for a given archive

Once you know all this, it's not much of a leap to specify the exact
revisions desired.  Then they can be retrieved through streaming,
multiple connections, etc.
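
To make the cost assessment concrete, here is a minimal sketch in Python
(not tla code).  It assumes each candidate path is just a list of
changesets to fetch, and that each archive has a known relative cost per
byte.  The names Step, path_cost and cheapest_path are hypothetical, and
so are the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One changeset that must be downloaded to follow a path (hypothetical)."""
    revision: str
    archive: str
    size_bytes: int


def path_cost(path, cost_per_byte):
    """Aggregate download size, weighted by the cost of each archive."""
    return sum(step.size_bytes * cost_per_byte[step.archive] for step in path)


def cheapest_path(paths, cost_per_byte):
    """Pick the candidate path with the lowest weighted download cost."""
    return min(paths, key=lambda path: path_cost(path, cost_per_byte))


# Two made-up candidates: build forwards from a slow remote archive, or
# backwards from a larger but much cheaper local mirror.
forwards = [Step("patch-1", "remote", 4000), Step("patch-2", "remote", 6000)]
backwards = [Step("patch-5", "mirror", 20000), Step("patch-4", "mirror", 15000)]
costs = {"remote": 10, "mirror": 1}

best = cheapest_path([forwards, backwards], costs)
```

With these figures the backwards path wins despite the larger download,
because the mirror is assumed to be ten times cheaper per byte.  Once
`best` is known, the exact list of revisions to request is simply
`[step.revision for step in best]`.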

(Of course, I intend to crawl before I walk)

The archd pfs can merge the requests for "patch-1, patch-2, patch-3"
into "delta patch-1 patch-3" with little difficulty.  Since pfs-archd
will be alone in supporting this functionality, it makes sense to
special-case for archd instead of special-casing the current supported
protocols.
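
A rough sketch of that merging step, in Python rather than C.  The
"delta" spelling follows the example above; the "get" command and the
function name coalesce are assumptions made up for illustration, not
part of any existing pfs-archd.

```python
import re

_PATCH = re.compile(r"^patch-(\d+)$")


def coalesce(requests):
    """Merge runs of consecutive patch-N requests into single delta commands."""
    commands = []
    run = []  # patch numbers in the current consecutive run

    def flush():
        # Emit whatever run has accumulated, then start a fresh one.
        if len(run) == 1:
            commands.append(f"get patch-{run[0]}")
        elif run:
            commands.append(f"delta patch-{run[0]} patch-{run[-1]}")
        run.clear()

    for name in requests:
        match = _PATCH.match(name)
        if match is None:
            # Not a patch level (e.g. base-0): pass it through unmerged.
            flush()
            commands.append(f"get {name}")
            continue
        number = int(match.group(1))
        if run and number != run[-1] + 1:
            flush()
        run.append(number)
    flush()
    return commands


merged = coalesce(["patch-1", "patch-2", "patch-3"])
mixed = coalesce(["base-0", "patch-1", "patch-2", "patch-5"])
```

Here `merged` is `["delta patch-1 patch-3"]`, while `mixed` keeps the
non-consecutive requests separate.  The dumb-server backends would never
see the merged form; they would keep issuing one fetch per revision.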

The other advantage of allowing high performance with dumb servers is
that smart servers may run out of CPU time and I/O bandwidth before they
run out of network throughput.

>     If from-revision is not * or
>     is not the immediate ancestor of to-revision, then implementations
>     MAY instead return an error.

I don't believe this provision is required.  Instead:

>     If from-revision is not * or is not the immediate ancestor of
>     to-revision, the server may return more than one changeset.
>     The composition of the changesets returned describes the
>     differences between the two revisions.

Hmm.  Looks awfully like streaming to me.

 
>     The client MAY include a Parts-limit header containing a single,
>     positive integer.   The server MUST NOT reply with a greater
>     number of changesets than that.

I don't understand the motivation here.  Is it to avoid biting off more
than you can chew, bandwidth or storage-wise?  (If so, wouldn't an
Aggregate-size-limit header be better?)

> Client-side, this can be provided by a new function added to the
> archive.h vtable.
> 
>       arch_archive_delta ([...])
> 
> Unlike the wire protocol, arch_archive_delta will always return a
> _single_ changeset.
> 
> We'll need a (client-side) function:
> 
>       arch_compose_changesets
> 
> which can compose two changesets if they have the property that after
> the successful application of the first to any tree, the second will
> cleanly apply as well.
> 
> archive-walter.c can use arch_compose_changesets to assemble the
> "parts" that it gets from the server.

> Minimally, archive-pfs.c can always return 0 (no changeset) for
> arch_archive_delta.

pfs-sftp, pfs-http, pfs-dav etc. can use arch_compose_changesets plus
streaming or multiple simultaneous connections to implement
arch_archive_delta.

I'm not sure that there's any value in composing the changesets before
applying them, though.  It would probably be better to say
arch_archive_delta can return any number of changesets, and apply them
directly.
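
A toy illustration of why composing first buys nothing for the result.
This is Python with a deliberately simplified model: a tree is a dict of
path to content, and a changeset is a dict of path to new content (None
means delete).  Real arch changesets also carry renames, permissions and
patches, so treat this purely as a sketch; all three function names are
made up.

```python
def apply_changeset(tree, changeset):
    """Return a new tree with one (toy) changeset applied."""
    result = dict(tree)
    for path, content in changeset.items():
        if content is None:
            result.pop(path, None)   # deletion
        else:
            result[path] = content   # add or modify
    return result


def apply_all(tree, changesets):
    """Apply any number of changesets directly, in order."""
    for changeset in changesets:
        tree = apply_changeset(tree, changeset)
    return tree


def compose(first, second):
    """Toy composition: the later changeset wins for each path."""
    composed = dict(first)
    composed.update(second)
    return composed


tree = {"README": "v1"}
patch_1 = {"hello.c": "int main(void);"}
patch_2 = {"README": None, "hello.c": "int main(void) { return 0; }"}

direct = apply_all(tree, [patch_1, patch_2])
via_composition = apply_changeset(tree, compose(patch_1, patch_2))
```

In this model `direct` and `via_composition` are the same tree, so the
only question is whether the extra composition pass costs more than it
saves.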

> The strategy used by `update' can be modified to attempt to use
> arch_archive_delta in preference to a `replay'-like strategy of 
> reading several changesets separately.

Yes, the update code actually has provisions for applying a delta right
now.

> Other merge commands can take good advantage of
> arch_compose_changesets as well.

If you're thinking of tla delta, I suspect that compose-changesets will
be implemented in terms of replay and make-changeset, not the other way
around.

> The advantage of this approach over streaming is that it can be
> implemented in two ways (or a mix of two ways): Changeset composition
> can take place either client-side or server-side.

But why would we want to merge changesets on the client side before
applying them?

> In
> both cases, it seems useful to me to make ambiguous whether certain
> computations take place server-side or client-side and so (literal)
> streaming is not the right answer.

I think that archd might well be implemented at a higher level.  It
could conceivably use sftp, http, etc as backends.

Aaron

-- 
Aaron Bentley
Director of Technology
PanoMetrics, Inc.




