Re: Update on bordeaux.guix.gnu.org


From: Ludovic Courtès
Subject: Re: Update on bordeaux.guix.gnu.org
Date: Sun, 28 Nov 2021 18:26:05 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hello,

Christopher Baines <mail@cbaines.net> skribis:

> I've been doing some performance tuning: submitting builds is now more
> parallelised, a source of slowness when fetching builds has been
> addressed, and one of the long queries involved in allocating builds
> has been removed, which also improved handling of the WAL (the SQLite
> write-ahead log).
>
> There are also a few new features. Agents can be deactivated, which
> means they won't get any builds allocated. The coordinator now checks
> the hashes of submitted outputs, a safeguard I added because the
> coordinator now also supports resuming the upload of outputs. This is
> particularly important when uploading large (> 1GiB) outputs over slow
> connections.
>
> I also added a new x86_64 build machine. It's a 4-core Intel NUC that
> I had sitting around, but I cleaned it up and got it building things.
> This was particularly useful, as I was able to use it to retry
> building guile@3.0.7, which is extremely hard to build [2] and had
> been blocking the building of the channel instance derivations for
> x86_64-linux.
>
> 2: 
> https://data.guix.gnu.org/gnu/store/7k6s13bzbz5fd72ha1gx9rf6rrywhxzz-guile-3.0.7.drv

Neat!  (Though I wouldn’t say building Guile is “extremely hard”,
especially on x86_64.  :-))  The ability to keep retrying is most
welcome.
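
For anyone curious about the WAL handling mentioned above, here is a
minimal Python sketch of the SQLite behaviour being described (purely
illustrative, not the coordinator's own code): long-running readers pin
old WAL frames, so removing the long query lets checkpoints actually
shrink the log.

    import sqlite3

    conn = sqlite3.connect("coordinator.db")

    # Writers append to the write-ahead log instead of rewriting the
    # main database file, so readers and writers don't block each other.
    conn.execute("PRAGMA journal_mode=WAL")

    # A long-running read transaction keeps old WAL frames alive, so the
    # log keeps growing; once it finishes, a checkpoint can fold the WAL
    # back into the database and truncate it.
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")

    conn.close()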
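
Similarly, a rough Python sketch of the resumable-upload safeguard
described above (hypothetical names, and SHA-256 is only illustrative;
this is not the coordinator's actual protocol): an interrupted upload
resumes from the offset already stored, and the completed output is
re-hashed and compared against the hash the agent reported, so a
corrupted or mismatched resume cannot slip through.

    import hashlib
    import os

    def append_chunk(path, chunk, offset):
        """Resume an upload by appending CHUNK at byte OFFSET."""
        stored = os.path.getsize(path) if os.path.exists(path) else 0
        if stored != offset:
            raise ValueError(f"offset {offset} != {stored} bytes stored")
        with open(path, "ab") as f:
            f.write(chunk)

    def verify_output(path, expected_sha256):
        """Check the fully uploaded output against the expected hash."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                h.update(block)
        return h.hexdigest() == expected_sha256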

> On the related subject of data.guix.gnu.org (which is the source of
> derivations for bordeaux.guix.gnu.org, as well as a recipient of build
> information), there have been a couple of changes. There was some web
> crawler activity that was slowing data.guix.gnu.org down significantly;
> nginx now has some rate-limiting configuration to prevent crawlers from
> abusing the service. The other change is that substitutes for the
> latest processed revision of master will be queried on a regular basis,
> so this page [3] should be roughly up to date, including for
> ci.guix.gnu.org.
>
> 3: 
> https://data.guix.gnu.org/repository/1/branch/master/latest-processed-revision/package-substitute-availability

That’s good news.  That also means that things like
<https://data.guix.gnu.org/repository/1/branch/master/latest-processed-revision/package-reproducibility>
should be more up to date, which is really cool!  This can have a
drastic impact on how we monitor and address reproducibility issues.
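
As an aside, rate limiting of the kind mentioned above is typically done
with nginx's limit_req module; the snippet below uses illustrative
values and a placeholder backend port, and is not the actual
data.guix.gnu.org configuration.

    # Goes in the http block: track clients by address, allow ~2 req/s.
    limit_req_zone $binary_remote_addr zone=perip:10m rate=2r/s;

    server {
        server_name data.guix.gnu.org;

        location / {
            # Permit short bursts, reject the excess instead of queueing.
            limit_req zone=perip burst=10 nodelay;
            limit_req_status 429;
            # Proxy to the data service (port is a placeholder).
            proxy_pass http://127.0.0.1:8765;
        }
    }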

> Now for some not so good things:
>
> Submitting builds wasn't working quite right for around a month: one
> of the changes I made to speed things up led to some builds being
> missed. This is now fixed, and all the missed builds have been
> submitted, but this amounted to more than 50,000 builds. This, along
> with all the channel instance derivation builds that can now proceed,
> means that there's a very large backlog of x86 and ARM builds, which
> will probably take at least another week to clear. While this backlog
> exists, substitute availability for x86_64-linux will be lower than
> usual.

At least it’s nice to have a clear picture of which builds are missing,
how much of a backlog we have, and what needs to be rebuilt.

> Space is running out on bayfront, the machine that runs the
> coordinator, stores all the nars and build logs, and serves the
> substitutes. I knew this was probably going to be an issue, since
> bayfront didn't have much space to begin with, but I had hoped I'd be
> further along in developing some way to move the nars around between
> multiple machines, to remove the need to store all of them on
> bayfront. I have a plan (there are some ideas I mentioned back in
> February [4]), but I haven't got around to implementing anything yet.
> The disk space usage trend is pretty much linear, so if things
> continue without any change, I think it will be necessary to pause the
> agents within a month to avoid filling up bayfront entirely.

Ah, bummer.  I hope we can find a solution one way or another.
Certainly we could replicate nars on another machine with more disk,
possibly buying the necessary hardware with the project funds.

Thanks for the update!

Ludo’.


