Re: Solstice infrastructure hackathon

From: Christopher Baines
Subject: Re: Solstice infrastructure hackathon
Date: Fri, 17 Dec 2021 01:57:39 +0000
User-agent: mu4e 1.6.6; emacs 27.2

Ludovic Courtès <> writes:

> We were unlucky enough that it happened days after the other build farm,
>, ran out of disk space and had its CI stopped,
> right before the big merge—so it doesn’t have substitutes for current
> master.

Builds effectively stopped on the 29th of November, which is more than a
few days I'd say, although this is maybe not the biggest issue. Since
the build coordinator instance behind wasn't
building things from core-updates-frozen prior to the merge, even if
builds hadn't stopped due to the space issues on bayfront, it still
wouldn't have had many substitutes.

As part of testing patches and branches [1], I think it would be good to
get builds for things like core-updates-frozen happening, which will
hopefully improve the substitute availability from
on average.


>   • Add DNS redundancy for so it can point to one of two
>     hosts (need to figure out certbot challenges so both machines can
>     update their certificates).

This (in general) is something I'm interested in working out, since
it'll be helpful for setting up mirrors for substitutes as well (in the
case where you want the mirrors to respond to one common DNS name with
working TLS).

>   • Come up with a plan to add disks to the RAID array on bayfront, the
>     head node of

The space issue on bayfront that led to builds not happening has now
been effectively resolved (see [2]). There's definitely lots of tidying
up to do, but I think the situation for storing the nars is much better.


That's not to say there's nothing to be gained by upgrading the
bayfront hardware; some SSD storage would be ideal to speed up the
coordinator and builds.

>   • Work on a plan to mirror nars from ci.guix and bordeaux.guix, using
>     plain rsync or <>.

I'm interested in getting in to a state where
there's less of a discrepancy in performance depending on where in the
world it's accessed from. I'm assuming there is some difference in
performance, and checking that assumption is one part of the problem.
If it turns out there are some gains to be had, the
next step is investigating how this could be approached. Mirrors plus
GeoIP based DNS is the approach I currently have in mind.
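
To make the "assumption to check" a bit more concrete, here is a minimal
sketch of how the fetch time could be measured from a given location. It
is only an illustration: the function name, the choice of timing a full
download, and the number of attempts are all my assumptions, and the URL
to test is left for whoever runs it. Running the same script from
machines in different regions and comparing the medians would give a
first rough idea of the discrepancy.

```python
import statistics
import time
import urllib.request


def time_fetch(url, attempts=3, opener=urllib.request.urlopen):
    """Return the median wall-clock seconds taken to fetch `url` in full.

    Hypothetical helper, not part of any Guix tooling. `opener` can be
    swapped out so the function can be exercised without network access.
    """
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Read the whole body so the transfer time is included,
        # not just the time to receive the response headers.
        with opener(url) as response:
            response.read()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

The median is used rather than the mean so that one slow attempt does
not dominate the result.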

Anyway, even if there isn't a meaningful performance difference, maybe
it's worth setting up distributed mirrors for reliability.

>   • Have a documented procedure to set up substitute mirrors, such as
>     the one in .cn (I can’t find the URL), ideally with plain rsync
>     access.

Getting the nar-herder into a state where other people might be able to
use it is definitely on my list of things to do. I'm assuming here that
it's something that people might want to use, and again that's probably
worth investigating. If it turns out that people just want to use rsync,
it's probably worth assisting with getting that kind of setup working.

> Who’s in?  :-)

Not sure how much time I'll have, but I'll try to be around :)



