bug-guix
From: Mark H Weaver
Subject: bug#35181: Hydra offloads often get stuck while exporting build requisites
Date: Tue, 09 Apr 2019 14:09:41 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Hi Ludovic,

Ludovic Courtès <address@hidden> writes:

> The problem is that this is an ancient Guix.  In the meantime,
> offloading has seen relevant changes, in particular things like commit
> ed7b44370f71126087eb953f36aad8dc4c44109f which address stability issues
> with Guile-SSH (ssh dist node) that was previously used.
>
> I think we should upgrade Guix on hydra.gnu.org otherwise we’re likely
> to end up chasing old bugs.

Sure, that makes sense.  I also noticed the old Guix after writing my
last messages, so yesterday I tried updating Hydra's Guix to 0.16.0-11,
which at the time was the latest version built by Hydra.  After
updating, I quit and relaunched 'guix-daemon', as well as 'guix
publish', hydra-queue-runner, and hydra-evaluator.

With the new version of Guix, *all* offloads started failing in a
strange way: each one got stuck in a loop, endlessly printing messages
like these:

  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/0'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/0'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/1'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/2'
  process N acquired build slot '/var/guix/offload/hydra.gnunet.org/0'

This is from memory because after killing the queue-runner and
cancelling the 'mozjs-60' jobs (which I had intended to start building
as a test), the nix output above is no longer visible on those pages,
and I'm not sure offhand where to look for it.

Anyway, in every offloaded build, it printed a line like the above every
few seconds, with the build slot number at the end varying.  I don't
remember if the process number varied.

This reminds me that I also ran into difficulties updating 'guix' on the
armhf build slaves, which are also currently stuck on an even more
ancient version of Guix (circa 0.12.0).

On both Hydra and its armhf build slaves, Guix is installed on top of a
Debian derivative, and both 'guix' and 'guix-daemon' are launched from
an environment without any Guix environment variable settings.  This
apparently works in ancient versions of Guix, but not recent ones.

So, could the problem simply be that the 'guix' wrapper is not
installing enough environment variable settings for offloading to work?
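
One way to test this hypothesis would be to look at what a running
process actually has in its environment.  The following is only a
minimal sketch, assuming a Linux /proc filesystem; the helper names and
the choice of the "GUIX_" and "GUILE_" prefixes are my own assumptions
about which variables are worth inspecting, not something taken from
the Guix sources:

```python
import sys

# Assumption: variables of interest start with one of these prefixes
# (e.g. GUIX_LOCPATH, GUILE_LOAD_PATH).  Adjust as needed.
PREFIXES = ("GUIX_", "GUILE_")


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated NAME=VALUE entries of /proc/PID/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip empty or malformed entries
        name, _, value = entry.partition(b"=")
        env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def guix_related(env: dict) -> dict:
    """Keep only the variables whose names start with one of PREFIXES."""
    return {k: v for k, v in env.items() if k.startswith(PREFIXES)}


def read_process_environ(pid: int) -> dict:
    """Read the environment of a running process (needs permission)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())


def main(argv: list) -> None:
    # Usage: python3 check_env.py PID  (hypothetical script name)
    if len(argv) == 2 and argv[1].isdigit():
        found = guix_related(read_process_environ(int(argv[1])))
        if not found:
            print("no GUIX_*/GUILE_* variables set")
        for name, value in sorted(found.items()):
            print(f"{name}={value}")


if __name__ == "__main__":
    main(sys.argv)
```

Running it as root with the PID of 'guix-daemon' (or of a stuck
'guix offload' process) and comparing the result with the output for a
login shell where 'guix' works would show whether any such settings are
missing.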

        Mark




