guix-devel

Parallel guix builds can trample?


From: Phil
Subject: Parallel guix builds can trample?
Date: Tue, 11 Jan 2022 21:26:31 +0000
User-agent: mu4e 1.4.15; emacs 27.2

Hi all,

I'm seeing strange issues when running many "guix build" commands of the
same package simultaneously under the same Linux account on the same
server.

Tracing this requires a lot of detective work, so for now I've omitted
the logs - my initial question is - am I doing something which is
obviously not going to work with Guix, or would people expect the below
use-case to work?

In my example I have ~12 clones of my private guix channel, sat in their
own directories under my user account.

I update the same single package on each channel locally with a
different git commit id and package version.  So 12 variations of the
same package, accessing different commit ids, in the same package's
source repo, and having different package versions in Guix.

I then call "guix build -K -L /path/to/each/local/clone package-name" 12
times, once for each clone; these run in parallel shell sessions.
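
For anyone wanting to reproduce the setup, here is a minimal sketch of how
the builds could be launched, assuming Python is available on the build
host.  The function names, the log directory and the one-log-per-clone
layout are illustrative assumptions, not a description of our actual
automation; only the "guix build -K -L" invocation comes from the
description above.

```python
import subprocess
from pathlib import Path


def build_command(clone_dir, package):
    """Return the argv for one build, mirroring the command quoted above."""
    return ["guix", "build", "-K", "-L", str(clone_dir), package]


def launch_all(clone_dirs, package, log_dir, make_command=build_command):
    """Start one build per clone without waiting, then collect exit codes.

    Each build writes stdout and stderr to its own log file, so output
    from different clones cannot interleave.  The log naming scheme
    (<clone directory name>.log) is an assumption for illustration.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    running = []
    for clone in clone_dirs:
        log_file = open(log_dir / (Path(clone).name + ".log"), "wb")
        proc = subprocess.Popen(make_command(clone, package),
                                stdout=log_file, stderr=subprocess.STDOUT)
        running.append((str(clone), proc, log_file))
    # Only start waiting once every build has been launched.
    results = {}
    for clone, proc, log_file in running:
        results[clone] = proc.wait()
        log_file.close()
    return results
```

Keeping one log per clone makes it easier to line up each reported failure
with the commit id that clone was supposed to build.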

This is all automated, so each build is started within a fraction of a
second - trying to reproduce this issue by hand is proving difficult,
and the issue is still sporadic in the automated system.

The logs all show that each of the ~12 channels receives a local update
with a unique and valid commit id and version for the changed package.

However what I'm seeing is that some of the builds are failing with the
error messages of other commit ids!

To be clear - the failures themselves are not the surprise here.  The
surprise is that the failures are being tied to the wrong commit ids,
ones which do not contain the failures reported.  The builds appear to
be getting mixed up, with some being trampled by what looks like either
a race condition or stale state.

Use of "--keep-failed" means I can prove this unequivocally: the source
code saved under /tmp/package-name.version.drv.0 does not match the
source of the commit id stated within the updated package, which I can
see in the logs.  I can also show that the source code exactly matches
one of the other commit ids, belonging to one of the other clones, which
was expected to fail.
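
The comparison described above can be sketched roughly as follows.  This
is only an illustration: the function names are hypothetical, and it
assumes the kept directory holds an unmodified copy of the source (a
partially run build may leave generated files behind, in which case the
digests would differ even for the right commit).

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 hex digest over relative paths and file contents."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so traversal order is deterministic; skip git metadata.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest.update(os.fsencode(rel) + b"\0")
            if os.path.islink(path):
                # Hash the link target rather than following the link.
                digest.update(os.fsencode(os.readlink(path)))
            else:
                with open(path, "rb") as handle:
                    for chunk in iter(lambda: handle.read(65536), b""):
                        digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


def matching_checkouts(kept_dir, checkouts):
    """Return the commit ids whose checkout has the same digest as kept_dir.

    `checkouts` maps a commit id to the path of a checkout of that commit.
    """
    kept = tree_digest(kept_dir)
    return [commit for commit, path in checkouts.items()
            if tree_digest(path) == kept]
```

Calling matching_checkouts with the kept source directory and a mapping
of every candidate commit id to its checkout would list which commit the
failed build actually used.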

This happens sporadically, but we can reproduce the issue several times
a day.

I don't yet have 100% proof that the issue is happening inside Guix, but
I've ruled out most (but not all) other causes so far, and the
--keep-failed evidence, together with logs showing correct inputs, is
quite telling.

Have any other bugs like this ever been reported that people are aware
of?  Have other people run multiple builds in Guix under one account at
the same time without issue?  Any advice on how to trap the issue, given
it's hard to reproduce?

The problem has never been seen when we run each guix build serially.

Apologies for the long and rather circumstantial e-mail!

Cheers,
Phil.


