branch master updated: website: build-vm: Tweak.
From: Ludovic Courtès
Subject: branch master updated: website: build-vm: Tweak.
Date: Mon, 11 Mar 2024 11:39:22 -0400
This is an automated email from the git hooks/post-receive script.
civodul pushed a commit to branch master
in repository guix-artwork.
The following commit(s) were added to refs/heads/master by this push:
new 3101f16 website: build-vm: Tweak.
3101f16 is described below
commit 3101f165670d0361ba85c218029c70d25b33b313
Author: Ludovic Courtès <ludo@gnu.org>
AuthorDate: Mon Mar 11 16:38:39 2024 +0100
website: build-vm: Tweak.
* website/drafts/build-vm.md: Improve wording and examples.
---
website/drafts/build-vm.md | 98 +++++++++++++++++++++++++++-------------------
1 file changed, 58 insertions(+), 40 deletions(-)
diff --git a/website/drafts/build-vm.md b/website/drafts/build-vm.md
index fd6aadc..f6ca6ea 100644
--- a/website/drafts/build-vm.md
+++ b/website/drafts/build-vm.md
@@ -12,7 +12,7 @@ and that does a good job at ensuring [reproducible
builds](https://reproducible-builds.org/docs/definition/), right?
Well, in hindsight, we can tell you: it’s more challenging than it
-seems. Users attempting traveling 5 years back with `guix time-machine`
+seems. Users attempting to travel 5 years back with `guix time-machine`
are (or *were*) unavoidably going to hit bumps on the road—a real
problem because that’s one of the use cases Guix aims to support well,
in particular in a [reproducible
@@ -37,9 +37,18 @@ our goal is to allow users to travel as far back as 1.0.0
and redeploy
software from there, as in this example:
```
-guix time-machine -q --commit=v1.0.0 -- install python2
+$ guix time-machine -q --commit=v1.0.0 -- \
+ environment --ad-hoc python2 -- python
+> guile: warning: failed to install locale
+Python 2.7.15 (default, Jan 1 1970, 00:00:01)
+[GCC 5.5.0] on linux2
+Type "help", "copyright", "credits" or "license" for more information.
+>>>
```
+(The command above uses `guix environment`, the [predecessor of `guix
+shell`](https://guix.gnu.org/en/blog/2021/from-guix-environment-to-guix-shell/),
+which didn’t exist back then.)
It’s only 5 years ago but it’s pretty much remote history on the scale
of software evolution. How well does such a command work? Well, it
depends.
@@ -54,17 +63,18 @@ quickly have your software environment at hand.
# Bumps on the build road
Things get more complicated when targeting a period in time for which
-substitutes are no longer available, such as `v1.0.0` above. (And
-really, we should assume that substitutes won’t remain available
+substitutes are no longer available, as was the case for `v1.0.0` above.
+(And really, we should assume that substitutes won’t remain available
forever: fellow NixOS hackers recently had to seriously consider
[trimming their 20-year-long history of
substitutes](https://discourse.nixos.org/t/nixos-s3-long-term-resolution-phase-1/36493)
because the costs are not sustainable.)
-The obvious first problem that arises in the absence of substitutes is
-source code unavailability. I’ll spare you the details for this
-post—that problem alone would deserve a book. Suffice to say that we’re
-lucky that we started working on [integrating Guix with Software
+Apart from the long build times, the first problem that arises in the
+absence of substitutes is source code unavailability. I’ll spare you
+the details for this post—that problem alone would deserve a book.
+Suffice to say that we’re lucky that we started working on [integrating
+Guix with Software
Heritage](https://guix.gnu.org/en/blog/2019/connecting-reproducible-deployment-to-a-long-term-source-code-archive/)
years ago, and that there has been great progress over the last couple
of years to get closer to [full package source code
@@ -87,20 +97,20 @@ address.
Among those, the most frequent problem is *time traps*: software build
processes that fail after a certain date (these are also referred to as
“time bombs” but we’ve had enough of these and would rather call for a
-ceasefire). This plagues less than 1% of the package collection, but
+ceasefire). This plagues a handful of packages out of almost 30,000 but
unfortunately we’re talking about packages deep in the dependency graph.
Here are some examples:
- - [OpenSSL](https://issues.guix.gnu.org/56137) test suite failures
- after a certain date because some of the X.509 certificates used in
- its tests have expired.
- - Similar issue with [GnuTLS](https://issues.guix.gnu.org/44559).
- Newer upstream versions rely on
+ - [OpenSSL](https://issues.guix.gnu.org/56137) unit tests fail
+ after a certain date because some of the X.509 certificates they use
+ have expired.
+ - [GnuTLS](https://issues.guix.gnu.org/44559) had similar issues;
+ newer versions rely on
[datefudge](https://packages.guix.gnu.org/packages/datefudge/) to
fake the date while running the tests and thus avoid that problem
altogether.
- - Python 2.7, found in Guix 1.0.0, had a [similar
- issue](https://issues.guix.gnu.org/65378) with its TLS-related
+ - Python 2.7, found in Guix 1.0.0, also [had that
+ problem](https://issues.guix.gnu.org/65378) with its TLS-related
tests.
- OpenJDK [would fail to build at some
point](https://issues.guix.gnu.org/68333) with this interesting
@@ -109,7 +119,7 @@ Here are some examples:
currencies is likely outdated after 10 years).
- Libgit2, a dependency of Guix, had (has?) a [time-dependent
tests](https://issues.guix.gnu.org/55326).
- - MariaDB tests [started failing after January 1st,
+ - MariaDB tests [started failing in
2019](https://issues.guix.gnu.org/34351).
Someone traveling to `v1.0.0` will hit several of these, preventing
@@ -118,7 +128,7 @@ those who’ve come to Guix from the perspective of making
their [research
workflow
reproducible](https://hpc.guix.info/blog/2023/06/a-guide-to-reproducible-research-papers/).
-Time traps are the main road block, but there’s more! Occasionally,
+Time traps are the main road block, but there’s more! In rare cases,
there’s software influenced by kernel details not controlled by the
build daemon:
@@ -162,7 +172,7 @@ a [well-defined build
environment](https://guix.gnu.org/manual/devel/en/html_node/Build-Environment-Setup.html).
This technique was
[implemented](https://archive.softwareheritage.org/browse/revision/9397cd30c8a6ffd65fc3b85985ea59ecfb72672b/)
-by Eelco Dolstra *et al.* for Nix in 2007 (with namespace support [added
+by Eelco Dolstra for Nix in 2007 (with namespace support [added
in
2012](https://archive.softwareheritage.org/browse/revision/df716c98d203ab64cdf05f9c17fdae565b7daa1c/)),
at a time where the word *container* had to do with boats and before
@@ -171,11 +181,11 @@ consists in *controlling the build environment* in every
detail (it’s at
odds with the strategy that consists in achieving reproducible builds
[*in spite* of high build environment
variability](https://tests.reproducible-builds.org/debian/index_variations.html)).
-That these are mere processes with a bunch of bind mounts makes build
-processes rather inexpensive.
+That these are mere processes with a bunch of bind mounts makes this
+approach inexpensive and appealing.
-Thus, naturally, we’d want to control the build environment’s date, and
-naturally, we turn to Linux namespaces to address that—Dolstra, Löh, and
+Realizing we’d also want to control the build environment’s date,
+we naturally turn to Linux namespaces to address that—Dolstra, Löh, and
Pierron already suggested something along these lines in the conclusion
of their [2010 *Journal of Functional Programming*
paper](https://edolstra.github.io/pubs/nixos-jfp-final.pdf). Turns out
@@ -279,7 +289,7 @@ Of course it’s possible to choose different configuration
parameters:
With a build VM with its date set to January 2020, we have been able to
rebuild Guix and its dependencies along with a bunch of packages such as
`emacs-minimal` from `v1.0.0`, overcoming all the time traps and other
-pleasant challenges described earlier. As a side effect, substitutes
+challenges described earlier. As a side effect, substitutes
are now available from `ci.guix.gnu.org` so you can even try this at
home without having to rebuild the world:
@@ -291,9 +301,8 @@ substitute: updating substitutes from
'https://ci.guix.gnu.org'... 100.0%
/gnu/store/53dnj0gmy5qxa4cbqpzq0fl2gcg55jpk-emacs-minimal-26.2
```
-For the fun of it, we went as far as
-[`v0.16.0`](https://guix.gnu.org/blog/2018/gnu-guix-and-guixsd-0.16.0-released/),
-released in December 2018:
+For the fun of it, we went as far as `v0.16.0`, [released in December
+2018](https://guix.gnu.org/blog/2018/gnu-guix-and-guixsd-0.16.0-released/):
```
guix time-machine -q --commit=v0.16.0 -- \
@@ -305,9 +314,8 @@ This is the furthest we can go since
and the underlying mechanisms that make time travel possible did not
exist before that date.
-There’s at least one case where things got more complicated as we tried
-to build packages from these revisions: in OpenSSL 1.1.1g (released
-April 2020 and packaged [in December
+There’s one “interesting” case we stumbled upon in that process: in
+OpenSSL 1.1.1g (released April 2020 and packaged [in December
2020](https://archive.softwareheritage.org/browse/revision/c4868e38289baf3a9a74bdf32166d321f7365725/)),
some of the test certificates are not valid _before_ April 2020, so the
build VM needs to have its clock set to May 2020 or thereabouts.
@@ -366,21 +374,31 @@ above:
It’s a fact that Guix so far lacks information about the date, kernel,
or CPU model that should be used to build a given package.
[Derivations](https://guix.gnu.org/manual/devel/en/html_node/Derivations.html)
-purposefully lack that information on the grounds that it’s *rarely*
-necessary—which is true, but “rarely” is not the same as “never”, as we
-saw. Should we start adding such annotations to packages?
+purposefully lack that information on the grounds that it cannot be
+enforced in user land and is *rarely* necessary—which is true, but
+“rarely” is not the same as “never”, as we saw. Should we create a
+catalog of date, CPU, and/or kernel annotations for packages found in
+past revisions? Should we define, for the long-term, an
+all-encompassing derivation format? If we did and effectively required
+virtual build machines, what would that mean from a
+[bootstrapping](https://guix.gnu.org/en/blog/tags/bootstrapping/)
+standpoint?
Here’s another option: build packages in VMs running in the year 2100,
-say, and on a baseline CPU. We don’t necessarily need to require all
-users to set up a virtual build machine, it may be enough to set up the
-project build farms so they build everything that way. This would allow
-us to catch time traps and Y2038 bugs before they bite.
+say, and on a baseline CPU. We don’t need to require all users to set
+up a virtual build machine—that would be impractical. It may be enough
+to set up the project build farms so they build everything that way.
+This would allow us to catch time traps and [year 2038
+bugs](https://en.wikipedia.org/wiki/Year_2038_problem) before they bite.
-Before we can get there the `virtual-build-machine` service needs to be
+Before we can do that, the `virtual-build-machine` service needs to be
optimized. Right now, offloading to build VMs is as heavyweight as
offloading to a separate physical build machine: data is transferred
back and forth over SSH over TCP/IP. The first step will be to run SSH
over a paravirtualized transport instead such as [`AF_VSOCK`
sockets](https://www.man7.org/linux/man-pages/man7/vsock.7.html).
-Another option would be to make the guest VM store an overlay over the
-host VM store such that inputs do not need to be transferred and copied.
+Another avenue would be to make `/gnu/store` in the guest VM an overlay
+over the host store so that inputs do not need to be transferred and
+copied.
+
+Until then, happy software archaeology!