Re: Linux-libre 5.8 and beyond

guix-devel

From:	Mark H Weaver
Subject:	Re: Linux-libre 5.8 and beyond
Date:	Sat, 15 Aug 2020 02:03:27 -0400

Hi Alexandre,

Alexandre Oliva <lxoliva@fsfla.org> wrote:
> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:
>
>>>> It may be useful for users with newer hardware devices, which are
>>>> not yet well supported by the latest stable release, to use an
>>>> arbitrary commit from either Linus' mainline git repository or some
>>>> other subsystem tree.
>>> 
>>> The cleaning up scripts are version-specific and won't work on an
>>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>>> wanting to get today's most recent commit going into 5.9.) The scripts
>>> would fall over and die in such a scenario,
>
>> Okay, perhaps this was wishful thinking on my part.
>
> Yup.  If you ran a deblob-check in verify mode on the resulting
> tarballs, you'd see how error-prone this is.  You'd at least stop
> non-Free code from silently sneaking in and finding its way into running
> on users' machines.  That's the *least* someone who runs the
> deblob-scripts on their own should do to smoke-test the result WRT
> *known* freedom issues.

What is this "verify mode" that you're referring to, and where is it
documented?  The word "verify" does not occur in either of the deblob
scripts that I know about, namely "deblob-<VERSION>" and "deblob-check".
The string "verif" occurs a few times, but nothing related to the script
functionality.  I don't see anything like a verification mode mentioned
in the options documented at the top of those two scripts.

For the record, it was not my intent to skip any automated checking
provided by these scripts.  If we're running the scripts in a suboptimal
way, please tell me a better way.

FYI, right now we're simply running the main 'deblob-<VERSION>' script
with no arguments in the unpacked Linux source directory, with the
corresponding 'deblob-check' script in $PATH and $PYTHON pointing to
python 2.x.  If 'deblob-<VERSION>' exits abnormally or with a non-zero
result, the Guix build process fails.

Last I checked, 'deblob-check' was certainly being run by
'deblob-<VERSION>' as a subprocess, because I had to make several
substitutions of hard-coded paths before it would work in Guix
(e.g. /bin/sed and /usr/bin/python).
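
To make the invocation described above concrete, here is a minimal sketch in Python.  It is not the actual Guix build code; the function names and parameters (deblob_env, run_deblob, deblob_check_dir, python2) are made up for illustration.  It only shows the shape of the call: run the script with no arguments in the unpacked source directory, with 'deblob-check' reachable via $PATH and $PYTHON set, and treat a non-zero exit as a build failure.

```python
import os
import subprocess


def deblob_env(base_env, deblob_check_dir, python2):
    """Return a copy of base_env with deblob-check on PATH and PYTHON set.

    deblob_check_dir and python2 are hypothetical parameters: the directory
    holding the 'deblob-check' script and the path of a Python 2.x binary.
    """
    env = dict(base_env)  # copy, so the caller's environment is not mutated
    env["PATH"] = deblob_check_dir + os.pathsep + env.get("PATH", "")
    env["PYTHON"] = python2
    return env


def run_deblob(source_dir, deblob_script, deblob_check_dir, python2):
    """Run the deblob script with no arguments inside source_dir.

    check=True raises subprocess.CalledProcessError on a non-zero exit
    status, which is how this sketch models "the build fails".
    """
    env = deblob_env(os.environ, deblob_check_dir, python2)
    subprocess.run([deblob_script], cwd=source_dir, env=env, check=True)
```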

>> I had hoped that the deblob scripts would typically mostly work, even
>> if they weren't able to do a comprehensive cleaning.
>
> I'd honestly hope for a much higher standard than that for a
> FSDG-compliant distro, especially one that carries the GNU mark.

As I wrote below:

>> I would oppose adding such a partly-cleaned kernel to Guix itself,

With this in mind, your accusation above is not relevant to Guix.

Above, I was talking about my hope to enable users, *on their own
machines* and using *their own private build recipes*, to make a
best-effort deblobbing of a non-standard kernel variant that they need
to use for whatever reason.  If they aren't provided with that option,
the obvious alternative (which I expect 99% of such users would do
anyway) is to simply run a fully-blobbed kernel instead.

> But you don't!  That's what you get when you jump the gun and use
> outdated cleaning up scripts, without waiting for us to verify,
> update and release them for a newer version.

Here you are conflating two substantially different scenarios:

(1) Attempting to use your deblob scripts on a newer kernel that almost
    certainly includes many new drivers and blobs that aren't detected
    by your scripts.  That's the case that I said I would oppose for
    inclusion in Guix.

(2) Using the deblob scripts made for 5.4.57 on a 5.4.58 kernel in order
    to apply security fixes more quickly, and where the probability of
    uncleaned new blobs is quite low.

>> but I wanted to enable users who need to use some other branch of
>> Linux on their own systems to make a best-effort cleaning.
>
> Besides the likelihood of something going wrong, that seems like a
> backwards goal for a distro that is not expected to as much as point
> users at a non-Free package.

It's *not* a goal for Guix, and it wasn't even my motivation for
teaching Guix to run the Linux-libre deblob scripts.  It's just
something that, on a whim, I chose to include in my list of possible
advantages to having such functionality, nothing more.

> I'm sure that's not what you intend, but this arrangement, plus your
> mention of hurriedly getting releases out, adds up to an incentive to
> disable the deblobbing so as to get a faster build.

I don't understand how you reached this conclusion.  As far as I can
tell, changing Guix to run the deblob scripts made *no* difference to
what someone would have to do to ask Guix to build a fully-blobbed kernel.

> I hope you'll agree that this is undesirable.

Agreed.

>> In my experience, the deblob scripts are very rarely changed after the
>> first few point releases of a stable release series.
>
> My personal experience tells me otherwise.  5.7 had only one update at
> .8; 5.6, at .6 and .16; 5.5, at .3, .11 and .19; 5.4, at .14, .18, .27,
> .34 and .44; 5.3, at .4 and .11; 5.2 at .1, .3 and .11; 5.1 at .2, .18
> and .20; 5.0 at .7 and .16.  What you describe was true only of 4.17,
> 4.10, 4.3, 3.13, 3.5, and 3.2, i.e. 6 out of the 50 major releases
> starting at 3.0.

I only checked your claims regarding 5.4, and found that you're mistaken
about them being updated in 5.4.44.  In fact, the 'deblob-5.4' and
'deblob-check' files, as found in /pub/linux-libre/releases/, have not
changed since version 5.4.34.

Moreover, of the 4 deblob updates (.14, .18, .27, and .34) that have
*actually* been made so far during the 5.4.x series, IIUC only one of
them declared new blobs to remove, namely the update for 5.4.27.

The 5.4.14 update only removed extraneous backslashes in existing
regexps, changing "\e" to "e" and "\@" to "@".  I don't know whether
these extraneous backslashes caused blobs to be included in the
linux-libre tarballs, but if so, that presumably already happened in
5.4.13 and would have happened even if we had used your official
tarballs, no?

The 5.4.18 and 5.4.34 updates only added new 'accept' directives.  I
guess that means that temporarily omitting these additions wouldn't
cause new blobs to be included, is that right?

>> I know this because I always check for updates to the deblob scripts
>> whenever I update linux-libre in Guix.  In practice, the deblob scripts
>> used by Guix are never more than 1 or 2 micro versions behind the version
>> of Linux they are applied to.
>
> There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
> stable releases, so Guix has shipped potentially non-FSDG code, that
> *would* have been flagged by deblob-check on the tarballs, at between 5%
> and 10% of these releases.  Does that sound like a good standard for a
> freedom-first distro to aim for?

If it were true that we've been including blobs in 5-10% of our
linux-libre releases, I agree that would be a serious problem.  However,
I believe your estimates are way off, so I took a closer look at the
statistics for the 5.4, 4.19, and 4.14 kernels.

I already wrote about 5.4 above.  If we include only the deblob updates
that added checks for new blobs, it's only happened once in 58 upstream
updates, i.e. for 1.7% of the updates.

In the 4.19 series, although the deblob scripts have been updated 8
times, of those 8, 3 only add 'accept' directives, and a fourth only
makes the same regexp fixes mentioned above ("\e" -> "e" and "\@" ->
"@").  In other words, only 4 of these deblob updates might result in
new blobs being recognized.  So that's 4 new blob updates out of 139
upstream updates, which comes out to 2.9%.

In the 4.14 series, the deblob scripts were updated 6 times, but 3 only
add 'accept' directives and a fourth only makes the regexp fix.  So that
comes out to 2 new blob updates out of 193 upstream updates, which comes
out to 1.0%.

So, unless I'm missing something, it's more accurate to say that when I
push a Linux-libre security update before waiting for you to bless it,
I'm taking a 1-3% risk that a blob might end up in the result.
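
For reference, the arithmetic behind these figures can be reproduced with a few lines of Python.  The counts are the ones quoted above for each series; the names (counts, risk_percent) are only illustrative.

```python
# (series, deblob updates that added checks for new blobs, upstream updates)
counts = [("5.4", 1, 58), ("4.19", 4, 139), ("4.14", 2, 193)]


def risk_percent(new_blob_updates, upstream_updates):
    """Share of upstream updates that came with a new-blob deblob update."""
    return 100.0 * new_blob_updates / upstream_updates


for series, new_blob_updates, upstream_updates in counts:
    print(f"{series}: {risk_percent(new_blob_updates, upstream_updates):.1f}%")
# prints:
# 5.4: 1.7%
# 4.19: 2.9%
# 4.14: 1.0%
```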

I find that level of risk undesirable.  I would certainly rather avoid
it.  I guess where you and I differ is that I *also* find it undesirable
to subject our users to unnecessary delays in getting these security
updates, because that *also* carries a risk, namely the risk that their
systems will be compromised due to a delayed security update.

To my mind, it makes sense to balance these two risks, especially since
we know that it's simply impractical to completely eliminate the risk of
non-FSDG-compliant code occasionally finding its way into Guix.

>>> The moment that the Linux-libre project determines that scripts are
>>> suitable is the moment that the new cleaned-up release is ready to
>>> publish in git and the appropriate tags will then appear in git. The
>>> compressed tarballs come some time later.
>
>> I prefer to avoid unnecessary delays when applying micro kernel updates,
>
> Sorry, but it doesn't look like you do.  If you did, you would be taking
> a cleaned up tree instead of re-deblobbing it.

I'm not concerned about another 30 minutes (or whatever) to run the
deblob scripts, especially if the alternative is to trust the integrity
of your machines unnecessarily.  The delays I'd prefer to avoid are ones
measured in tens of hours, which is occasionally how long it takes
before Linux-libre reacts to a new upstream update.

> You skip even the automated verification we do, which saves you some
> time, but at what price?

As I wrote above, if there's some automated verification that we are
failing to do, please tell me how to do it.  It was certainly not my
intent to skip any such verification.

>> I also consider it unwise for all of us, as a matter of habit or policy,
>> to trust the integrity of the computer systems used by the Linux-libre
>> project to perform the deblobbing.
>
> I welcome double-checking of our cleaning up at all levels, but why are
> you setting a higher trust standard for us than for a project known to
> be at odds with our shared goals, such as Linux?

I don't understand how you reached the conclusion that I'm setting a
higher trust standard for Linux-libre than for Linux.  The principle I'm
following here is simply to avoid relying on the integrity of any system
if I can easily avoid it.  In particular, if I can easily run an
automated process on my own machine instead of relying on some other
system to provide pre-generated outputs for me, then I prefer to do it
myself.

> You don't apply the patches that went into it since the last known
> good release to double-check their releases, do you?  For most
> projects, you just take their tarballs or tags and build it.

That's true, and I agree that it's something we could improve on.
It would be preferable to fetch from a git repository instead, and
preferably one that has a lot of eyes on it.

> For Linux-libre, you start from (untrustworthy) Linux, run the
> (presumed untrustworthy) cleaning up scripts, and blindly trust the
> result.

I agree that we cannot avoid trusting many people and systems, and that
in most cases that trust is blind.  In this case, we cannot avoid
trusting the Linux source code (even if we download exclusively from the
Linux-libre project), and we cannot avoid trusting the Linux-libre
deblob scripts.  However, I reject the argument that because we must
trust X and Y, we might as well trust Z as well.

> There's no self-verification run with deblob-check,

Again, if we're failing to do that, it's a bug that has not previously
been brought to my attention.  See above.

> no compare with our release, nothing.  If you were to test the
> integrity of our releases, you'd think you'd at least look at them.

I *did* compare with your releases when I first taught Guix how to run
the deblob scripts, but not since then.  Anyway, I fail to see the
relevance of this fact.  I agree that it would be useful for someone
running Guix to compare our generated tarballs to yours.  There are
millions of useful things I *could* do with my time, but alas, my
energies are limited.

> If you were to test the integrity of our releases, you'd think you'd at
> least look at them.  Starting from a known-good Linux release and
> applying patches to double-check the results is expensive, so it makes
> sense to do that only occasionally, rather than as part of every build.
> Deblobbing and checking the result is also expensive, so it also makes
> sense for you to do so only occasionally, rather than as part of every
> build.

As far as I can tell, the vast majority of Guix users use substitutes
provided by its build farm.  I guess that it's fairly rare for people to
build everything on their own machines, as I do.

> But the point stands that, for someone who'd rather trust no one, you're
> blindly trusting both Linux and Linux-libre.  The former when it comes
> to base releases you don't check; the latter when it comes to scripts
> whose results you hardly even look at.  Why not reduce your trust base
> to just Linux-libre,

That's not possible.  Clearly, you do not have the capacity to audit all
of the code that Linux produces.  Therefore, by trusting Linux-libre, we
must implicitly also trust the Linux project.  That much we cannot
avoid.  We also cannot avoid trusting your deblob scripts.

However, we *can* easily avoid trusting the integrity of the systems
that you use to run the deblob scripts.

> and treat it as a citizen of the same class as
> nearly every other project you build, and satisfy your trust-but-verify
> needs looking into what changes between one of our releases and another?

You seem to be suggesting that I'm treating Linux-libre with less
respect than other projects in Guix.  I reject that claim.

In fact, I strongly support reducing Guix's reliance on pre-generated
outputs produced by *any* project.  I'm not singling out the Linux-libre
project here.

For example, one of the things I've recently been thinking about is that
Guix currently trusts the integrity of all the scripts generated by
autoconf/automake/libtool/etc for most of the tarballs that we download.
Those scripts are generated on random developer machines, and they are
very difficult to reproduce, because they depend on the precise versions
of many other packages, and Debian also seems to have extensively
modified their automake, leading to other differences.  I would be in
favor of working toward generating those scripts in Guix itself where
possible, but it's a big job and likely to cause maintenance headaches.

For another example, I also taught Guix how to generate the IceCat
source tarball from the corresponding Firefox tarball, and I intend to
keep it that way, although I'm currently an IceCat maintainer.

>> One question: Would it solve the problem that I mentioned in my earlier
>> email, namely the problem of how to determine which precise commit
>> introduced a regression between two stable kernel releases?
>
> No.  There are much better (faster and less risky) ways to tend to that
> requirement, see #bisecting below.
[...]
> #bisecting
>
> You can even take one of our releases and apply the patches that went
> into the next upstream stable release, and check that what you get
> matches our own corresponding release.  Some 98% of the time, they will
> be exact matches.  Occasionally, there will be a difference, and then
> you'll likely find a corresponding change in the deblobbing scripts, or
> a preexisting pattern that caused the change.  We do this for every
> release, as part of our pre-release checks, and you're welcome to do so
> as well, and to use the resulting tree to bisect problems.

I agree that this would be faster, but I fail to see how it's "less
risky" than running the deblob scripts meant for Linux-libre X.Y.Z on a
git checkout of the upstream stable git repo between X.Y.(Z-1) and
X.Y.Z.

More importantly, it's a much less straightforward thing to implement.
In the current implementation, we get the ability to deblob arbitrary
git commits from the same stable branch essentially for free.

I guess you're suggesting that I should implement a radically different
mechanism specifically for this purpose, one that extracts the individual
patches from the upstream stable git repository, attempts to apply them
to the base Linux-libre release, and compares the result to the next
Linux-libre release, and that I should then implement my own bisection
functionality on top of it.
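
Just to illustrate the comparison step of such a mechanism, here is a rough sketch in Python.  It is an assumption about how one might do it, not anything that exists in Guix or Linux-libre; the names tree_digests and differing_paths are hypothetical.  It hashes every regular file in two unpacked trees and reports the paths that differ or exist on only one side.  It skips symlinks and ignores file modes, so it is weaker than a full tree comparison.

```python
import hashlib
import os


def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are skipped in this sketch
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def differing_paths(patched_root, release_root):
    """Return sorted relative paths whose contents differ between the two
    trees, including files present in only one of them."""
    patched = tree_digests(patched_root)
    release = tree_digests(release_root)
    all_paths = patched.keys() | release.keys()
    return sorted(p for p in all_paths if patched.get(p) != release.get(p))
```

An empty list from differing_paths would correspond to the "exact match" case described in the quoted text; a non-empty list leaves open the question of what to do next.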

If I were to implement this, what would you suggest I do if the patches
fail to apply, or if the result fails to match the next Linux-libre
release?

     Thanks,
       Mark
