guix-devel

Re: About SWH, let avoid the wrong discussion


From: Liliana Marie Prikler
Subject: Re: About SWH, let avoid the wrong discussion
Date: Fri, 21 Jun 2024 18:34:43 +0200
User-agent: Evolution 3.48.4

Hi, MSavoritias,

On Friday, 21.06.2024 at 17:15 +0300, MSavoritias wrote:
> But I didnt say that tho did I? the context you are reading as from
> the quote is Guix uploading all code from its packages to SWH.
> Not any private repos. So i have no idea what you are reffering to
> here tbh.
I hate to say that, but you kinda did.  It was implicit on the mailing
list (at least in the OP), but very explicit in the XMPP room, where
you say
"it automatically sen[d]s your repo (and all your code) that is
reachable through the internet to Software Heritage […] with no way to
opt-out at any of the process and no flag with `guix lint` to disable
it"

Now, you stand corrected on both counts (the automatic sending of
code and the inability to disable it), but I'd like to poke at another
tangent.

Currently, the StarCoder LLM, which SWH endorses, claims to ingest only
GitHub and to filter out both commercial and copyleft code, thus
training on non-copyleft "open source" software only [1].  So, at the
time of writing, you do have an "easy" opt-out by way of using the GPL.

Except that, of course, their script to detect licenses is buggy –
what else did you expect?  Just search for GNOME using their tool [2].
It will print out repos like the unlicensed releng [3] – although for
some reason, being unlicensed appears to be fair game to them anyway
[1] – or the GPL'd devhelp [4].

So, in my opinion, the collaboration between SWH and StarCoder should
trigger some side-eyeing, if only enough to exclude the archival lint for
the time being.  We can still consider SWH as a software mirror if all
else fails, and they should probably be quick enough in updating as
well.  Long term, we might want to look into options that do not openly
endorse tools which make such questionable decisions.

On the notion of consent, I do think that "I license my code under the
MIT license, because then companies will like me" ought to count as
consent here.  [3] and [4] on the other hand very much don't.  Also,
"sign up with GitHub, so that you can opt out" is not a great consent
model either – at the very least accept bleeping email.

As per Doctorow's law of enshittification, there is a good chance that
"ethical AI" to SWH will become "any AI" if we do nothing to
communicate that this is not what we as Guix expect.

Cheers

[1] https://arxiv.org/abs/2402.19173
[2] https://huggingface.co/spaces/bigcode/in-the-stack
[3] https://github.com/GNOME/releng
[4] https://github.com/GNOME/devhelp


