Re: Concerns/questions around Software Heritage Archive
From: Daniel Littlewood
Subject: Re: Concerns/questions around Software Heritage Archive
Date: Mon, 18 Mar 2024 17:39:08 +0000
Hi Kaelyn,
The legal question is unsettled: litigation brought by Matthew
Butterick (among others) has been ongoing in the US since at least
2022. The reasonable positions I'm aware of are:
1. An LLM (or, more precisely, the set of weights that define it) is
not a derivative work of its training data, for the purposes of
copyright, and thus the license is irrelevant.
2. Producing an LLM from training data is a transformative fair use,
and thus the license is irrelevant.
3. Neither 1 nor 2 holds, and LLMs constitute copyright infringement
on a profound scale (of both copyrighted and copylefted works).
The FSF and CC have both commissioned white papers on the impact of
such considerations for Free works. I don't recall seeing anything
particularly insightful in them. Probably a waste of time to discuss
it here.
Best wishes,
Dan