Re: [PATCH 2/2] docs: define policy forbidding use of "AI" / LLM code ge
From: Alex Bennée
Subject: Re: [PATCH 2/2] docs: define policy forbidding use of "AI" / LLM code generators
Date: Thu, 23 Nov 2023 15:32:44 +0000
User-agent: mu4e 1.11.25; emacs 29.1

Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:
> On Thu, 23 Nov 2023 16:35, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>On Thu, Nov 23, 2023 at 11:40:26AM +0000, Daniel P. Berrangé wrote:
>>> There has been an explosion of interest in so called "AI" (LLM)
>>> code generators in the past year or so. Thus far though, this
>>> has not been matched by a broadly accepted legal interpretation
>>> of the licensing implications for code generator outputs. While
>>> the vendors may claim there is no problem and a free choice of
>>> license is possible, they have an inherent conflict of interest
>>> in promoting this interpretation. More broadly there is, as yet,
>>> no broad consensus on the licensing implications of code generators
>>> trained on inputs under a wide variety of licenses.
>>> The DCO requires contributors to assert they have the right to
>>> contribute under the designated project license. Given the lack
>>> of consensus on the licensing of "AI" (LLM) code generator output,
>>> it is not considered credible to assert compliance with the DCO
>>> clause (b) or (c) where a patch includes such generated code.
>>> This patch thus defines a policy that the QEMU project will not
>>> accept contributions where use of "AI" (LLM) code generators is
>>> either known, or suspected.
>>> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>>> ---
>>> docs/devel/code-provenance.rst | 40 ++++++++++++++++++++++++++++++++++
>>> 1 file changed, 40 insertions(+)
>>> diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst
>>> index b4591a2dec..a6e42c6b1b 100644
>>> --- a/docs/devel/code-provenance.rst
>>> +++ b/docs/devel/code-provenance.rst
>>> @@ -195,3 +195,43 @@ example::
>>> Signed-off-by: Some Person <some.person@example.com>
>>> [Rebased and added support for 'foo']
>>> Signed-off-by: New Person <new.person@example.com>
>>> +
>>> +Use of "AI" (LLM) code generators
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +TL;DR:
>>> +
>>> + **Current QEMU project policy is to DECLINE any contributions
>>> + which are believed to include or derive from "AI" (LLM)
>>> + generated code.**
>>> +
>>> +The existence of "AI" (`Large Language Model <https://en.wikipedia.org/wiki/Large_language_model>`__
>>> +/ LLM) code generators raises a number of difficult legal questions, a
>>> +number of which impact on Open Source projects. As noted earlier, the
>>> +QEMU community requires that contributors certify their patch submissions
>>> +are made in accordance with the rules of the :ref:`dco` (DCO). When a
>>> +patch contains "AI" generated code this raises difficulties with code
>>> +provenance and thus DCO compliance.
>>> +
<snip>
>>> +
>>> +The QEMU maintainers thus require that contributors refrain from using
>>> +"AI" code generators on patches intended to be submitted to the project,
>>> +and will decline any contribution if use of "AI" is known or suspected.
>>> +
>>> +Examples of tools impacted by this policy include both GitHub Copilot,
>>> +and ChatGPT, amongst many others which are less well known.
>>
>>
>>So you called out these two by name, fine, but given "AI" is in scare
>>quotes I don't really know what is or is not allowed and I don't know
>>how will contributors know. Is the "AI" that one must not use
>>necessarily an LLM? And how do you define LLM even? Wikipedia says
>>"general-purpose language understanding and generation".
>>
>>
>>All this seems vague to me.
>>
>>
>>However, can't we define a simpler more specific policy?
>>For example, isn't it true that *any* automatically generated code
>>can only be included if the scripts producing said code
>>are also included or otherwise available under GPLv2?
>
> The following definition makes sense to me:
>
> - Automated codegen tool must be idempotent.
> - Automated codegen tool must not use statistical modelling.
>
> I'd remove all AI or LLM references. These are non-specific,
> colloquial and in the case of `AI`, non-technical. This policy should
> apply the same to a Markov chain code generator.
I'm fairly sure my Emacs auto-complete would fail by that definition.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro