Re: Google Summer of Code 2023 Inquiry

guix-devel

From:	Simon Tournier
Subject:	Re: Google Summer of Code 2023 Inquiry
Date:	Tue, 04 Apr 2023 19:15:54 +0200

Hi Kyle,

On Tue, 04 Apr 2023 at 14:32, Kyle <kyle@posteo.net> wrote:

> The CRAN importer, for example, cannot yet detect non-R
> dependencies. So, the profile author has to figure those out for
> themselves. It's still very useful despite not being perfect.

Yeah, improving the importers is very helpful…

> Sure, but as is shown with "guix import cran" as I previously
> mentioned, it doesn't have to be perfect to be really useful in many
> cases.

…but please note the R ecosystem is probably one of the best around.

Well, I would not extrapolate to other ecosystems, such as Python, based
on what Lars did with the guix-cran channel [1].

For more details, have a look at this thread [2],

        Accuracy of importers?
        Ludovic Courtès <ludovic.courtes@inria.fr>
        Thu, 28 Oct 2021 09:02:27 +0200

or slide 53 of
https://git.savannah.gnu.org/cgit/guix/maintenance.git/plain/talks/packaging-con-2021/grail/talk.20211110.pdf

In addition, quoting another discussion from [3]:

        Well, it strongly depends on the quality of the targeted language
        ecosystem.  Some provide enough metadata to rely on for good
        automation; for instance, R with CRAN or Bioconductor.

        Sadly, many other ecosystems (upstream) do not provide enough
        metadata to automatically fill all the package fields, and some
        manual tweaks are required.

        For example, let us count the number of packages that tweak their
        ’arguments’ field (from ’#:tests? #f’ to complex phase modifications).
        This is far from a perfect metric, but it is a rough indication of
        upstream quality: whether they provide clean packages respecting
        their build system or whether the package requires Guix adjustments.

        Well, I get:

              r            : 2093 = 2093 = 1991 + 102 

        which is good (only ~5% require ’arguments’ tweaks), but

              python       : 2630 = 2630 = 803  + 1827

        is bad (only ~31% do not require an ’arguments’ tweak).

The analysis can also be refined, for instance by asking which
’arguments’ keywords are tweaked.  I did it [4] for the emacs-build-system:

              emacs        : 1234 = 1234 = 878  + 356
                  ("phases" . 213)
                  ("tests?" . 144)
                  ("test-command" . 127)
                  ("include" . 87)
                  ("emacs" . 25)
                  ("exclude" . 20)
                  ("modules" . 7)
                  ("imported-modules" . 4)
                  ("parallel-tests?" . 1)

        Considering these 356 packages, 144 modify the keyword #:tests?.  Note
        that ’#:tests? #t’ is counted in these 144 and it reads,

            $ ag 'tests\? #t' gnu/packages/emacs-xyz.scm | wc -l
            117

        Ah!  It requires some investigations. :-)
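
The per-keyword tally above can be approximated with a short script.
What follows is only a minimal sketch in Python, not the code used for
the numbers in [4]: the sample definitions, the names `argument_keywords`
and `explicit_tests_true`, and the tokenizer are all assumptions made
for illustration.  It counts the keywords that appear directly in an
(arguments ...) field, ignores nested ones such as #:key inside a
lambda*, and mirrors the `ag` search for an explicit ’#:tests? #t’.

```python
import re
from collections import Counter

# Hypothetical sample written in the style of Guix package definitions;
# it is not taken from gnu/packages/emacs-xyz.scm.
SAMPLE = r"""
(define-public emacs-foo
  (package
    (name "emacs-foo")
    (build-system emacs-build-system)
    (arguments
     `(#:tests? #t
       #:test-command '("make" "check")))))

(define-public emacs-bar
  (package
    (name "emacs-bar")
    (build-system emacs-build-system)
    (arguments
     (list #:tests? #f
           #:phases #~(modify-phases %standard-phases
                        (add-after 'unpack 'fix
                          (lambda* (#:key inputs #:allow-other-keys)
                            #t)))))))
"""

# Naive tokenizer: good enough for a sketch, not a full Scheme reader.
TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'        # string literal (skipped)
    r'|;[^\n]*'                 # line comment (skipped)
    r'|\(arguments(?=[\s()])'   # start of an (arguments ...) field
    r'|[()]'                    # any other parenthesis
    r'|#:[^\s()]+'              # keyword such as #:tests?
)


def argument_keywords(source):
    """Count keywords sitting directly in each (arguments ...) list."""
    counts = Counter()
    depth = 0          # current parenthesis depth
    args_depth = None  # depth of the enclosing (arguments, if any
    for match in TOKEN.finditer(source):
        token = match.group()
        if token == "(arguments":
            depth += 1
            if args_depth is None:
                args_depth = depth
        elif token == "(":
            depth += 1
        elif token == ")":
            if depth == args_depth:
                args_depth = None  # leaving the arguments field
            depth -= 1
        elif token.startswith("#:"):
            # Keep only keywords one level below (arguments, i.e. inside
            # the quoted list or the (list ...) call.
            if args_depth is not None and depth == args_depth + 1:
                counts[token[2:]] += 1
    return counts


def explicit_tests_true(source):
    """Same pattern as the `ag` command quoted above."""
    return len(re.findall(r"tests\? #t", source))


if __name__ == "__main__":
    for keyword, count in argument_keywords(SAMPLE).most_common():
        print(f'("{keyword}" . {count})')
    print("explicit #:tests? #t:", explicit_tests_true(SAMPLE))
```

On the sample this reports two uses of tests? and one each of
test-command and phases, plus one explicit ’#:tests? #t’.  To try it on
a real file, pass the file contents instead of SAMPLE.  Character
literals such as #\( and block comments are not handled.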

Last, in addition to the ideas for improvement provided by the threads
[3,4], the conclusion is still:

        Indeed, it could be worth identifying the common sources of the extra
        modifications we make compared to the default emacs-build-system.

Yeah, improving the importers is very helpful! :-)

Well, considering that 95% of the current R packages in Guix just work
out-of-the-box from the CRAN metadata, and considering how many packages
guix-cran provides compared to how many packages CRAN provides, we can
roughly extrapolate the meaning of “doesn't have to be perfect” for
other ecosystems such as Python.  Roughly speaking, only about 30% of
the current Python packages in Guix work out-of-the-box.

Yeah, these numbers are very partial, and a finer analysis could help
improve the importers.  But they show that the conclusion drawn from
the CRAN example would not apply as-is to other ecosystems, IMHO.
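
For reference, the rough percentages used above (about 5% of R packages
tweaked, about 31% of Python packages untouched) can be recomputed from
the quoted counts.  This is a small sketch, assuming that each line
reads "total = untouched + tweaked"; the names `COUNTS` and
`tweak_share` are mine and only serve the illustration.

```python
# Counts quoted above, read as
# (packages without 'arguments' tweaks, packages with tweaks).
COUNTS = {
    "r": (1991, 102),
    "python": (803, 1827),
    "emacs": (878, 356),
}


def tweak_share(untouched, tweaked):
    """Return the percentage of packages whose 'arguments' are tweaked."""
    return 100 * tweaked / (untouched + tweaked)


if __name__ == "__main__":
    for name, (untouched, tweaked) in COUNTS.items():
        share = tweak_share(untouched, tweaked)
        print(f"{name:8}: {untouched + tweaked:5} packages, "
              f"{share:4.1f}% tweaked, {100 - share:4.1f}% untouched")
```

The same function applies to any other build system once the two
counts are known, which makes it easy to compare ecosystems on this
(admittedly rough) metric.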

1: https://hpc.guix.info/blog/2022/12/cran-a-practical-example-for-being-reproducible-at-large-scale-using-gnu-guix/
2: https://yhetil.org/guix/878ryd8we4.fsf@inria.fr/#r
3: https://yhetil.org/guix/86cz9kk71y.fsf@gmail.com
4: https://yhetil.org/guix/87cz9gunwx.fsf@gmail.com

Cheers,
simon
