Re: Google Summer of Code 2023 Inquiry
From: Simon Tournier
Subject: Re: Google Summer of Code 2023 Inquiry
Date: Tue, 04 Apr 2023 19:15:54 +0200
Hi Kyle,
On Tue, 04 Apr 2023 at 14:32, Kyle <kyle@posteo.net> wrote:
> The CRAN importer, for example, cannot yet detect non-R
> dependencies. So, the profile author has to figure those out for
> themselves. It's still very useful despite not being perfect.
Yeah, improving the importers is very helpful…
> Sure, but as is shown with "guix import cran" as I previously
> mentioned, it doesn't have to be perfect to be really useful in many
> cases.
…but please note that the R ecosystem is probably one of the best around.
Well, I would not extrapolate to other ecosystems, such as Python, based
on what Lars did with the guix-cran channel [1].
For more details, have a look at this thread [2],
Accuracy of importers?
Ludovic Courtès <ludovic.courtes@inria.fr>
Thu, 28 Oct 2021 09:02:27 +0200
or slide 53 of
https://git.savannah.gnu.org/cgit/guix/maintenance.git/plain/talks/packaging-con-2021/grail/talk.20211110.pdf
In addition, quoting another discussion from [3]:
Well, it strongly depends on the quality of the targeted language
ecosystem. Some provide enough metadata to rely on for good
automation; for instance, R with CRAN or Bioconductor.
Sadly, for many other ecosystems, upstream does not provide enough
metadata to automatically fill all the package fields, and some manual
tweaks are required.
For example, let us count the number of packages that tweak their
’arguments’ field (from ’#:tests? #f’ to complex phase modifications).
This is far from being a perfect metric, but it is a rough indication
of upstream quality: whether they provide clean packages respecting their
build system or whether the packages require Guix adjustments.
Well, I get:
r : 2093 = 2093 = 1991 + 102
which is good (only ~5% require ’arguments’ tweaks), but
python : 2630 = 2630 = 803 + 1827
is bad (only ~31% do not require an ’arguments’ tweak).
and the analysis can be refined: for instance, which ’arguments’
keywords are tweaked? I did it [4] for the emacs-build-system:
emacs : 1234 = 1234 = 878 + 356
("phases" . 213)
("tests?" . 144)
("test-command" . 127)
("include" . 87)
("emacs" . 25)
("exclude" . 20)
("modules" . 7)
("imported-modules" . 4)
("parallel-tests?" . 1)
Considering these 356 packages, 144 modify the keyword #:tests?. Note
that ’#:tests? #t’ is counted in these 144, and it reads,
$ ag 'tests\? #t' gnu/packages/emacs-xyz.scm | wc -l
117
Ah! That requires some investigation. :-)
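To split those 144 occurrences by value, the `ag` count above could be extended along these lines. This is only a minimal sketch in Python: the function name and the sample text are made up for illustration, and it only sees the literal forms ’#:tests? #t’ and ’#:tests? #f’, so any computed value (a Scheme expression instead of a boolean) is missed.

```python
import re
from collections import Counter

# Matches the literal forms '#:tests? #t' and '#:tests? #f' only.
# Assumption: computed values (any Scheme expression) are ignored.
TESTS_RE = re.compile(r"#:tests\?\s+(#t|#f)")


def count_tests_values(source: str) -> Counter:
    """Count how often #:tests? is literally set to #t or to #f."""
    return Counter(TESTS_RE.findall(source))


# Hypothetical snippet standing in for a file such as
# gnu/packages/emacs-xyz.scm; it is not taken from Guix.
sample = """
(arguments (list #:tests? #t #:test-command #~(list "make" "check")))
(arguments '(#:tests? #f))
(arguments `(#:tests? #t))
"""

counts = count_tests_values(sample)
print(counts["#t"], counts["#f"])
```

On a real checkout, one would read the file contents and pass them to count_tests_values instead of the sample string. Note that `ag ... | wc -l` counts matching lines, whereas this counts occurrences.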
Last, in addition to the ideas for improvement provided by the threads [3,4],
the conclusion is still:
Indeed, it could be worthwhile to identify common sources of the extra
modifications we are making compared to the default emacs-build-system.
Yeah, improving the importers is very helpful! :-)
Well, considering that 95% of the current R packages in Guix just work
out-of-the-box from the CRAN metadata, and considering how many packages
guix-cran provides compared to how many packages CRAN provides, we can
roughly extrapolate the meaning of “doesn't have to be perfect” for
other ecosystems such as Python. Roughly speaking, consider the 30%
of the current Python packages in Guix that are working out-of-the-box.
Yeah, these numbers are very partial, and a finer analysis could help in
improving the importers. But these numbers show that the conclusion
drawn from the CRAN example would not apply as-is to others, IMHO.
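For reference, the percentages above can be recomputed from the counts quoted earlier. This is just a small Python sketch; it assumes that in "total = untweaked + tweaked" the first term counts the packages without any ’arguments’ tweak, which is how the ~5% and ~31% figures read. The names are mine, only the numbers come from the quoted mail.

```python
# Counts as quoted above: (packages without 'arguments' tweaks,
# packages with 'arguments' tweaks).  The reading of the two terms is
# an assumption based on the ~5% and ~31% figures.
counts = {
    "r": (1991, 102),
    "python": (803, 1827),
    "emacs": (878, 356),
}


def untweaked_share(untweaked: int, tweaked: int) -> float:
    """Fraction of packages that do not touch the 'arguments' field."""
    return untweaked / (untweaked + tweaked)


for name, (untweaked, tweaked) in counts.items():
    total = untweaked + tweaked
    share = untweaked_share(untweaked, tweaked)
    print(f"{name:8s} total={total:5d} untweaked={share:.1%}")
```

This gives roughly 95% for R, 31% for Python and 71% for Emacs, which is the gap the argument above relies on.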
1: https://hpc.guix.info/blog/2022/12/cran-a-practical-example-for-being-reproducible-at-large-scale-using-gnu-guix/
2: https://yhetil.org/guix/878ryd8we4.fsf@inria.fr/#r
3: https://yhetil.org/guix/86cz9kk71y.fsf@gmail.com
4: https://yhetil.org/guix/87cz9gunwx.fsf@gmail.com
Cheers,
simon