[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH]: Rewrite CRAN importer.
From: |
Ludovic Courtès |
Subject: |
Re: [PATCH]: Rewrite CRAN importer. |
Date: |
Fri, 04 Dec 2015 15:23:37 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) |
Ricardo Wurmus <address@hidden> skribis:
> So I rewrote the CRAN importer to do this:
>
> * download the DESCRIPTION file for a given package
> * break it up into a simple alist
> * transform the alist into a package expression
>
> This is much simpler than the sxml hackery we did before and the code
> can be reused to write an importer for Bioconductor, a popular,
> versioned R package repository for bioinformatics packages.[1]
Sounds great!
> From f2455d19461d50c775360aafb60c104281f83483 Mon Sep 17 00:00:00 2001
> From: Ricardo Wurmus <address@hidden>
> Date: Thu, 3 Dec 2015 16:12:09 +0100
> Subject: [PATCH] import: cran: Parse DESCRIPTION instead of HTML.
>
> * guix/import/cran.scm (description->alist, safe-car, listify,
> beautify-description, description->package): New procedures.
> (table-datum, downloads->url, nodes->text, cran-sxml->sexp): Remove
> proceduces.
> (latest-release): Use parsed DESCRIPTION instead of SXML.
> * tests/cran.scm: Rewrite to match importer.
> ---
> guix/import/cran.scm | 270
> +++++++++++++++++++++++++--------------------------
> tests/cran.scm | 189 +++++++++++++++---------------------
> 2 files changed, 214 insertions(+), 245 deletions(-)
Bonus points for making it shorter. :-)
> +(define (safe-car maybe-pair)
Does it have airbags?
> + (if (or (null? maybe-pair)
> + (not maybe-pair))
> + #f
> + (car maybe-pair)))
Seriously, I think this is the wrong way to deal with that.
> +(define (package->cran-name package)
> + "Given a Guix PACKAGE value, return the name of the R package on CRAN."
> + (let* ((source-url (and=> (package-source package)
> + (compose safe-car origin-uri)))
I would change it to:
(match (package-source package)
((? origin? origin)
(match (origin-uri origin)
((url rest ...)
(let ((end (string-rindex url #\_)) …) …))
(_ #f)))
(_ #f))
This is more verbose, but it makes the intent clearer IMO, and avoids
“wrong-type-arg #f” errors that would arise with the proposed code.
> + (let ((url (string-append %cran-url name "/DESCRIPTION")))
> + (call-with-temporary-output-file
> + (lambda (temp port)
> + (and (url-fetch url temp)
> + (call-with-input-file temp
> + (compose description->alist read-string)))))))
I think it’s best to use ‘http-fetch’ or ‘http-fetch/cached’ from (guix
http-client), which does not necessitate the creation of a temporary
file.
> +(define (beautify-description description)
> + "Improve the package DESCRIPTION by turning a beginning sentence fragment
> +into a proper sentence and by using two spaces between sentences."
Excellent. :-)
> +URL: http://gnu.org/s/my-example
> +Description: This is a long description
> +spanning multiple lines: and it could confuse the parser that
> +there is a colon : on the lines.
> + And: this line continues the description.
> +biocViews: 0
> +SystemRequirements: Cairo (>= 0)
> +Depends: A C++11 compiler. Version 4.6.* of g++ (as
> + currently in Rtools) is insufficient; versions 4.8.*, 4.9.* or
> + later will be fine.
Too bad that this format is almost, but not quite, recutils.
The rest LGTM!
Can you send an updated patch?
Thank you!
Ludo’.