[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Upstreaming parsebib.el
From: |
Philip Kaludercic |
Subject: |
Re: Upstreaming parsebib.el |
Date: |
Wed, 29 Jan 2025 21:23:11 +0000 |
Joost Kremers <joostkremers@fastmail.fm> writes:
> Hi,
>
> I maintain a package `parsebib`[1] for parsing .bib files, and recently,
> Ihor Radchenko (Cc:'ed) suggested that it could be useful to upstream it.
> The package is used by most (if not all) bibliography-related packages
> available on MELPA, and Ihor feels it might be useful to have it in core,
> to make it available to packages in core and on GNU ELPA, (possibly
> including org-mode itself).
As for adding the package ELPA: That can certainly be done, could you
just review these comments and suggestions:
diff --git a/parsebib.el b/parsebib.el
index fe4e528..97d8a99 100644
--- a/parsebib.el
+++ b/parsebib.el
@@ -196,7 +196,7 @@ combination, the field inherits from the same-name field in
the
cross-referenced entry. If no inheritance should take place, the
target field is set to the symbol `none'.")
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; <-perhaps convert this to a outline-header?
;; BibTeX / biblatex parser ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -402,7 +402,7 @@ An assignment is the combination of an identifier, an equal
sign
and a composed value. A @String definition has exactly one
assignment, an entry has a potentially unlimited number."
(if-let* ((id (parsebib--identifier))
- (_ (parsebib--char '(?=)))
+ ((parsebib--char '(?=)))
(val (parsebib--composed-value)))
(cons id val)
(signal 'parsebib-error (list (point) "Malformed key=value assignment"))))
@@ -444,11 +444,11 @@ Return the contents of the @Preamble as a string."
"Parse an @String definition.
Return the definition as a cons cell of the abbreviation and a
composed value as a list."
- (if-let* ((_ (parsebib--char '(?@)))
- (_ (parsebib--keyword '("string")))
+ (if-let* (((parsebib--char '(?@)))
+ ((parsebib--keyword '("string")))
(open (parsebib--char '(?\{ ?\( )))
(definition (parsebib--assignment))
- (_ (parsebib--char (alist-get open '((?\{ ?\}) (?\( ?\)))))))
+ ((parsebib--char (alist-get open '((?\{ ?\}) (?\( ?\)))))))
definition
(signal 'parsebib-error (list (point) "Malformed @String definition"))))
@@ -531,7 +531,7 @@ See the function `parsebib--post-process' for details."
(parsebib--post-process field strings replace-TeX))
entry))
(when parsebib-hashid-fields
- (push (cons "=hashid=" (secure-hash 'sha256 (parsebib--get-hashid-string
fields))) entry))
+ (push (cons "=hashid=" (secure-hash 'sha256 (parsebib--get-hashid-string
fields))) entry)) ;why are you using strings for keys?
entry))
(defun parsebib-read-string (&optional strings)
@@ -654,8 +654,8 @@ such an inheritance schema."
(string-match-p (concat "\\b"
(cdr (assoc-string "=type=" target-entry)) "\\b")
(cl-second
elem))))
inheritance)))
- (cl-third (assoc-string "all" inheritance)))))
- (new-fields (delq nil (mapcar (lambda (field)
+ (cl-third (assoc-string "all" inheritance))))) ;i think that
nth 3 is slightly faster
+ (new-fields (delq nil (mapcar (lambda (field) ;mapcan can
avoid unnecessary consing, if that is important
(let ((target-field
(parsebib--get-target-field (car field) inheritable-fields)))
(if (and target-field
(not (assoc-string
target-field target-entry 'case-fold)))
@@ -699,7 +699,7 @@ for details.")
Depending on the value of `parsebib-TeX-cleanup-target', add a
face property `italic' to STR, or return it with Markdown or Org
markup for italic text."
- (pcase parsebib-TeX-cleanup-target
+ (pcase parsebib-TeX-cleanup-target ;perhaps `pcase-exhaustive'?
('display (propertize str 'face 'italic))
('markdown (concat "*" str "*"))
('org (concat "/" str "/"))
@@ -716,7 +716,7 @@ markup for bold text."
('org (concat "*" str "*"))
('plain str)))
-(defun parsebib--convert-tex-small-caps (str)
+(defun parsebib--convert-tex-small-caps (str) ;why define this function at all?
"Return STR capitalised."
(upcase str))
@@ -868,8 +868,8 @@ See `parsebib-TeX-markup-replacement-alist' and the function
"{" (group-n 2 (0+ (not (any "}")))) (opt
"}"))
(group-n 3 letter))))))
(parsebib--TeX-replace-literal
- . ,(rx-to-string `(or ,@(mapcar #'car
parsebib-TeX-literal-replacement-alist)
- (1+ blank)))))
+ . ,(rx (or (regexp (regexp-opt (mapcar #'car
parsebib-TeX-literal-replacement-alist))) ;this should avoid the runtime rx
dependency
+ (1+ blank)))))
"Alist of replacements and strings for TeX markup.
This is used in `parsebib-clean-TeX-markup' to make TeX markup more
suitable for display. Each item in the list consists of a replacement
@@ -940,7 +940,7 @@ Return a list of strings, each string a separate @Preamble."
(goto-char (point-min))
(let (res)
(cl-loop for item = (parsebib-find-next-item)
- while item do
+ while item do ;the `when' confused me, why not pull
it into the next `when'?
(when (cl-equalp item "preamble")
(push (parsebib--@preamble) res)))
(nreverse res))))
@@ -1227,11 +1227,10 @@ field, a `parsebib-error' is raised."
ENTRY is a CSL-JSON entry in the form of an alist. ENTRY is
modified in place. Return value is ENTRY. If YEAR-ONLY is
non-nil, date fields are shortened to just the year."
- (mapc (lambda (field)
- (unless (stringp (alist-get field entry))
- (setf (alist-get field entry)
- (parsebib-stringify-json-field (assq field entry)
year-only))))
- (mapcar #'car entry))
+ (dolist (field entry)
+ (unless (stringp (alist-get (car field) entry))
+ (setf (alist-get (car field) entry)
+ (parsebib-stringify-json-field (assq (car field) entry)
year-only))))
entry)
(defvar parsebib--json-name-fields '(author
@@ -1280,15 +1279,15 @@ Braced occurrences of the keys in ITEMS are replaced
with the
corresponding values. Note that the keys in ITEMS should be
symbols."
(cl-flet ((create-replacements (match)
- (save-match-data
- (string-match
"{\\([^A-Za-z]*\\)\\([A-Za-z][A-za-z-]+\\)\\([^A-Za-z]*\\)}" match)
- (let* ((pre (match-string 1 match))
- (key (match-string 2 match))
- (post (match-string 3 match))
- (value (alist-get (intern key)
items)))
- (if value
- (format "%s%s%s" pre value post)
- "")))))
+ (save-match-data
+ (string-match
"{\\([^A-Za-z]*\\)\\([A-Za-z][A-za-z-]+\\)\\([^A-Za-z]*\\)}" match)
+ (let* ((pre (match-string 1 match))
+ (key (match-string 2 match))
+ (post (match-string 3 match))
+ (value (alist-get (intern key) items)))
+ (if value
+ (format "%s%s%s" pre value post)
+ "")))))
(replace-regexp-in-string "{.*?}" #'create-replacements template nil t)))
(defun parsebib-stringify-json-field (field &optional short)
@@ -1436,24 +1435,23 @@ details. If FIELDS is nil, all fields are returned."
(setq strings (make-hash-table :test #'equal)))
(when (stringp files)
(setq files (list files)))
- (mapc (lambda (file)
- (with-temp-buffer
- (insert-file-contents file)
- (cond
- ((string= (file-name-extension file t) ".bib")
- (parsebib-parse-bib-buffer :entries entries
- :strings strings
- :expand-strings display
- :inheritance display
- :fields fields
- :replace-TeX display))
- ((string= (file-name-extension file t) ".json")
- (parsebib-parse-json-buffer :entries entries
- :stringify display
- :year-only display
- :fields (mapcar #'intern fields)))
- (t (error "[Parsebib] Not a bibliography file: %s" file)))))
- files)
+ (dolist (file files)
+ (with-temp-buffer
+ (insert-file-contents file)
+ (cond
+ ((string= (file-name-extension file t) ".bib")
+ (parsebib-parse-bib-buffer :entries entries
+ :strings strings
+ :expand-strings display
+ :inheritance display
+ :fields fields
+ :replace-TeX display))
+ ((string= (file-name-extension file t) ".json")
+ (parsebib-parse-json-buffer :entries entries
+ :stringify display
+ :year-only display
+ :fields (mapcar #'intern fields)))
+ (t (error "[Parsebib] Not a bibliography file: %s" file)))))
entries)
(provide 'parsebib)
The question then is if we want to wait until the package is added to
the core and track that or add it to ELPA right now?
>
> I'll let Ihor chime in with his motivations for making the initial
> suggestion, but to make the case, a few things to note:
>
> - `parsebib` is a library for parsing .bib files, i.e., it is not a major
> mode for editing them. IOW, it's not user-facing, it's intended for
> developers. (Also, it doesn't compete with `bibtex.el`.)
> - There is, therefore, no machinery to check for the validity of a .bib
> entry (e.g., whether the entry type is valid, whether all required fields
> are present etc.) Syntactically incorrect entries *are* rejected, of
> course.
> - What `parsebib` does is read the contents of a .bib file and return it as
> a hash table (plus, optionally some additional data structures).
> - Optionally, entries can be 'prettified', which means that @String
> abbreviations can be expanded, crossref's can be resolved (i.e., if an
> entry has a crossref field, the contents of the cross-referenced entry is
> added to the cross-referencing entry), and LaTeX commands can be removed
> or replaced with something better-looking. (These prettifications aren't
> applied to the .bib file, of course: they're meant for displaying the
> .bib entries to the user.)
> - The parser is a recursive-descent parser.[2]
> - `parsebib` can also parse CSL-JSON files (using Emacs' built-in JSON
> support).
>
> One question that would arise is obviously compatibility with `bibtex.el`.
> `parsebib` tends to favour biblatex over BibTeX, which IIUC, is not the
> case for `bibtex.el` (CC: Roland Winkler).
>
> From a copyright point of view: most of the code is mine, and I have signed
> the copyright assignment form. The code that handles prettifying LaTeX code
> comes from two other contributors, who both have signed copyright
> assignments as well. There are several smaller contributions, which,
> however, appear to fall under the 15 LOC limit. (Though obviously that
> would need to be reviewed more thoroughly.)
>
> WDYallT?
>
>
> Joost
>
>
>
>
>
> Footnotes:
>
> [1] https://github.com/joostkremers/parsebib
>
> [2] It's my first attempt at writing an RDP, so it's probably not as
> optimised as it can be. That's something I could work on.