emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Upstreaming parsebib.el


From: Philip Kaludercic
Subject: Re: Upstreaming parsebib.el
Date: Wed, 29 Jan 2025 21:23:11 +0000

Joost Kremers <joostkremers@fastmail.fm> writes:

> Hi,
>
> I maintain a package `parsebib`[1] for parsing .bib files, and recently,
> Ihor Radchenko (Cc:'ed) suggested that it could be useful to upstream it.
> The package is used by most (if not all) bibliography-related packages
> available on MELPA, and Ihor feels it might be useful to have it in core,
> to make it available to packages in core and on GNU ELPA, (possibly
> including org-mode itself).

As for adding the package ELPA: That can certainly be done, could you
just review these comments and suggestions:

diff --git a/parsebib.el b/parsebib.el
index fe4e528..97d8a99 100644
--- a/parsebib.el
+++ b/parsebib.el
@@ -196,7 +196,7 @@ combination, the field inherits from the same-name field in 
the
 cross-referenced entry.  If no inheritance should take place, the
 target field is set to the symbol `none'.")
 
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; <-perhaps convert this to a outline-header?
 ;; BibTeX / biblatex parser ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
@@ -402,7 +402,7 @@ An assignment is the combination of an identifier, an equal 
sign
 and a composed value.  A @String definition has exactly one
 assignment, an entry has a potentially unlimited number."
   (if-let* ((id (parsebib--identifier))
-            (_ (parsebib--char '(?=)))
+            ((parsebib--char '(?=)))
             (val (parsebib--composed-value)))
       (cons id val)
     (signal 'parsebib-error (list (point) "Malformed key=value assignment"))))
@@ -444,11 +444,11 @@ Return the contents of the @Preamble as a string."
   "Parse an @String definition.
 Return the definition as a cons cell of the abbreviation and a
 composed value as a list."
-  (if-let* ((_ (parsebib--char '(?@)))
-            (_ (parsebib--keyword '("string")))
+  (if-let* (((parsebib--char '(?@)))
+            ((parsebib--keyword '("string")))
             (open (parsebib--char '(?\{ ?\( )))
             (definition (parsebib--assignment))
-            (_ (parsebib--char (alist-get open '((?\{ ?\}) (?\( ?\)))))))
+            ((parsebib--char (alist-get open '((?\{ ?\}) (?\( ?\)))))))
       definition
     (signal 'parsebib-error (list (point) "Malformed @String definition"))))
 
@@ -531,7 +531,7 @@ See the function `parsebib--post-process' for details."
                           (parsebib--post-process field strings replace-TeX))
                         entry))
     (when parsebib-hashid-fields
-      (push (cons "=hashid=" (secure-hash 'sha256 (parsebib--get-hashid-string 
fields))) entry))
+      (push (cons "=hashid=" (secure-hash 'sha256 (parsebib--get-hashid-string 
fields))) entry)) ;why are you using strings for keys?
     entry))
 
 (defun parsebib-read-string (&optional strings)
@@ -654,8 +654,8 @@ such an inheritance schema."
                                                (string-match-p (concat "\\b" 
(cdr (assoc-string "=type=" target-entry)) "\\b")
                                                                (cl-second 
elem))))
                                         inheritance)))
-               (cl-third (assoc-string "all" inheritance)))))
-           (new-fields (delq nil (mapcar (lambda (field)
+               (cl-third (assoc-string "all" inheritance))))) ;i think that 
nth 3 is slightly faster
+           (new-fields (delq nil (mapcar (lambda (field)      ;mapcan can 
avoid unnecessary consing, if that is important
                                            (let ((target-field 
(parsebib--get-target-field (car field) inheritable-fields)))
                                              (if (and target-field
                                                       (not (assoc-string 
target-field target-entry 'case-fold)))
@@ -699,7 +699,7 @@ for details.")
 Depending on the value of `parsebib-TeX-cleanup-target', add a
 face property `italic' to STR, or return it with Markdown or Org
 markup for italic text."
-  (pcase parsebib-TeX-cleanup-target
+  (pcase parsebib-TeX-cleanup-target   ;perhaps `pcase-exhaustive'?
     ('display (propertize str 'face 'italic))
     ('markdown (concat "*" str "*"))
     ('org (concat "/" str "/"))
@@ -716,7 +716,7 @@ markup for bold text."
     ('org (concat "*" str "*"))
     ('plain str)))
 
-(defun parsebib--convert-tex-small-caps (str)
+(defun parsebib--convert-tex-small-caps (str) ;why define this function at all?
   "Return STR capitalised."
   (upcase str))
 
@@ -868,8 +868,8 @@ See `parsebib-TeX-markup-replacement-alist' and the function
                                      "{" (group-n 2 (0+ (not (any "}")))) (opt 
"}"))
                                   (group-n 3 letter))))))
     (parsebib--TeX-replace-literal
-     . ,(rx-to-string `(or ,@(mapcar #'car 
parsebib-TeX-literal-replacement-alist)
-                           (1+ blank)))))
+     . ,(rx (or (regexp (regexp-opt (mapcar #'car 
parsebib-TeX-literal-replacement-alist))) ;this should avoid the runtime rx 
dependency
+                (1+ blank)))))
   "Alist of replacements and strings for TeX markup.
 This is used in `parsebib-clean-TeX-markup' to make TeX markup more
 suitable for display.  Each item in the list consists of a replacement
@@ -940,7 +940,7 @@ Return a list of strings, each string a separate @Preamble."
     (goto-char (point-min))
     (let (res)
       (cl-loop for item = (parsebib-find-next-item)
-               while item do
+               while item do           ;the `when' confused me, why not pull 
it into the next `when'?
                (when (cl-equalp item "preamble")
                  (push (parsebib--@preamble) res)))
       (nreverse res))))
@@ -1227,11 +1227,10 @@ field, a `parsebib-error' is raised."
 ENTRY is a CSL-JSON entry in the form of an alist.  ENTRY is
 modified in place.  Return value is ENTRY.  If YEAR-ONLY is
 non-nil, date fields are shortened to just the year."
-  (mapc (lambda (field)
-          (unless (stringp (alist-get field entry))
-            (setf (alist-get field entry)
-                  (parsebib-stringify-json-field (assq field entry) 
year-only))))
-        (mapcar #'car entry))
+  (dolist (field entry)
+    (unless (stringp (alist-get (car field) entry))
+      (setf (alist-get (car field) entry)
+            (parsebib-stringify-json-field (assq (car field) entry) 
year-only))))
   entry)
 
 (defvar parsebib--json-name-fields  '(author
@@ -1280,15 +1279,15 @@ Braced occurrences of the keys in ITEMS are replaced 
with the
 corresponding values.  Note that the keys in ITEMS should be
 symbols."
   (cl-flet ((create-replacements (match)
-                                 (save-match-data
-                                   (string-match 
"{\\([^A-Za-z]*\\)\\([A-Za-z][A-za-z-]+\\)\\([^A-Za-z]*\\)}" match)
-                                   (let* ((pre (match-string 1 match))
-                                          (key (match-string 2 match))
-                                          (post (match-string 3 match))
-                                          (value (alist-get (intern key) 
items)))
-                                     (if value
-                                         (format "%s%s%s" pre value post)
-                                       "")))))
+              (save-match-data
+                (string-match 
"{\\([^A-Za-z]*\\)\\([A-Za-z][A-za-z-]+\\)\\([^A-Za-z]*\\)}" match)
+                (let* ((pre (match-string 1 match))
+                       (key (match-string 2 match))
+                       (post (match-string 3 match))
+                       (value (alist-get (intern key) items)))
+                  (if value
+                      (format "%s%s%s" pre value post)
+                    "")))))
     (replace-regexp-in-string "{.*?}" #'create-replacements template nil t)))
 
 (defun parsebib-stringify-json-field (field &optional short)
@@ -1436,24 +1435,23 @@ details.  If FIELDS is nil, all fields are returned."
       (setq strings (make-hash-table :test #'equal)))
   (when (stringp files)
     (setq files (list files)))
-  (mapc (lambda (file)
-          (with-temp-buffer
-            (insert-file-contents file)
-            (cond
-             ((string= (file-name-extension file t) ".bib")
-              (parsebib-parse-bib-buffer :entries entries
-                                         :strings strings
-                                         :expand-strings display
-                                         :inheritance display
-                                         :fields fields
-                                         :replace-TeX display))
-             ((string= (file-name-extension file t) ".json")
-              (parsebib-parse-json-buffer :entries entries
-                                          :stringify display
-                                          :year-only display
-                                          :fields (mapcar #'intern fields)))
-             (t (error "[Parsebib] Not a bibliography file: %s" file)))))
-        files)
+  (dolist (file files)
+    (with-temp-buffer
+      (insert-file-contents file)
+      (cond
+       ((string= (file-name-extension file t) ".bib")
+        (parsebib-parse-bib-buffer :entries entries
+                                   :strings strings
+                                   :expand-strings display
+                                   :inheritance display
+                                   :fields fields
+                                   :replace-TeX display))
+       ((string= (file-name-extension file t) ".json")
+        (parsebib-parse-json-buffer :entries entries
+                                    :stringify display
+                                    :year-only display
+                                    :fields (mapcar #'intern fields)))
+       (t (error "[Parsebib] Not a bibliography file: %s" file)))))
   entries)
 
 (provide 'parsebib)
The question then is if we want to wait until the package is added to
the core and track that or add it to ELPA right now?

>
> I'll let Ihor chime in with his motivations for making the initial
> suggestion, but to make the case, a few things to note:
>
> - `parsebib` is a library for parsing .bib files, i.e., it is not a major
>   mode for editing them. IOW, it's not user-facing, it's intended for
>   developers. (Also, it doesn't compete with `bibtex.el`.)
> - There is, therefore, no machinery to check for the validity of a .bib
>   entry (e.g., whether the entry type is valid, whether all required fields
>   are present etc.) Syntactically incorrect entries *are* rejected, of
>   course.
> - What `parsebib` does is read the contents of a .bib file and return it as
>   a hash table (plus, optionally some additional data structures).
> - Optionally, entries can be 'prettified', which means that @String
>   abbreviations can be expanded, crossref's can be resolved (i.e., if an
>   entry has a crossref field, the contents of the cross-referenced entry is
>   added to the cross-referencing entry), and LaTeX commands can be removed
>   or replaced with something better-looking. (These prettifications aren't
>   applied to the .bib file, of course: they're meant for displaying the
>   .bib entries to the user.)
> - The parser is a recursive-descent parser.[2]
> - `parsebib` can also parse CSL-JSON files (using Emacs' built-in JSON
>   support).
>
> One question that would arise is obviously compatibility with `bibtex.el`.
> `parsebib` tends to favour biblatex over BibTeX, which IIUC, is not the
> case for `bibtex.el` (CC: Roland Winkler).
>
> From a copyright point of view: most of the code is mine, and I have signed
> the copyright assignment form. The code that handles prettifying LaTeX code
> comes from two other contributors, who both have signed copyright
> assignments as well. There are several smaller contributions, which,
> however, appear to fall under the 15 LOC limit. (Though obviously that
> would need to be reviewed more thoroughly.)
>
> WDYallT?
>
>
> Joost
>
>
>
>
>
> Footnotes:
>
> [1]  https://github.com/joostkremers/parsebib
>
> [2]  It's my first attempt at writing an RDP, so it's probably not as
>      optimised as it can be. That's something I could work on.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]