bug-gnu-emacs

xml.el does incorrect entity expansion + patch


From: darkness
Subject: xml.el does incorrect entity expansion + patch
Date: Thu, 26 Sep 2002 17:35:11 -0400
User-agent: Mutt/1.4i

Version:        Emacs 21.2.1 (i386-redhat-linux-gnu 2002-04-08 as
                distributed by Red Hat)

Description:

xml-substitute-special in xml.el makes more than one pass when
expanding entities, so text produced by one expansion can itself be
expanded again.  I suspect this violates the XML specification.

To reproduce, I stick the following XML fragment in a buffer:

<foo>&amp;amp;</foo>

I move point to the beginning of the fragment, then type M-: and evaluate:

(xml-parse-region (point) (point-max))

The result is:

((foo nil "&"))

The "&" there should be (I think) "&amp;".  It is "&" because of:

(defun xml-substitute-special (string)
  "Return STRING, after subsituting special XML sequences."
  (while (string-match "&amp;" string)
    (set 'string (replace-match "&"  t nil string)))

in xml.el, which will expand an entity, and then proceed to expand any
new entities created by the previous expansion, and so on.  I've
tested this in one other XML parser which, as I expected, turned
"&amp;amp;" into just "&amp;".

For the record, I have verified that the md5sums of xml.el from Red
Hat and from emacs-21.2.tar.gz are identical.  I don't believe Red Hat
has made any changes that caused this problem.

I made a patch for xml.el which appears to fix this.  I'm quite new to
elisp (Elisp?  ELISP?) so please excuse me if this patch is worthless.
(Feel free to throw out my attribution in the ChangeLog if it is
unacceptable, too.)

darky

*** ChangeLog.orig      Thu Sep 26 17:25:00 2002
--- ChangeLog   Thu Sep 26 17:27:08 2002
***************
*** 1,3 ****
--- 1,8 ----
+ 2002-09-26  darkness  <darkness@caliginous.net>
+ 
+       * xml.el (xml-substitute-special): Expand each entity only once.
+       (xml-substituted-entities-alist): New variable.
+ 
  2002-03-16  Eli Zaretskii  <eliz@is.elta.co.il>
  
        * Version 21.2 released.
*** /usr/share/emacs/21.2/lisp/xml.el   Thu Oct 18 16:19:51 2001
--- xml.el      Thu Sep 26 17:03:28 2002
***************
*** 446,463 ****
  ;;**
  ;;*******************************************************************
  
  (defun xml-substitute-special (string)
    "Return STRING, after subsituting special XML sequences."
!   (while (string-match "&amp;" string)
!     (set 'string (replace-match "&"  t nil string)))
!   (while (string-match "&lt;" string)
!     (set 'string (replace-match "<"  t nil string)))
!   (while (string-match "&gt;" string)
!     (set 'string (replace-match ">"  t nil string)))
!   (while (string-match "&apos;" string)
!     (set 'string (replace-match "'"  t nil string)))
!   (while (string-match "&quot;" string)
!     (set 'string (replace-match "\"" t nil string)))
    string)
  
  ;;*******************************************************************
--- 446,483 ----
  ;;**
  ;;*******************************************************************
  
+ ;; Changed from its original version which would expand an entity,
+ ;; then possibly expand a new entity which was created by the previous
+ ;; expansion.  For example, "&amp;amp;" could be expanded to "&amp;"
+ ;; by the first pass, but then the newly created "&amp;" would be
+ ;; expanded to just "&".
+ ;;
+ ;; This replacement function only searches for new entities past the
+ ;; point where the last entity was replaced.  It also adds an alist of
+ ;; entities which could be modified by the user, though this is
+ ;; probably not recommended for standard XML parsing.
+ 
+ (defvar xml-substituted-entities-alist
+   '(("&amp;" . "&")
+     ("&lt;" . "<")
+     ("&gt;" . ">")
+     ("&apos;" . "'")
+     ("&quot;" . "\""))
+   ; I'm not sure it's really called "entity expansion."
+   "Alist of XML entity references and their expansions.
+ Used by `xml-substitute-special'.")
+ 
  (defun xml-substitute-special (string)
    "Return STRING, after subsituting special XML sequences."
!   (let ((case-fold-search nil)  ; XML entity names are case-sensitive
!         (match-start 0)
!         (regexp (regexp-opt (mapcar 'car xml-substituted-entities-alist)))
!         replacement)
!     (while (setq match-start (string-match regexp string match-start))
!       (setq replacement (cdr (assoc (match-string 0 string)
!                                     xml-substituted-entities-alist)))
!       (setq string (replace-match replacement t t string))
!       (setq match-start (+ match-start (length replacement)))))
    string)
  
  ;;*******************************************************************
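
For comparison, the same single-pass behaviour can probably be had from
replace-regexp-in-string with a function as the replacement.  This is
only a sketch, not part of the patch above; the names my-xml-entities
and my-xml-expand-once are made up for illustration:

```elisp
(defvar my-xml-entities
  '(("&amp;" . "&") ("&lt;" . "<") ("&gt;" . ">")
    ("&apos;" . "'") ("&quot;" . "\""))
  "Hypothetical alist of entity references and their expansions.")

(defun my-xml-expand-once (string)
  "Return STRING with each entity in `my-xml-entities' expanded once."
  (let ((case-fold-search nil))         ; entity names are case-sensitive
    (replace-regexp-in-string
     (regexp-opt (mapcar #'car my-xml-entities))
     (lambda (match)
       ;; Look up the matched entity.  The scan continues in the
       ;; original text after the match, so expanded text is never
       ;; examined again.
       (cdr (assoc match my-xml-entities)))
     string t t)))

(my-xml-expand-once "&amp;amp;")        ; => "&amp;"
```

Binding case-fold-search to nil matters here: with case folding on, the
regexp would also match something like "&AMP;", the alist lookup would
return nil, and the replacement would fail.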



