bug#38269: SSAX incorrect handling of &gt; in CDATA
From: Andrew Gierth
Subject: bug#38269: SSAX incorrect handling of &gt; in CDATA
Date: Tue, 19 Nov 2019 13:41:54 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (berkeley-unix)
The bug:
> (xml->sxml "<e><![CDATA[&gt;]]></e>")
$2 = (*TOP* (e ">"))
The expected result is (*TOP* (e "&gt;")).
In upstream/SSAX.scm:
; procedure+: ssax:read-cdata-body PORT STR-HANDLER SEED
[...]
; Within a CDATA section all characters are taken at their face value,
; with only three exceptions:
[..]
; &gt; is treated as an embedded #\> character
This handling of &gt; is contrary to the XML specification, in which
there are no special character sequences inside CDATA except newline and
the "]]>" closing tag. I have confirmed this by checking other XML
parsers. The code seems to be based on a wild misreading of another
section of the specification that does not apply here. (And
unfortunately, the W3C validation suite for XML happens not to contain
any instances of &gt; inside CDATA.)
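For anyone who wants to repeat the cross-check against another parser, here
is a minimal sketch using the expat-based parser in Python's standard
library. It is only one example of such a check, not necessarily the one
used above, and the helper name element_text is made up for illustration.

```python
import xml.etree.ElementTree as ET


def element_text(doc: str) -> str:
    """Return the text content of the root element of an XML document."""
    return ET.fromstring(doc).text


# Inside a CDATA section, "&gt;" is five literal characters, not an entity.
print(repr(element_text("<e><![CDATA[&gt;]]></e>")))  # '&gt;'

# Outside CDATA, the same sequence is an entity reference and is decoded.
print(repr(element_text("<e>&gt;</e>")))  # '>'
```

The first result is what xml->sxml would be expected to produce for the
CDATA case once the (#\&) handling is removed.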
I believe the fix should be as simple as removing the entire (#\&) case
from the function (and fixing the test cases).
This bug seems to exist in all versions of SSAX.
--
Andrew.