html-parser bug
From: Pan Xie
Subject: html-parser bug
Date: Thu, 29 Jun 2023 17:28:15 +0800
The html-parser egg does not correctly handle whitespace around the '='
between an HTML attribute name and its value.
Spaces are allowed around '=' in a tag's attributes, even though it is
bad practice (see the discussion at
https://stackoverflow.com/questions/7064095/spaces-between-html-attributes-and-values).
However, when the latest version of html-parser processes the following HTML:
#+begin_example
<a href="/dictionary/lineament" class = "mw_t_sx"><span
class='text-uppercase'>lineament</span></a>
#+end_example
it generates the following SXML, in which the class attribute has no value
and the rest of the tag is emitted as text:
#+begin_example
(*TOP* (a (@ (href "/dictionary/lineament") (class)) "= \"mw_t_sx\">" (span (@
(class "text-uppercase")) "lineament")) "\n")
#+end_example
Since a major goal of html-parser is "bug-for-bug compatibility" with the
way web browsers parse HTML, it should handle spaces around '=' in
attributes correctly.
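A minimal way to reproduce this might look like the sketch below. It
assumes CHICKEN 5 with the html-parser egg installed and that html->sxml
is given an input port. The variable name `sample` is only illustrative,
and the input is the HTML above joined onto one line with a trailing
newline. The result shown in the final comment is what one would expect
from correct handling, not output from any released version.

```scheme
;; Reproduction sketch for CHICKEN 5 with the html-parser egg installed.
(import (chicken port)          ; call-with-input-string
        (chicken pretty-print)  ; pp
        html-parser)            ; html->sxml

;; Illustrative variable name; the HTML is the example from this report,
;; written on one line with a trailing newline.
(define sample
  "<a href=\"/dictionary/lineament\" class = \"mw_t_sx\"><span class='text-uppercase'>lineament</span></a>\n")

;; Parse from a string port and pretty-print the resulting SXML.
(pp (call-with-input-string sample html->sxml))

;; Expected if the spaces around '=' were handled (an expectation,
;; not observed output):
;; (*TOP* (a (@ (href "/dictionary/lineament") (class "mw_t_sx"))
;;           (span (@ (class "text-uppercase")) "lineament"))
;;        "\n")
```

Saved under a hypothetical name such as repro.scm, this can be run with
`chicken-csi -s repro.scm`.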
Version information:
#+begin_example
$ chicken-csi -version
CHICKEN
(c) 2008-2021, The CHICKEN Team
(c) 2000-2007, Felix L. Winkelmann
Version 5.3.0 (rev e31bbee5)
linux-unix-gnu-x86-64 [ 64bit dload ptables ]
$ cat ~/.cache/chicken-install/html-parser/VERSION
"0.3"
#+end_example
Thanks
Pan