bug#20339: sxml simple: sxml->xml mishandles namespaces?


From: tomas
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Date: Mon, 8 Apr 2019 14:14:03 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Feb 05, 2019 at 01:57:11PM +0100, Ricardo Wurmus wrote:
> 
> Ricardo Wurmus <address@hidden> writes:
> 
> > In that case we could have FINISH-ELEMENT add all namespace declarations
> > that are in scope to the current node that is about to be returned.  It
> > would be a little verbose, but more correct.
> 
> Like this:

Thanks again for your patch, and sorry for my glacial pace.

I finally got around to testing it (against Guile 2.2.4, commit
791cae940afcb2b2eb2c167fe438be1dc1008a73).

TL;DR:

 - The default namespace is still a problem (see below).
 - It would be nice to be able to inhibit the down-inheritance
   of namespace declarations at xml->sxml time. The sxml
   representation would then closely mimic the XML, which
   gives the user much more control over the generated XML.

I'd be willing to prepare a patch along these lines, but
for that, I'd like to get an idea of which direction we
want to take this whole thing in.

To see what's going on, I tried with a small XML example:

First with explicit (aka non-default) namespace:

  #+NAME: minimal-explicit
  #+BEGIN_EXAMPLE
  <?xml version="1.0"?>
  <myns:root xmlns:myns="http://example.org/namespaces/myns">
    <myns:subnode/>
  </myns:root>
  #+END_EXAMPLE

Before your patch:

  #+NAME: minimal-explicit-before
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-explicit-before
  : <stdin>:12:0: warning: possibly unbound variable `pretty-print'
  : <stdin>:12:14: warning: possibly unbound variable `xml->sxml'
  : (*TOP* (*PI* xml "version=\"1.0\"")
  :        (http://example.org/namespaces/myns:root
  :          "\n  "
  :          (http://example.org/namespaces/myns:subnode)
  :          "\n"))

As we know, this replaces the namespace prefixes with the namespace URIs.

After your patch:

  #+NAME: minimal-explicit-after
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-explicit-after
  #+begin_example
  <stdin>:13:0: warning: possibly unbound variable `pretty-print'
  <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
  ;;; note: source file ./sxml/simple.scm
  ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
  ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
  (*TOP* (*PI* xml "version=\"1.0\"")
         (myns:root
           (@ (xmlns:myns "http://example.org/namespaces/myns"))
           "\n  "
           (myns:subnode
             (@ (xmlns:myns "http://example.org/namespaces/myns")))
           "\n"))
  #+end_example

(I've put sxml/simple.scm in the current directory, thus the manipulation
of %load-path). This mimics the XML more closely, using namespace prefixes
instead of namespaces in the sxml. This is compelling :-)

The only difference from the XML is that the namespace declaration is
repeated on lower-level nodes (which is why sxml->xml emits it there, too).
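
As an aside, the repeated declarations could also be pruned after the
fact. Here is a rough sketch of that idea; it is not part of the patch,
and the names prune-ns, xmlns-attr? and has-attrs? are made up for
illustration. It walks the sxml tree, remembers which xmlns attributes
are in scope, and drops a declaration from a node when the nearest
ancestor declaration of the same name has the same value:

  #+NAME: prune-inherited-ns
  #+BEGIN_SRC scheme
  ;; Sketch only: prune-ns and its helpers are hypothetical,
  ;; they do not exist in (sxml simple).

  ;; Is ATTR of the form (xmlns "uri") or (xmlns:prefix "uri")?
  (define (xmlns-attr? attr)
    (and (pair? attr)
         (symbol? (car attr))
         (let ((name (symbol->string (car attr))))
           (or (string=? name "xmlns")
               (string-prefix? "xmlns:" name)))))

  ;; Does NODE carry an attribute list (@ ...) right after its name?
  (define (has-attrs? node)
    (and (pair? (cdr node))
         (pair? (cadr node))
         (eq? (car (cadr node)) '@)))

  ;; Return a copy of NODE without the xmlns declarations that merely
  ;; repeat the nearest enclosing declaration.  IN-SCOPE is an alist
  ;; of (name value) entries, nearest declaration first.
  (define (prune-ns node in-scope)
    (if (not (and (pair? node) (symbol? (car node))))
        node                              ; strings and other atoms
        (let* ((attrs (if (has-attrs? node) (cdr (cadr node)) '()))
               (children (if (has-attrs? node) (cddr node) (cdr node)))
               (kept (filter
                      (lambda (a)
                        (not (and (xmlns-attr? a)
                                  (equal? a (assq (car a) in-scope)))))
                      attrs))
               (scope (append (filter xmlns-attr? kept) in-scope))
               (kids (map (lambda (c) (prune-ns c scope)) children)))
          (if (null? kept)
              (cons (car node) kids)
              (cons* (car node) (cons '@ kept) kids)))))
  #+END_SRC

Called as (prune-ns tree '()) on the result above, this should leave the
declaration on myns:root only and reduce the child to (myns:subnode).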

This works, with the above downside, which you noted too.

It doesn't work with a default namespace, though:

  #+NAME: minimal-implicit
  #+BEGIN_EXAMPLE
  <?xml version="1.0"?>
  <root xmlns="http://example.org/namespaces/myns">
    <subnode/>
  </root>
  #+END_EXAMPLE

With your patch:

  #+NAME: minimal-implicit-after
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-implicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-implicit-after
  : <stdin>:13:0: warning: possibly unbound variable `pretty-print'
  : <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
  : ;;; note: source file ./sxml/simple.scm
  : ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
  : ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
  : (*TOP* (*PI* xml "version=\"1.0\"")
  :        (*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n"))

Note that the namespace declaration for *DEFAULT* is missing,
so we lost that bit of information. Besides, this is not
serializable:

  #+NAME: reserialize-implicit
  #+BEGIN_SRC scheme :results output verbatim
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (define the-sxml
    '(*TOP* (*PI* xml "version=\"1.0\"")
       (*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n")))
  (sxml->xml the-sxml)
  #+END_SRC

sxml->xml rejects the name, since an XML name can't start with a star:

  #+RESULTS: reserialize-implicit
  : ERROR: In procedure scm-error:
  : Invalid name starting character "*DEFAULT*" *DEFAULT*:root
  : 
  : Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
  : scheme@(guile-user) [1]> 
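
Until the default namespace declaration is kept, one conceivable
workaround is to rewrite the *DEFAULT*: names by hand before
serializing. The following is only a sketch under strong assumptions:
the caller has to supply the namespace URI (the sxml no longer carries
it), the document uses a single default namespace throughout, and the
names undefault, strip-default and add-xmlns are invented here. It
strips the prefix from every element name and puts one xmlns attribute
on the outermost element that had it:

  #+NAME: undefault-sketch
  #+BEGIN_SRC scheme
  ;; Sketch only: these procedures are hypothetical helpers,
  ;; not part of (sxml simple) or of the patch.

  (define default-prefix "*DEFAULT*:")

  ;; Return NAME without the *DEFAULT*: prefix, or #f if it has none.
  (define (strip-default name)
    (let ((s (symbol->string name)))
      (and (string-prefix? default-prefix s)
           (string->symbol
            (substring s (string-length default-prefix))))))

  ;; Build element NAME with body KIDS, adding (xmlns URI) to its
  ;; attribute list (creating the list if there is none).
  (define (add-xmlns name kids uri)
    (if (and (pair? kids) (pair? (car kids)) (eq? (caar kids) '@))
        (cons* name
               (cons* '@ (list 'xmlns uri) (cdar kids))
               (cdr kids))
        (cons* name (list '@ (list 'xmlns uri)) kids)))

  ;; Rewrite *DEFAULT*:foo to foo throughout NODE.  The xmlns
  ;; attribute goes on the outermost rewritten element only;
  ;; DECLARED? is #t once an ancestor already carries it.
  (define (undefault node uri)
    (let loop ((node node) (declared? #f))
      (if (not (and (pair? node) (symbol? (car node))))
          node
          (let* ((plain (strip-default (car node)))
                 (declare? (and plain (not declared?)))
                 (kids (map (lambda (c)
                              (loop c (or declared? declare?)))
                            (cdr node))))
            (cond (declare? (add-xmlns plain kids uri))
                  (plain (cons plain kids))
                  (else (cons (car node) kids)))))))
  #+END_SRC

With the tree above and "http://example.org/namespaces/myns" as URI,
this should yield (root (@ (xmlns "...")) "\n  " (subnode) "\n") under
*TOP*, which no longer contains names starting with a star.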

Cheers
-- tomás
