lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] xmlwrapp '-Wconversion' warnings


From: Greg Chicares
Subject: Re: [lmi] xmlwrapp '-Wconversion' warnings
Date: Wed, 27 Mar 2019 12:33:32 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1

On 2019-03-27 10:23, Vadim Zeitlin wrote:
> On Wed, 27 Mar 2019 00:40:15 +0000 Greg Chicares <address@hidden> wrote:
[...]
> GC> /// XMLWRAPP !! xmlwrapp has no such ctor as
> GC> ///   xml::tree_parser(std::istream&)
> GC> 
> GC> Unless the idea tickles your fancy, I'll remove that comment.
> GC> The lack of such a ctor didn't prevent me from reading XML
> GC> from an istream.
[...]
> Of course, lmi probably doesn't care about either of these features, as
> XML files it deals with are (relatively) small and not compressed

Maybe some of them are compressed. We use lzma compression for MFT
files already. And, IIRC, either we distribute compressed XML files
already, or we can distribute them at any moment, knowing that they'll
be read without any code change required...again, IIRC.

>  Right now I think I'll try to add a "streaming" (i.e. reading data on
> demand) ctor from a stream object but will fall back to just the dumb
> version reading the entire stream into memory if this turns out to be too
> complicated or, on the contrary, try to add support for decompression if I
> do manage to do this quickly. Please let me know if you have any comments
> about this plan.

Sounds like the right thing to do.

> GC> /// XMLWRAPP !! It is useful to distinguish elements from DOM
> GC> /// nodes that are not elements; xmlwrapp doesn't make this
> GC> /// distinction [...]
> GC> 
> GC> That sounds potentially useful to me, but maybe that's only
> GC> because I don't understand XML and libxml2 well enough.
> 
>  It's true that xmlwrapp doesn't provide a separate xml::element type, e.g.
> as a subclass of xml::node. I'm not sure if doing this would have been a
> good idea or not, but in any case this seems to be very unlikely to happen,
> so I don't see this changing.

Okay, I'll remove that 'XMLWRAPP !!' marker and update the comment.

The reason why this seemed like a good idea to me was, IIRC, that
when I first started using xmlwrapp, I saw that Jones's examples
called some function like is_text() extremely often, and it was
cumbersome, e.g.:

  for(n : nodes) {
    // bad things happen if you ever forget to write the next line:
    if(n.is_text()) continue;
    ...

when I really wanted to do this:

  for(e : elements) {...do something...}

But, AIUI, that's just not the way libxml2 works. It might be
related to the surprising behavior described here:

  
https://ncbi.github.io/cxx-toolkit/pages/ch_xmlwrapp#ch_xmlwrapp.Formatting_of_Programmatical
| This is because the original document contained a text-node
| (the newline and space) immediately following the opening tag
| of the root element, and therefore: [...surprise...]
|
|That is how libxml2 works.
|
| While this may not be desirable in certain circumstances, there
| is no generic and reliable way to detect which text nodes are
| used for formatting, and which are meaningful content, so it’s
| not feasible to make XmlWrapp adjust ...

But there's always going to be tension between domains:
 - libxml2 works the way it does, and won't change radically
 - xmlwrapp is a fairly thin layer that presents libxml through
   a C++ interface
 - xmlwrapp users might wish for things that they imagine an
   ideally easy-to-use library could do
yet xmlwrapp was designed to make libxml2's API less clunky;
if its design goal were to make XML easy and flexible, it
would be a very different (and much slower) library.

So we just have to put up with stuff like this:

    LMI_ASSERT(1 == elements.size());
    // "*elements.begin()" because there is no front(). A 2017-01-26
    // discussion off the mailing list explained why:
    // 'The nodes returned by nodes_view::iterator are actually
    //  temporary objects as the only "real" nodes we have are
    //  xmlNodes in the tree maintained by libxml2 itself. So
    //  returning "*begin()" from front() actually results in a
    //  dangling reference'.
    *elements.begin() >> input_data_;

which is to say "you can't do that, because...well, {reasons}".

> GC> e.clear();
> GC> // XMLWRAPP !! Someday, this might be rewritten thus:
> GC> //   e.set_content(value_cast<std::string>(t).c_str());
> GC> // but for now that doesn't work with embedded ampersands.
> GC> 
> e.push_back(xml::node(xml::node::text(value_cast<std::string>(t).c_str())));
> 
>  Hmm, hasn't this been addressed by the addition of set_text_content() back
> in xmlwrapp 0.7.0? I've just applied the following patch:

Committed. Thanks.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]