[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [lmi] change file formats to XML
From: |
Greg Chicares |
Subject: |
Re: [lmi] change file formats to XML |
Date: |
Tue, 16 Mar 2010 04:07:49 +0000 |
User-agent: |
Thunderbird 2.0.0.23 (Windows/20090812) |
On 2010-02-26 14:56Z, Vaclav Slavik wrote:
>
> I uploaded the XML file formats patch to Savannah:
>
> https://savannah.nongnu.org/patch/index.php?7101
I committed the heart of this on 20100316T0157Z in revision 4783.
The remainder is fairly straightforward, but I plan to commit it
in discrete steps, one product-file extension at a time, because:
- I want to make changes to the ancient 'ihs_*' files, and
- we have to migrate a lot of (proprietary) commentary
> Some comments about the patch:
>
> (0) The API for XML formats I/O is in xml_serialize.hpp.
Which is in svn now.
> By default,
> stream operator<< and operator>> are used for storing value types, but
> it can be -- and is -- customized by specializing
> xml_serialize::type_io<> template.
The basic architecture you laid out can't be improved AFAICT.
Some of the specializations are no longer required because value_cast<>()
is used instead of just stream inserters and extractors:
- value_cast<>() already has an optimization for "converting" from
std::string to std::string. (That's just an incidental benefit;
the main reason for using value_cast is numerics.)
- For mc_enum types, I thought it better to use the (heavy) mc_enum
instead of the underlying (light) C enum; that doesn't add much
overhead in 'ihs_rnddata.cpp', and it writes (clear) names in
the xml instead of (obscure) integers, e.g.:
<specamt>
<decimals>0</decimals>
<style>Upward</style>
</specamt>
while simplifying 'xml_serialize.hpp'.
- Somehow I had difficulty understanding the use of the convenience
wrappers within the 'xml_serialize.hpp' implementation itself.
That's just a personal problem I had following recursion through
template specializations; but I wrote a few things out inline to
make it easier for me to understand.
- As for containers [I'm sure the specialization given works for
sequences, but I'm not sure about other containers]...there was
an asymmetry that troubled me: from_xml() called clear() on the
container, but to_xml() didn't do anything similar. I know you
had to do this in order to keep your patch simple by avoiding
major changes to other code, but in this case I see I really do
need to change that other code: as a temporary kludge, I added
axis_lengths.clear();
right after the using-directive in TDBValue::read(), and then
removed the clear() call in from_xml(), and also added code to
the generic container specialization to assert that we start
with an empty container or an empty set of <item> elements.
This was more satisfying (and also easier) than documenting
the reason for the lack of parallelism (namely, a questionable
original design in 'ihs_dbvalue.cpp').
> I'll focus on the performance later if needed.
Well, if you notice anything I've done that seems pessimal,
please point it out; otherwise, we can work on performance later.
> For now, I'd like to know
> what you think about this approach globally, and the serialization API
> in particular.
The way you did it is better than all the alternatives I explored,
and I tried hard to find a simpler way.
> (5) Versioning:
>
> I opted for implicit versioning. As long as there's just one version of
> the format, no versioning code is needed. If the format changed, then
> the respective type_io<T>::from_xml() would be updated to deal with it.
> For example:
>
> - When a new field is added and has a sensible default, load with with
> get_property(node, "foo", foo, default_foo_value). Nothing else is
> needed to read both versions. In the other direction, unrecognized
> properties are simply ignored.
>
> - If a field semantics changes, rename it in the format. Then, read
> (and correctly interpret) the old field only if the new field isn't
> found.
>
> - If needed, we could add explicit versions on serialized keys, e.g.
>
> <coi_rate version="2">
> <decimals>8</decimals>
> <style>Downward</style>
> </coi_rate>
>
> (We'd have to add code for reading the attribute later.)
I'm trying to rewrite the old product-file 'ihs_*.?pp' stuff along
the same lines as classes Input, Ledger, and mec_input--which already
deal with the two problems you point out:
> - If a field semantics changes
Simple example: suppose gender was {M|F|U} and we want to change it
to {Male|Female|Unisex}. That's what Input::RedintegrateExAnte()
is for: the xml element's text contents have to be translated to the
current style before they're read into variables.
> - When a new field is added and has a sensible default
Simple example: we used to store a person's name in {first, middle, last}
pieces, but then decided to keep the whole name in one string; so, when
reading an old file, we fetch the pieces and combine them. That's what
Input::RedintegrateExPost() is for: here, all elements' text contents
remain valid, but the elements themselves have changed.
It's a considerable effort, but those old source files really stand
in need of a thorough rewrite anyway, and I think it'll serve us well
to use one approach everywhere for backward compatibility.
- Re: [lmi] change file formats to XML, (continued)
Re: [lmi] change file formats to XML, Greg Chicares, 2010/03/07
Re: [lmi] change file formats to XML, Greg Chicares, 2010/03/08
Re: [lmi] change file formats to XML,
Greg Chicares <=