[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [lmi] reusing mc_enum<> for serialization into XML
From: |
Greg Chicares |
Subject: |
Re: [lmi] reusing mc_enum<> for serialization into XML |
Date: |
Sat, 16 Jan 2010 17:49:17 +0000 |
User-agent: |
Thunderbird 2.0.0.23 (Windows/20090812) |
On 2010-01-14 20:57Z, Vaclav Slavik wrote:
>
> I'm finally working on the switch to XML formats and I've hit a problem
> with enums handling that I hope you may have some insight into, being
> most familiar with mc_enum<> around here.
Jumping ahead, I'm pretty sure the best answer is ultimately to:
> (2) Change all normal enums into mc_enums everywhere.
though that can be deferred if it seems best.
It seems regrettable whenever I departed from that simple solution,
e.g. in 'mc_enum_types_aux.?pp' where we find:
std::string mc_str(mcenum_dbopt z)
{
return mce_dbopt(z).str();
}
and two other functions that are nearly identical but use different
types. As a comment in the header suggests, a template would be
better than this copy-and-paste approach, and a more-general solution
would be even better.
But it's okay to write such workarounds now, and then later refactor,
moving any new requirements into the main mc_* files when convenient.
The mc_str() thing above enabled me to delete a large amount of truly
awful legacy code, so I didn't mind introducing a small amount of
temporary ugliness (even though I haven't cleaned it up yet). And
another large amount of awful code can be removed when product files
('.db4' in particular) use xml: e.g., 'ihs*pios*', not to mention
some_istream >> item000;
some_istream >> item001;
...
some_istream >> itemNNN;
and
some_ostream << item000; // ...
> I started with converting the simpler file formats (.rnd and .tir) and
> hit a problem with the rounding_style enum. I don't want to store it as
> a number in XML file, because it would render the use of human-readable
> and editable XML format somewhat pointless.
Yes. And it would also be brittle: it's nice to be able to reorder
enumerators.
> My current output format looks like this:
>
> <withdrawal>
> <decimals>2</decimals>
> <style>to-nearest</style>
> </withdrawal>
>
> This requires per-enum serialization code along the lines of
>
> template<>
> const enum_type_io_map<rounding_style>::MapEntry
> enum_type_io_map<rounding_style>::map[] =
> { {r_indeterminate, "indeterminate"}
> , {r_toward_zero, "toward-zero"}
> , {r_to_nearest, "to-nearest"}
> , {r_upward, "upward"}
> , {r_downward, "downward"}
> , {r_current, "current"}
> , {r_not_at_all, "not-at-all"}
> };
>
> The thing is, this does something that mc_enum<> already does, albeit in
> -- IMHO -- more verbose way.
Yes, it's more verbose, but it does other things, too.
> My first idea was to add mc_enum<rounding_style> and use it only for XML
> serialization, while still using rounding_style type in the code as
> before. This isn't possible without large changes to mc_enum<>, though,
> because mc_enum has many more template parameters.
>
> So I added some boilerplate code to convert between an (otherwise
> unused) mce_rounding_style mc_enum-type and concert between it and
> rounding_style when serializing. This is IMHO too much of a hack, it
> requires too much ugly code (see e.g.
> http://review.bakefile.org/r/146/diff/1-2/#index_header -- I'm not sure
> it will be fully understandable without the rest of XML changes,
> though).
It is a hack, but that's probably the way I would approach it at
first--and refactor later when the xml code is stable. But the order
in which we do those things doesn't matter much once we've figured
out what we need mc_enum* to do (that it doesn't do already); let's
try to specify that completely.
AIUI you want to read this:
> <style>to-nearest</style>
and transform it:
> , {r_to_nearest, "to-nearest"}
to the enumerator 'r_to_nearest'. I.e., given a string name, get its
corresponding enumerator. I think you can do that using class mc_enum:
explicit mc_enum(std::string const&);
T value() const;
e.g.
rounding_style r = mc_enum<whatever>("to-nearest").value();
where in practice you'd typedef the 'mc_enum<whatever>' part.
IOW, I think you can already do that without adding new capabilities
to mc_enum<>, yet above you say:
> This isn't possible without large changes to mc_enum<>, though,
so maybe I'm missing something. Or we're just speaking in different
contexts, and I'm imagining a world where mc_enum<rounding_type>
already exists and has replaced plain enums throughout lmi?
> The question is, how to best handle reading and writing of enums? That
> is, non-mc_enum<> ones, mc_enum<> types are trivial.
>
> Following are the options that I think are available to us:
>
> (0a) Have string<->enum translation for non-mc_enum independently of
> mc_enum, as above. This duplicates functionality, that's bad. OTOH,
> it's very simple, and it means that we can use symbolic names
> ("to-nearest") when wring to XML, instead of mc_enum's
> human-readable descriptions.
That could be an expedient first step, and we can refactor later,
because ultimately I think we should use one enumerative facility
for all enumerative data.
> (0b) Like (0a), but use this simpler mechanism in mc_enum. I.e. split
> mc_enum into a class that does pure string<->enum conversion and
> another class with all the extras (valid values, datum_base,
> virtual methods).
Today we have:
class mc_enum_base : public datum_base
template<...> class mc_enum : public mc_enum_base
Perhaps some other class hierarchy would be better, but first I'd
ask whether we can meet xml requirements without restructuring
these classes too radically.
> (1) Do it like I did, keep using non-mc_enum enums and convert them to
> mc_enum only when reading and writing XML. It's a hack, IMHO hard to
> follow, with some boilerplate code needed per-enum. But it avoids
> duplication.
And it can be refactored later.
> (2) Change all normal enums into mc_enums everywhere. It's clean
> solution that avoids duplication. But it's more work and it adds
> some overhead, because mc_enum is rather heavy class compared to
> C++ enums. I don't know if this overhead is significant, you
> certainly have much clearer grasp of this than me.
It's clean...and uniform.
It's more work...but the extra work can be deferred by using (1) now.
I can't say I've measured the overhead objectively, though I've never
perceived it as a bottleneck either. I suppose you're asking about
overhead in terms of speed rather than, say, object size. In essence,
the present code looks up names this way:
char const* c[n];
some_enum e[n];
e[std::find(c, c + n, whatever) - c];
which seems like a slow way to look up "US" in a country-code list.
But I'm not sure it's slow enough to be perceptible for the enumerative
types used in the product files; if it is, we can optimize later.