Re: [lmi] reusing mc_enum<> for serialization into XML

lmi

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] reusing mc_enum<> for serialization into XML

From:	Greg Chicares
Subject:	Re: [lmi] reusing mc_enum<> for serialization into XML
Date:	Sat, 16 Jan 2010 17:49:17 +0000
User-agent:	Thunderbird 2.0.0.23 (Windows/20090812)

On 2010-01-14 20:57Z, Vaclav Slavik wrote:
> 
> I'm finally working on the switch to XML formats and I've hit a problem
> with enums handling that I hope you may have some insight into, being
> most familiar with mc_enum<> around here.

Jumping ahead, I'm pretty sure the best answer is ultimately to:
> (2) Change all normal enums into mc_enums everywhere.
though that can be deferred if it seems best.

It seems regrettable whenever I departed from that simple solution,
e.g. in 'mc_enum_types_aux.?pp' where we find:

  std::string mc_str(mcenum_dbopt z)
  {
      return mce_dbopt(z).str();
  }

and two other functions that are nearly identical but use different
types. As a comment in the header suggests, a template would be
better than this copy-and-paste approach, and a more-general solution
would be even better.

But it's okay to write such workarounds now, and then later refactor,
moving any new requirements into the main mc_* files when convenient.
The mc_str() thing above enabled me to delete a large amount of truly
awful legacy code, so I didn't mind introducing a small amount of
temporary ugliness (even though I haven't cleaned it up yet). And
another large amount of awful code can be removed when product files
('.db4' in particular) use xml: e.g., 'ihs*pios*', not to mention
  some_istream >> item000;
  some_istream >> item001;
  ...
  some_istream >> itemNNN;
and
  some_ostream << item000; // ...

> I started with converting the simpler file formats (.rnd and .tir) and
> hit a problem with the rounding_style enum. I don't want to store it as
> a number in XML file, because it would render the use of human-readable
> and editable XML format somewhat pointless.

Yes. And it would also be brittle: it's nice to be able to reorder
enumerators.

> My current output format looks like this:
> 
>     <withdrawal>
>       <decimals>2</decimals>
>       <style>to-nearest</style>
>     </withdrawal>
> 
> This requires per-enum serialization code along the lines of
> 
>     template<>
>     const enum_type_io_map<rounding_style>::MapEntry
>     enum_type_io_map<rounding_style>::map[] =
>         { {r_indeterminate, "indeterminate"}
>         , {r_toward_zero,   "toward-zero"}
>         , {r_to_nearest,    "to-nearest"}
>         , {r_upward,        "upward"}
>         , {r_downward,      "downward"}
>         , {r_current,       "current"}
>         , {r_not_at_all,    "not-at-all"}
>         };
> 
> The thing is, this does something that mc_enum<> already does, albeit in
> -- IMHO -- more verbose way.

Yes, it's more verbose, but it does other things, too.

> My first idea was to add mc_enum<rounding_style> and use it only for XML
> serialization, while still using rounding_style type in the code as
> before. This isn't possible without large changes to mc_enum<>, though,
> because mc_enum has many more template parameters.
> 
> So I added some boilerplate code to convert between an (otherwise
> unused) mce_rounding_style mc_enum-type and concert between it and
> rounding_style when serializing. This is IMHO too much of a hack, it
> requires too much ugly code (see e.g.
> http://review.bakefile.org/r/146/diff/1-2/#index_header -- I'm not sure
> it will be fully understandable without the rest of XML changes,
> though).

It is a hack, but that's probably the way I would approach it at
first--and refactor later when the xml code is stable. But the order
in which we do those things doesn't matter much once we've figured
out what we need mc_enum* to do (that it doesn't do already); let's
try to specify that completely.

AIUI you want to read this:
>       <style>to-nearest</style>
and transform it:
>         , {r_to_nearest,    "to-nearest"}
to the enumerator 'r_to_nearest'. I.e., given a string name, get its
corresponding enumerator. I think you can do that using class mc_enum:
    explicit mc_enum(std::string const&);
    T value() const;
e.g.
    rounding_style r = mc_enum<whatever>("to-nearest").value();
where in practice you'd typedef the 'mc_enum<whatever>' part.

IOW, I think you can already do that without adding new capabilities
to mc_enum<>, yet above you say:
>  This isn't possible without large changes to mc_enum<>, though,
so maybe I'm missing something. Or we're just speaking in different
contexts, and I'm imagining a world where mc_enum<rounding_type>
already exists and has replaced plain enums throughout lmi?

> The question is, how to best handle reading and writing of enums? That
> is, non-mc_enum<> ones, mc_enum<> types are trivial.
> 
> Following are the options that I think are available to us:
> 
> (0a) Have string<->enum translation for non-mc_enum independently of
>      mc_enum, as above. This duplicates functionality, that's bad. OTOH,
>      it's very simple, and it means that we can use symbolic names
>      ("to-nearest") when wring to XML, instead of mc_enum's
>      human-readable descriptions.

That could be an expedient first step, and we can refactor later,
because ultimately I think we should use one enumerative facility
for all enumerative data.

> (0b) Like (0a), but use this simpler mechanism in mc_enum. I.e. split
>      mc_enum into a class that does pure string<->enum conversion and
>      another class with all the extras (valid values, datum_base,
>      virtual methods).

Today we have:
                class mc_enum_base : public datum_base
  template<...> class mc_enum      : public mc_enum_base
Perhaps some other class hierarchy would be better, but first I'd
ask whether we can meet xml requirements without restructuring
these classes too radically.

> (1) Do it like I did, keep using non-mc_enum enums and convert them to 
>     mc_enum only when reading and writing XML. It's a hack, IMHO hard to
>     follow, with some boilerplate code needed per-enum. But it avoids
>     duplication.

And it can be refactored later.

> (2) Change all normal enums into mc_enums everywhere. It's clean
>     solution that avoids duplication. But it's more work and it adds
>     some overhead, because mc_enum is rather heavy class compared to 
>     C++ enums. I don't know if this overhead is significant, you
>     certainly have much clearer grasp of this than me.

It's clean...and uniform.

It's more work...but the extra work can be deferred by using (1) now.

I can't say I've measured the overhead objectively, though I've never
perceived it as a bottleneck either. I suppose you're asking about
overhead in terms of speed rather than, say, object size. In essence,
the present code looks up names this way:
  char const* c[n];
  some_enum   e[n];
  e[std::find(c, c + n, whatever) - c];
which seems like a slow way to look up "US" in a country-code list.
But I'm not sure it's slow enough to be perceptible for the enumerative
types used in the product files; if it is, we can optimize later.

[Prev in Thread]

Current Thread

[Next in Thread]

[lmi] reusing mc_enum<> for serialization into XML, Vaclav Slavik, 2010/01/14
- Re: [lmi] reusing mc_enum<> for serialization into XML, Greg Chicares <=
  - Re: [lmi] reusing mc_enum<> for serialization into XML, Vaclav Slavik, 2010/01/16
    - Re: [lmi] reusing mc_enum<> for serialization into XML, Greg Chicares, 2010/01/17

Prev by Date: Re[3]: [lmi] Wrong wxFlexGridSizer usage in about_dialog.cpp
Next by Date: Re: [lmi] reusing mc_enum<> for serialization into XML
Previous by thread: [lmi] reusing mc_enum<> for serialization into XML
Next by thread: Re: [lmi] reusing mc_enum<> for serialization into XML
Index(es):
- Date
- Thread