
From: Václav Slavík
Subject: Re: [lmi] actuarial tables format (was Re: Terse list of valuable projects)
Date: Sat, 24 Mar 2012 15:24:31 +0100

Hi,

On 24 Mar 2012, at 14:15, Greg Chicares wrote:
> and selecting a single line in the xml doesn't tell us much:
>  <row age="45">0.002</row>  # okay, age 45--but what gender?
> unless we comment all rows extensively:
>  <row age="45">0.002</row>  <!-- female, nonsmoker -->
> or intrude all axes into the markup as attributes:
>  <row age="45" gender="female" smoking="nonsmoker">0.002</row>
> Yet aren't those poor ideas?

Yes. In practice, I have only seen one way of representing multi-dimensional 
tables in markup while preserving their structure, and that's the one I 
propose: nested elements. Your last example line would look similar to this:

<gender value="female">
  <smoking value="nonsmoker">
    ...
    <row age="45">0.002</row> <!-- or <age value="45"> if you wish -->
    ...
  </smoking>
  ...other sub-tables...
</gender>
...yet more sub-tables...

Notice how _all_ selectors behave in the same manner, explicitly specifying the 
value. So everything is nicely consistent. Having an attribute-less <row> would 
be an exception in the format, and one that would trigger another exception in 
the enclosing element: the need to specify the min/max age.
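
To illustrate why uniform selectors are easy to process, here is a minimal 
sketch in Python (not lmi code) that walks a nested table of this shape and 
flattens it into a dictionary keyed by the full selector path. The <table> 
root element, the sample rates other than 0.002, and the flatten() helper are 
assumptions made up for the example; only the <gender>, <smoking> and <row> 
names come from the snippet above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the nested layout. The <table> root and all
# rates except 0.002 are invented for illustration.
SAMPLE = """
<table>
  <gender value="female">
    <smoking value="nonsmoker">
      <row age="44">0.0018</row>
      <row age="45">0.002</row>
    </smoking>
    <smoking value="smoker">
      <row age="45">0.004</row>
    </smoking>
  </gender>
</table>
"""

def flatten(element, path=()):
    """Yield (selector_path, rate) pairs for every <row> below element."""
    for child in element:
        if child.tag == "row":
            # Leaf: the age attribute is the last selector on the path.
            key = path + (("age", int(child.get("age"))),)
            yield key, float(child.text)
        else:
            # Any other element is a selector carrying a 'value' attribute.
            yield from flatten(child, path + ((child.tag, child.get("value")),))

rates = dict(flatten(ET.fromstring(SAMPLE)))
```

Because every level names its axis and value explicitly, the same recursive 
step handles all of them, and a lookup such as 
rates[(("gender", "female"), ("smoking", "nonsmoker"), ("age", 45))] needs no 
knowledge of where a row sat in the file.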

> Designing the xml structure so that it serves such a secondary purpose, in
> a particular restrictive case only, is arguably worse than the simplicity of
>  <row>0.002</row>
> because it makes only some tables "useful" in that way.

But aren't aggregation, select and select-and-ultimate tables the three kinds 
of tables that actuaries are used to working with? Isn't that what a human 
editor would work with to enter the data? Wouldn't it be nicer for them? I 
think it's worth having that in mind, as long as it doesn't hurt our ability to 
process the files programmatically (and none of this does as far as I can tell).

In any case, this isn't a major issue; either approach works well. I think that 
Vadim's version with an explicit 'age' attribute is better, because it's more 
consistent, readable and editable, but I don't see any fundamental problems 
with either of them.

> Anyone who really wants that will then have good reason to prefer it in all
> cases, and won't bother trying to read the xml directly.

I was under the impression that, because we won't have a GUI editor at first, 
these files would be maintained manually and so making them human-accessible 
is useful. Was I wrong? Do you intend to switch to the new format only when we 
have a full-fledged GUI editor for it too?

>> (2) We shouldn't use this XML format for in-memory representation of the
>> tables, that would be wasteful in both time and space. Instead, we should
>> parse the file once and store the data in another data structure that is
>> optimized for lookup. Consequently, the ease of table lookup doesn't matter
>> for the XML file format. So if this would be useful enough to have, it's
>> really no trouble to implement it.
> 
> Now I'm tending to think that an axis-value attribute like this:
>  <row age="45">0.002</row>
>       ^^^^^^^^
> isn't worthwhile. But we certainly do want an optimized data structure for
> actual use. It's premature to design that before measuring the performance.

Just to be clear, performance is not the only reason for avoiding direct 
access to the XML tree during lookup (simplicity of implementation is even more 
important to me). And even if we did access it directly, the presence or 
absence of the 'age' attribute wouldn't make any real difference.
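
As a rough illustration of the parse-once idea, the Python sketch below loads 
a one-dimensional table into a dense list indexed by age and serves lookups 
from that list, never from the XML tree. It accepts rows with or without the 
'age' attribute and ends up with the same structure either way. Everything 
here is hypothetical: the AgeTable class, the load() function, the <table> 
root, the 'min-age' attribute name and the 0.0018 rate are assumptions, not 
part of any agreed format.

```python
import xml.etree.ElementTree as ET

class AgeTable:
    """Rates stored densely; the rate for an age is at index age - min_age."""

    def __init__(self, min_age, rates):
        self.min_age = min_age
        self.rates = rates

    def lookup(self, age):
        index = age - self.min_age
        if not 0 <= index < len(self.rates):
            raise KeyError(age)
        return self.rates[index]

def load(xml_text):
    """Parse the XML once and return an AgeTable built from its rows."""
    root = ET.fromstring(xml_text)
    rows = root.findall("row")
    if rows and rows[0].get("age") is not None:
        # Explicit form: every row names its age; require contiguous ages.
        ages = [int(row.get("age")) for row in rows]
        min_age = ages[0]
        if ages != list(range(min_age, min_age + len(ages))):
            raise ValueError("ages must be contiguous and ascending")
    else:
        # Attribute-less form: the enclosing element must supply the
        # starting age (hypothetical 'min-age' attribute).
        min_age = int(root.get("min-age"))
    return AgeTable(min_age, [float(row.text) for row in rows])

EXPLICIT = '<table><row age="44">0.0018</row><row age="45">0.002</row></table>'
IMPLICIT = '<table min-age="44"><row>0.0018</row><row>0.002</row></table>'
```

With this layout, load(EXPLICIT).lookup(45) and load(IMPLICIT).lookup(45) 
read the same list slot, so the choice between the two row styles only 
affects the loader, not the lookup path.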

Regards,
Vaclav



