
From: Greg Chicares
Subject: Re: [lmi] actuarial tables format (was Re: Terse list of valuable projects)
Date: Sat, 24 Mar 2012 13:15:45 +0000
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2

On 2012-03-24 09:47Z, Václav Slavík wrote:
> 
> On 24 Mar 2012, at 10:32, Greg Chicares wrote:
>> Occasionally we have a table that's amenable to run-length encoding:
>>  ages 0-4  value 0.01
>>  ages 5-9  value 0.02
>>  ...
>> but I prefer to force each consecutive value to be specified, trading
>> space for uniformity, simplicity, and robustness in the table-lookup
>> code.
> 
> Let me offer two observations about this:
> 
> (1) With the <row age="n"> syntax, supporting this case is trivial as far as
> the format is concerned: <row min-age="5" max-age="9">.
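To make the quoted suggestion concrete, here is a minimal sketch of how a
reader could expand range rows into one value per age at parse time, so that
the lookup code still sees a uniform list. The <table> wrapper, the attribute
names, and the function expand_rows are all assumptions made for illustration;
none of this is an agreed-upon lmi format.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <table> wrapper and the attribute names are
# assumptions, and the value for age 10 is a placeholder.
XML = """
<table>
  <row min-age="0" max-age="4">0.01</row>
  <row min-age="5" max-age="9">0.02</row>
  <row age="10">0.03</row>
</table>
"""

def expand_rows(xml_text):
    """Return (first_age, values) with one value per consecutive age."""
    values = {}
    for row in ET.fromstring(xml_text).iter("row"):
        if "age" in row.attrib:
            lo = hi = int(row.attrib["age"])
        else:
            lo = int(row.attrib["min-age"])
            hi = int(row.attrib["max-age"])
        for age in range(lo, hi + 1):
            if age in values:
                raise ValueError(f"age {age} specified twice")
            values[age] = float(row.text)
    ages = sorted(values)
    # Insist on consecutive ages so that lookup stays a plain index.
    if ages != list(range(ages[0], ages[-1] + 1)):
        raise ValueError("ages are not consecutive")
    return ages[0], [values[a] for a in ages]

min_age, rates = expand_rows(XML)
```

With this approach rates[age - min_age] is the only lookup needed, whether or
not the file used ranges.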

<row age="n"> is a seductive idea, because it lets a human consult the xml
file directly and extract a useful bit of data such as
  <row age="45">0.002</row>
at least for a table that has only one axis (representing age).

But in the general case there are several axes--e.g., gender, smoking, and
age--and selecting a single line in the xml doesn't tell us much:
  <row age="45">0.002</row>  # okay, age 45--but what gender?
unless we comment all rows extensively:
  <row age="45">0.002</row>  <!-- female, nonsmoker -->
or intrude all axes into the markup as attributes:
  <row age="45" gender="female" smoking="nonsmoker">0.002</row>
Yet aren't those poor ideas?

Designing the xml structure to serve such a secondary purpose, which works
only in one restrictive case, is arguably worse than the simplicity of
  <row>0.002</row>
because it makes only some tables "useful" in that way. Someone who has grown
accustomed to reading single-axis tables directly will see it as a shortcoming
when the same technique "doesn't work" for multiple-axis tables. It seems
better to treat that as a distinct use case, served by a facility to print
(or display in a GUI) one- and two-dimensional slices, e.g.:

  Table-name: "2011 mortality experience"
  Axes selected: female, nonsmoker
  Axes displayed: age, select-duration
  ... [other table-level information]
  Age Dur-> 0       1
    0  0.0123  0.0591 ...
    1  0.0047  0.0385 ...
  ...     ...     ...

Anyone who really wants that will then have good reason to prefer it in all
cases, and won't bother trying to read the xml directly.
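A rough sketch of such a printing facility follows, only to show how little
it needs. The dictionary layout keyed by (gender, smoking, age, duration) and
the function format_slice are hypothetical. The female/nonsmoker values are
the ones in the sample display above; the male/smoker values are placeholders,
not real rates.

```python
# Hypothetical in-memory table keyed by (gender, smoking, age, duration).
TABLE = {
    ("female", "nonsmoker", 0, 0): 0.0123,
    ("female", "nonsmoker", 0, 1): 0.0591,
    ("female", "nonsmoker", 1, 0): 0.0047,
    ("female", "nonsmoker", 1, 1): 0.0385,
    # Placeholder values, present only so that selection has work to do.
    ("male", "smoker", 0, 0): 0.0,
    ("male", "smoker", 0, 1): 0.0,
    ("male", "smoker", 1, 0): 0.0,
    ("male", "smoker", 1, 1): 0.0,
}

def format_slice(table, gender, smoking):
    """Format the age-by-duration slice for one gender/smoking combination."""
    cells = {
        (age, dur): value
        for (g, s, age, dur), value in table.items()
        if (g, s) == (gender, smoking)
    }
    ages = sorted({age for age, _ in cells})
    durs = sorted({dur for _, dur in cells})
    lines = [
        f"Axes selected: {gender}, {smoking}",
        "Axes displayed: age, select-duration",
        "Age Dur->" + "".join(f"{dur:>8}" for dur in durs),
    ]
    for age in ages:
        row = "".join(f"{cells[age, dur]:>8.4f}" for dur in durs)
        lines.append(f"{age:>9}" + row)
    return "\n".join(lines)

report = format_slice(TABLE, "female", "nonsmoker")
print(report)
```

Nothing here depends on how the xml rows are marked up, which is the point:
the display is generated from the parsed table, not read off the file.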

[Here, I suspect I'm arguing against something I said days ago, but if that
must occur along the path to the best answer, so be it--ultimately we want to
arrive at the best answer, in the design phase when changes are easier.]

> (2) We shouldn't use this XML format for in-memory representation of the
> tables, that would be wasteful in both time and space. Instead, we should
> parse the file once and store the data in another data structure that is
> optimized for lookup. Consequently, the ease of table lookup doesn't matter
> for the XML file format. So if this would be useful enough to have, it's
> really no trouble to implement it.

Now I'm tending to think that an axis-value attribute like this:
  <row age="45">0.002</row>
       ^^^^^^^^
isn't worthwhile. But we certainly do want an optimized data structure for
actual use. It's premature to design that before measuring the performance.
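Purely as a baseline to measure against, and not as a proposed design, here
is a sketch of "parse once, then look up": bare <row> values are read into a
flat list, and a lookup computes a row-major offset from the axis values. The
<axis> element, its attributes, and the Table class are assumptions. Only the
first value (female, nonsmoker, age 45) comes from the example above; the
rest are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: axes are declared once at table level, and bare <row>
# values follow in row-major order (the last axis varies fastest).
XML = """
<table>
  <axis name="gender" values="female male"/>
  <axis name="smoking" values="nonsmoker smoker"/>
  <axis name="age" values="45 46"/>
  <row>0.002</row><row>0.003</row>
  <row>0.004</row><row>0.005</row>
  <row>0.006</row><row>0.007</row>
  <row>0.008</row><row>0.009</row>
</table>
"""

class Table:
    def __init__(self, xml_text):
        root = ET.fromstring(xml_text)
        # For each axis, map every axis value to its position.
        self.axes = [
            (a.get("name"), {v: i for i, v in enumerate(a.get("values").split())})
            for a in root.iter("axis")
        ]
        self.data = [float(r.text) for r in root.iter("row")]
        expected = 1
        for _, positions in self.axes:
            expected *= len(positions)
        if len(self.data) != expected:
            raise ValueError(f"expected {expected} rows, found {len(self.data)}")

    def lookup(self, **key):
        """Return the value at the given axis values, e.g. age=45."""
        offset = 0
        for name, positions in self.axes:
            offset = offset * len(positions) + positions[str(key[name])]
        return self.data[offset]

table = Table(XML)
rate = table.lookup(gender="male", smoking="nonsmoker", age=46)
```

Whether a flat list like this is fast enough, or whether something more
elaborate is warranted, is exactly what would have to be measured first.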


