From:	Greg Chicares
Subject:	Re: [lmi] actuarial tables format (was Re: Terse list of valuable projects)
Date:	Sun, 22 Apr 2012 13:30:51 +0000
User-agent:	Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2

On 2012-03-24 14:24Z, Václav Slavík wrote:
> On 24 Mar 2012, at 14:15, Greg Chicares wrote:
>> and selecting a single line in the xml doesn't tell us much:
>>  <row age="45">0.002</row>  # okay, age 45--but what gender?
>> unless we comment all rows extensively:
>>  <row age="45">0.002</row>  <!-- female, nonsmoker -->
>> or intrude all axes into the markup as attributes:
>>  <row age="45" gender="female" smoking="nonsmoker">0.002</row>
>> Yet aren't those poor ideas?
> 
> Yes. In practice, I have only seen one way of representing multi-dimensional
> tables in a markup while preserving their structure, and that's the one I
> propose: nested elements.

Okay, then let's do it that way.

> Your last example line would look similar to this:
> 
> <gender value="female">
>   <smoking value="nonsmoker">
>     ...
>     <row age="45">0.002</row> <!-- or <age value="45"> if you wish -->
>     ...
>   </smoking>
>   ...other sub-tables...
> </gender>
> ...yet more sub-tables...
> 
> Notice how _all_ selectors behave in the same manner, explicitly specifying
> the value. So everything is nicely consistent. Having attribute-less <row>
> would be an exception in the format and one that would trigger another
> exception -- the need to specify min/max age -- in the enclosing element.

Yes, consistency is good.
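
To make that concrete, here is a minimal Python sketch (standard
library only) that walks the nested form and recovers every axis value
for each row. The <table> wrapper element, the age-46 row, and the
function name are assumptions made up for illustration; nothing here is
an agreed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <table> root and the age-46 value are invented.
SAMPLE = """
<table>
  <gender value="female">
    <smoking value="nonsmoker">
      <row age="45">0.002</row>
      <row age="46">0.003</row>
    </smoking>
  </gender>
</table>
"""

def rows_with_context(xml_text):
    """Yield (gender, smoking, age, rate) for every row in the nested markup."""
    root = ET.fromstring(xml_text)
    for gender in root.findall("gender"):
        for smoking in gender.findall("smoking"):
            for row in smoking.findall("row"):
                # Every selector carries its value the same way, so the
                # full context of a row is just the chain of its ancestors.
                yield (gender.get("value"), smoking.get("value"),
                       int(row.get("age")), float(row.text))

for record in rows_with_context(SAMPLE):
    print(record)
```

Because each level states its value explicitly, no header giving the
minimum or maximum age is needed to interpret a row.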

The SOA formats do happen to specify min and max age in a header,
but we don't have to adopt that idea.

>> Designing the xml structure so that it serves such a secondary purpose, in
>> a particular restrictive case only, is arguably worse than the simplicity of
>>  <row>0.002</row>
>> because it makes only some tables "useful" in that way.

Just to clarify the context, my concern was that explicit "age=N"
attributes would make an aggregate table immediately "make sense"
to a human reader, without any need to look elsewhere for any
other context; but that it wouldn't do so for select tables...so
would that be perceived as a shortcoming? ("Couldn't you have
made select tables just as easy to read?")

Rethinking that now, I don't see any problem there. Select and
select-and-ultimate tables are inherently harder for humans to
read, because they aren't one-dimensional. A two-dimensional
table cannot be as easy to read as a one-dimensional table.
This is inherent in the dimensionality; it's not an artifact
of the way we represent dimensions.
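
A small Python sketch of the difference, assuming the usual convention
that select rates are indexed by issue age and duration while ultimate
rates are indexed by attained age. All numbers, the three-year select
period, and the names are invented for illustration.

```python
# All values below are invented; they are not real mortality rates.
AGGREGATE = {45: 0.0020, 46: 0.0022}  # attained age -> rate

SELECT_PERIOD = 3  # hypothetical select period, in years
SELECT = {  # issue age -> rates for durations 0, 1, 2
    45: [0.0010, 0.0014, 0.0018],
    46: [0.0011, 0.0015, 0.0020],
}
ULTIMATE = {48: 0.0025, 49: 0.0028, 50: 0.0031}  # attained age -> rate

def aggregate_rate(attained_age):
    """One axis: the age alone identifies the rate."""
    return AGGREGATE[attained_age]

def select_ultimate_rate(issue_age, duration):
    """Two axes: use the select row while within the select period,
    then fall through to the ultimate column by attained age."""
    if duration < SELECT_PERIOD:
        return SELECT[issue_age][duration]
    return ULTIMATE[issue_age + duration]

print(aggregate_rate(45))
print(select_ultimate_rate(45, 1))
print(select_ultimate_rate(45, 3))
```

The second lookup needs two arguments however the data is written
down, which is the sense in which the extra difficulty comes from the
dimensionality rather than from the markup.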

> But aren't aggregate, select and select-and-ultimate tables
> the three kinds of tables that actuaries are used to working with?
> Isn't that what a human editor would work with to enter the data?
> Wouldn't it be nicer for them?

Yes to all. It would be wrong to make one-dimensional tables harder
to work with merely in order to make all tables equally difficult.

> In any case, this isn't a major issue; either approach works well.
> I think that Vadim's version with an explicit 'age' attribute is better,
> because it's more consistent, readable and editable.

Agreed.

>> Anyone who really wants that will then have good reason to prefer it in all
>> cases, and won't bother trying to read the xml directly.
> 
> I was under the impression that -- because we won't have a GUI editor at
> first -- these files would be maintained manually and so making them human-
> accessible is useful.

Agreed.

> Was I wrong, do you intend to switch to the new format only when we have a
> full-fledged GUI editor for it too?

You are correct. The GUI editor is not a day-one requirement. The present
format is almost too ghastly to work with--it requires unmaintainable,
ancient tools that never were very good.

I suppose that prior to day one we will need a tool to convert from the
old binary format to the new xml format (in order to migrate several
megabytes of existing data). Anyone who wishes to use the ancient tools
to add a new table can do so, and then use the same conversion tool to
produce an xml representation. No current capability would be lost.

>>> (2) We shouldn't use this XML format for in-memory representation of
>>> the tables, that would be wasteful in both time and space. Instead, we
>>> should parse the file once and store the data in another data structure
>>> that is optimized for lookup. Consequently, the ease of table lookup
>>> doesn't matter for the XML file format. So if this would be useful
>>> enough to have, it's really no trouble to implement it.
>> 
>> Now I'm tending to think that an axis-value attribute like this:
>>  <row age="45">0.002</row>
>>       ^^^^^^^^
>> isn't worthwhile. But we certainly do want an optimized data structure for
>> actual use. It's premature to design that before measuring the performance.
> 
> Just to be clear, performance is not the only reason for avoiding
> directly accessing XML tree during lookup (simplicity of implementation
> is even more important to me). And even if we did access it directly,
> the presence or absence of the 'age' attribute wouldn't make any real
> difference.

All right.
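
For illustration only, a parse-once-then-look-up arrangement might be
sketched in Python as below. The <table> wrapper, the age-46 row, the
RateTable class, and the choice of one contiguous list per sub-table
are all assumptions; the real data structure should be designed after
measuring, as noted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <table> root and the age-46 value are invented.
SAMPLE = """
<table>
  <gender value="female">
    <smoking value="nonsmoker">
      <row age="45">0.002</row>
      <row age="46">0.003</row>
    </smoking>
  </gender>
</table>
"""

class RateTable:
    """Parse the XML once; afterwards lookups never touch the XML tree."""

    def __init__(self, xml_text):
        self._subtables = {}
        root = ET.fromstring(xml_text)
        for gender in root.findall("gender"):
            for smoking in gender.findall("smoking"):
                by_age = {int(r.get("age")): float(r.text)
                          for r in smoking.findall("row")}
                # Min and max age are derived from the rows themselves.
                min_age, max_age = min(by_age), max(by_age)
                # Assumes ages are contiguous; a gap raises KeyError here.
                rates = [by_age[a] for a in range(min_age, max_age + 1)]
                key = (gender.get("value"), smoking.get("value"))
                self._subtables[key] = (min_age, rates)

    def rate(self, gender, smoking, age):
        min_age, rates = self._subtables[(gender, smoking)]
        index = age - min_age
        if not 0 <= index < len(rates):
            raise KeyError(f"age {age} is outside the table")
        return rates[index]

table = RateTable(SAMPLE)
print(table.rate("female", "nonsmoker", 45))
```

Whether <row> carries an explicit age attribute affects only the
constructor; the lookup path is the same either way.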
