lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] rate_table_tool: merge a whole directory


From: Vadim Zeitlin
Subject: Re: [lmi] rate_table_tool: merge a whole directory
Date: Wed, 30 Nov 2016 16:08:24 +0100

On Wed, 30 Nov 2016 08:15:56 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2016-11-30 00:27, Greg Chicares wrote:
GC> [...]
GC> >   expected: 70000000 * 0.04551 = 3185700.00 exactly
GC> >   observed: 3185700.00 with old_bin [exactly as expected]
GC> >   observed: 3185699.99000000022352 with new_bin [anomaly]
GC> > 
GC> > lmi rounds the calculated seven-pay premium down to cents, so a
GC> > difference of 1 ulp in the binary number read from the table could
GC> > account for this anomaly. I imagine a 1 ulp difference could arise
GC> > in table_impl::parse_single_value() [showing only the most relevant
GC> > lines]:
GC> > 
GC> >     auto const res_int_part = strict_parse_number(current);
GC> >     auto const res_frac_part = strict_parse_number(res_int_part.end + 1);
GC> > 
GC> >     double value = res_frac_part.num;
GC> >     value /= std::pow(10, *num_decimals_);
GC> >     value += res_int_part.num;
GC> >     return value;
GC> > 
GC> > My hypothesis then, comparing floating-point numbers for strict
GC> > equality, is that:
GC> >   assert(0.04551 != 4551 / 10000)

 This is definitely true, but, AFAIR, this is exactly why I do the number
parsing in this roundabout way. Again, the details are a bit hazy by now
(and I definitely should have recorded them in a comment here), but I think
this was necessary to ensure that the text form remains unchanged after
"txt -> bin -> txt" round trip test when using 16 decimal places (this
problem is not going to appear with only 5 or 6 of them). I could try to
make a test for this, but OTOH it's clear that this is not a good approach
neither if it results in errors such as above :-(

 What I still don't understand is why the unit test test_to_from_text()
fail to catch this bug, after all this is exactly what it is testing for.
Can you make it fail by using the proprietary table in this test instead of
the ones it uses currently?


GC> > I suspect it's necessary to parse text as above in order to write
GC> > helpful diagnostics, but I think the return value should be the
GC> > applicable character substring converted by some other means, e.g.,
GC> >   value_cast<double>(std::string const&)
GC> > I can figure out what conversion we need. Vadim, can you show me
GC> > how to get a std::string object containing the formatted single
GC> > number that is to be converted (because I'm snow-blind after
GC> > staring at all these pointers)?
GC> 
GC> I think I've figured out the answer to the last question (though I
GC> haven't yet tested the conversion code in this experimental patch):
GC> 
GC> ----------8<----------8<----------8<----------8<----------8<----------
GC> diff --git a/rate_table.cpp b/rate_table.cpp
GC> index c705122..5271da3 100644
GC> --- a/rate_table.cpp
GC> +++ b/rate_table.cpp
GC> @@ -26,6 +26,7 @@
GC>  #include "alert.hpp"
GC>  #include "crc32.hpp"
GC>  #include "path_utility.hpp"
GC> +#include "value_cast.hpp"
GC>  
GC>  #include <boost/filesystem/convenience.hpp>
GC>  #include <boost/filesystem/exception.hpp>
GC> @@ -1567,6 +1568,8 @@ double table_impl::parse_single_value
GC>      ,int& line_num
GC>      )
GC>  {
GC> +    char const* beginning = current;
GC> +
GC>      // The number of spaces before the value should be at least one,
GC>      // and no greater than (gap_length, plus one if the number of
GC>      // decimals is zero, because get_value_width() assumes, contrary
GC> @@ -1653,6 +1656,12 @@ double table_impl::parse_single_value
GC>      value /= std::pow(10, *num_decimals_);
GC>      value += res_int_part.num;
GC>  
GC> +//    std::string s(start, current);
GC> +// Attempt to convert string '  0  0.25751' from type std::string to type 
double
GC> +// failed on terminal substring '  0.25751'.
GC> +    std::string s(beginning, current);
GC> +    return value_cast<double>(s);
GC> +
GC>      return value;
GC>  }
GC>  
GC> ---------->8---------->8---------->8---------->8---------->8----------

 Yes, this would work if you don't mind value_cast<> overhead. You could
also initialize "beginning" slightly later, after skipping the spaces, but
it's just a minor optimization.

GC> IOW, in this function:
GC>   double table_impl::parse_single_value
GC>     (char const* start
GC>     ,char const*& current
GC>     ,int& line_num
GC>     )
GC> at first I guessed that the 'start' argument pointed to the beginning of
GC> the single value to be parsed, but it actually points to the beginning
GC> of the current line of text, i.e., '  0  0.25751' for age 0. Instead,
GC> the current datum begins at the original value of argument 'current'.

 Yes, "start" is necessary in order to properly report column numbers.

GC> Apparently the 'start' and 'line_num' arguments are used only for
GC> writing diagnostics.

 Yes, exactly, as its comment says:

    // Helper of parse_values() parsing a single floating point value using the
    // exactly expected precision. Adjust the current pointer to advance past
    // the value parsed, the other parameters are only used for diagnostics.

GC> If that's correct, BTW, then why is 'line_num' passed by reference?

 This seems like an oversight, both this function and parse_age() don't
modify it (unlike some other parsing functions that can/do advance to the
next line), so it should be passed by value. I don't know if it's worth
making a trivial patch for this or if you prefer changing this yourself?

 Please let me know, thanks,
VZ


reply via email to

[Prev in Thread] Current Thread [Next in Thread]