Re: [lmi] Converting numbers in mortality tables to and from text
From: Greg Chicares
Subject: Re: [lmi] Converting numbers in mortality tables to and from text
Date: Fri, 18 Mar 2016 01:37:27 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.5.0
On 2016-03-18 00:56, Vadim Zeitlin wrote:
[...big snip...]
> Do you see some other way to avoid this problem that I'm missing?
No.
> Or are
> we ready to live with the current behaviour?
Yes.
> Notice that the program will
> round trip the files created by itself correctly (barring bugs in the
> standard library of the compiler used to build it!), this "loss" of
> precision only happens when using the existing files.
The original program, IIRC, used some antique version of Turbo C++.
I wouldn't be astonished if it were compiled with emulated floating
point so that it would work on machines without math hardware. The
best thing to do about its 1-ulp errors is to ignore them, as long
as we can establish that they don't affect any "text" values. E.g.,
if an eight-decimal table contains a value that formats as
0.12345678
and the binary number in the database is
0.1234567850000001
then we have a problem, because it really does matter whether the
last decimal digit is eight or nine. But I don't suppose any tables
have a decimal-digits value of fifteen or more--they're probably no
more than eight--so a 1-ulp error can't cause an actual problem.
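A minimal sketch of that argument, in Python only for brevity (it is not lmi code, and the helper name fmt is hypothetical). It nudges an eight-decimal value by one ulp in each direction and formats it back to eight decimals, then formats the problematic value from the example above. It assumes IEEE 754 doubles and Python 3.9 or later for math.nextafter.

```python
import math

def fmt(value: float, decimals: int) -> str:
    """Format a double as fixed-point text with the given number of decimals."""
    return format(value, f".{decimals}f")

x = 0.12345678
above = math.nextafter(x, math.inf)    # one ulp larger than x
below = math.nextafter(x, -math.inf)   # one ulp smaller than x

# A 1-ulp error here is on the order of 1e-17, far below the 5e-9
# half-unit of the eighth decimal, so the text is unchanged.
print(fmt(x, 8), fmt(above, 8), fmt(below, 8))
# prints: 0.12345678 0.12345678 0.12345678

# The problematic case: the binary value sits just above a rounding
# boundary, so the last decimal digit becomes nine instead of eight.
print(fmt(0.1234567850000001, 8))
# prints: 0.12345679
```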
IOW, we treat table data as fixed-point decimal values, because
that's what they were in the original print publications. If there's
some accidental fuzziness in the binary representation, that fuzz is
noise, not data. If B is binary and T is text, then, in this chain of
transformations:
T0 [urtext in some print publication--accessible with difficulty]
-> B1 [SOA's binary files]
-> T1 [text representation we produce from B1]
-> B2 [our binary representation]
-> T2 [text representation we produce from B2]
the round-trip condition we really care about is T1 == T2. (We'll
just assume that T0 is also equal to T2, because it's prohibitively
expensive to verify each T0->B1.) As long as that condition holds,
we don't care whether B2 is exactly identical to B1.
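The condition can be sketched as follows, again in Python as an illustration rather than as lmi's implementation. The function name round_trip is hypothetical, and the 1-ulp perturbation merely stands in for whatever fuzz the SOA binary files might contain.

```python
import math

def round_trip(b1: float, decimals: int):
    """Follow B1 -> T1 -> B2 -> T2 for one table value."""
    t1 = format(b1, f".{decimals}f")   # text produced from the original binary
    b2 = float(t1)                     # our binary representation
    t2 = format(b2, f".{decimals}f")   # text produced from our binary
    return t1, b2, t2

# Simulate a B1 that is off by one ulp from the nearest double to 0.12345678.
b1 = math.nextafter(0.12345678, math.inf)
t1, b2, t2 = round_trip(b1, 8)

print(t1 == t2)   # prints: True  (the condition we care about)
print(b1 == b2)   # prints: False (the binary values differ, harmlessly)
```

Here B2 is the double nearest to the decimal text, so the fuzz in B1 is discarded on the first pass and every later T -> B -> T cycle reproduces the same text.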
> [*] See
> https://randomascii.wordpress.com/2013/02/07/float-precision-revisited-nine-digit-float-portability/
> for a discussion of it if you're curious but, in short, MSVC got the
> last digit wrong for exactly 4 single-precision floating-point numbers
> and differed from gcc in another ~7 million cases.
It's been thirty-five years since the 8087, and C is still trying
to catch up.