Re: [OT] Is od broken?
From: Eric Blake
Subject: Re: [OT] Is od broken?
Date: Thu, 12 Jun 2008 21:57:42 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)
Jim Meyering <jim <at> meyering.net> writes:
> > Unrelated to this patch: Should we use %g instead of %e for floating point?
> > Seeing 1.000000000e+00 is somewhat distracting in isolation; on the other
> > hand, the variable-width nature of %g might not look as nice as the
> > fixed-width precision of %e.
>
> I have a slight preference for the status quo. And not having
> looked at how other vendor od programs work, I'm hesitant to change this.
>
Agreed. Here's another floating point quandary:
od --help states that it will "Write an unambiguous representation". But this
is not always true with floating point. There's the obvious case of x86 long
double - since we are converting 12 bytes in memory into 10 bytes (actually 79
bits) of significant information in register, we are discarding two bytes of
input data for every string we output. There's also the case of NaN (in IEEE
double-precision, almost 2^53 distinct values display as "nan", unless your
libc's printf is nice enough to give "nan(n-char-sequence)" output).
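
To make the NaN count concrete, here is a rough sketch in C. It assumes a 64-bit IEEE double; the helper names (double_from_bits, count_double_nans, is_nan_bits, format_bits) are made up for this illustration and are not taken from od.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double (hypothetical helper). */
static double double_from_bits(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);  /* assumption: 64-bit IEEE double */
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* A binary64 NaN has an all-ones exponent and a nonzero 52-bit fraction,
   with either sign: 2 * (2^52 - 1) = 2^53 - 2 encodings. */
static uint64_t count_double_nans(void)
{
    uint64_t nonzero_fractions = (UINT64_C(1) << 52) - 1;
    return 2 * nonzero_fractions;
}

/* Nonzero if the bit pattern decodes to a NaN. */
static int is_nan_bits(uint64_t bits)
{
    return isnan(double_from_bits(bits)) != 0;
}

/* Format the value with plain %e; returns what snprintf returns. */
static int format_bits(uint64_t bits, char *buf, size_t size)
{
    return snprintf(buf, size, "%e", double_from_bits(bits));
}
```

Calling format_bits on 0x7ff8000000000001 and 0x7ff8000000000002 gives two different inputs to printf; whether the two strings differ depends entirely on whether the libc prints the payload.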
Then there are less obvious cases where rounding bites us. Without my patch
series, the code blindly claims that the field width of a 4-byte IEEE
float is FLT_DIG+8 (14 bytes) without the leading space, even though the format
will never print more than 13 non-space characters (sign, first digit, dot, 6
FLT_DIG precision digits, e, sign, then 2 digits exponent [FLT_MAX_10_EXP is
38, so we'll never see 3 digit exponents on -tf4]); so we are over-padding.
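
The character count can be checked with a small C sketch. It assumes an IEEE single-precision float (FLT_DIG of 6); e_width is a hypothetical helper for this illustration, not a function from od.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Number of characters printf emits for x under "%.*e" with the given
   precision. snprintf with a NULL buffer and size 0 only measures. */
static int e_width(float x, int precision)
{
    int n = snprintf(NULL, 0, "%.*e", precision, (double)x);
    assert(n > 0);
    return n;
}
```

With these assumptions, e_width(-FLT_MAX, FLT_DIG) should come out as 13, one less than FLT_DIG+8.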
But POSIX is clear that FLT_DIG is rounded down (unless your radix is a power
of 10), to cover the decimal-binary-decimal round trip, whereas DECIMAL_DIG is
rounded up, to cover the binary-decimal-binary round trip. In the case of od,
we want the algorithm of DECIMAL_DIG if we are to guarantee uniqueness. And
notice that simply doing FLT_DIG+1 is STILL insufficient, as shown by this test
case of the four adjacent floats -123456776.0f to -123456800.0f:
$ src/od -tfFx1x4 blah
0000000 -1.2345678e+08 -1.2345678e+08 -1.2345679e+08 -1.2345680e+08
         a1  79  eb  cc  a2  79  eb  cc  a3 79  eb cc  a4 79  eb cc
              cceb79a1       cceb79a2       cceb79a3       cceb79a4
0000020
In order to safely go binary-decimal-binary, the unique decimal representation
of 0xcceb79a2 as an IEEE single-precision float MUST be -1.23456784e+08, which
takes FLT_DIG+2 digits after the decimal point.
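
For anyone who wants to reproduce this without od, here is a sketch in C. It assumes a 32-bit IEEE float and a libc whose printf and strtof round correctly; the helper names are invented for this example.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (hypothetical helper). */
static float float_from_bits(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);  /* assumption: 32-bit IEEE float */
    memcpy(&f, &bits, sizeof f);
    return f;
}

static uint32_t bits_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Print with "%.*e", parse the text back, and report whether the
   original bit pattern survived the binary-decimal-binary trip. */
static int round_trips(uint32_t bits, int precision)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.*e", precision,
             (double)float_from_bits(bits));
    return bits_from_float(strtof(buf, NULL)) == bits;
}

/* Count distinct strings for the four adjacent floats 0xcceb79a1 ..
   0xcceb79a4. Output is monotonic, so comparing neighbours suffices. */
static int distinct_strings(int precision)
{
    char prev[64] = "", cur[64];
    int count = 0;
    for (uint32_t bits = UINT32_C(0xcceb79a1);
         bits <= UINT32_C(0xcceb79a4); bits++) {
        snprintf(cur, sizeof cur, "%.*e", precision,
                 (double)float_from_bits(bits));
        if (strcmp(cur, prev) != 0)
            count++;
        strcpy(prev, cur);
    }
    return count;
}
```

Under those assumptions, distinct_strings(FLT_DIG + 1) yields only three strings for the four values, while distinct_strings(FLT_DIG + 2) yields four.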
So, which is better, patching the code to attempt to unambiguously print all
floats, or updating the documentation to make it clear that memory
representation padding, floating point rounding, and NaNs cause inaccuracies?
And looking at that output, I need to redo my pad width rounding algorithm. It
would be nicer to consistently pad that second row as four sets of 2-2-2-1
rather than the somewhat ugly 2-2-2-2/2-2-2-2/2-1-2-1/2-1-2-1. I guess I'm
back to the drawing board for an efficient way to cleanly distribute a fraction
of a padding byte without suffering from integer overflow during the
computation.
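
One possible approach, sketched in C purely as an illustration (it is not the code in my patch series, and distribute_padding is a made-up name): hand each field the ceiling of the remaining padding divided by the remaining fields. That uses only division and remainder, so no product can overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Split `total` padding columns across `items` fields, larger shares
   first. Each step takes ceil(remaining / fields_left); the share never
   exceeds what remains, so the subtraction cannot wrap, and there is no
   multiplication to overflow. */
static void distribute_padding(size_t total, size_t items, size_t *pad)
{
    assert(items > 0);
    for (size_t i = 0; i < items; i++) {
        size_t left = items - i;
        size_t share = total / left + (total % left != 0);
        pad[i] = share;
        total -= share;
    }
}
```

For 7 columns of padding over 4 fields this produces 2-2-2-1.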
> > Should we squash this on top of the previous patch, or keep it as a separate
> > commit?
>
> I think it's fine (and probably better, but haven't reviewed either
> carefully yet) to keep them separate.
OK, I'll keep them as separate commits. Bo inspired me, and I finally figured
out how to use repo.or.cz. Now you can do:
git fetch git://repo.or.cz/coreutils/ericb.git refs/heads/od
to see my patch series.
--
Eric Blake