On 12/03/2015 12:12 AM, Richard Fateman
wrote:
From S. Rump's investigations..
a "polynomial" that evaluates the same in IEEE single or double
precision, about 1.1726 but is wrong in both
rp(x,y):=1335/4*y^6+x^2*(11*x^2*y^2y^6+(121)*y^42)+11/2*y^8+x/(2*y)$
This is a great example! comes in handy when discussing precision
and accuracy in FP calculations. I have often seen people who do
numerical simulations believing that doubleprecision solves the
accuracy problems. I have written the following to some of my
friends; please forgive the parts that are obvious to people on this
list who do this stuff for living:
Over the years I discussed with all of you the pitfalls
of precision and accuracy in floating point calculations. I just
ran into a wonderful illustration how tricky this business is, so
I am sharing it with you.
As you know, each of several available FP formats is precise to a
certain number of significant digits: six for 32bit single
precision, fifteen for 64bit double precision, etc. What this
means is that, of course, a result obtained in single precision FP
arithmetic cannot be more accurate that the format's precision,
i.e. 0.001%but often the errors accumulate and the overall
accuracy is further lost due to rounding, faulty normalization and
other effects. If we didn't have this accuracy loss, we could do
our calculations with an actually existing halfprecision FP
numbers that have 3 significant digits of precision, and are very
spaceefficient (16 bits) and fast. Unfortunately, even though
most of our measurements would be accurately represented within
this 0.1% precision, the loss of accuracy in most calculations
makes this format impractical.
Normally, we deal with the accuracy loss by accepting the price of
calculating in higher precision: it's virtually impossible to get
inaccurate result when we start from the double precision's
0.00000000000001%, right? Unfortunately, it turns out that it IS
possible, even if rarelyand we can correct for that by doing
something similar to Richardson's extrapolation ( https://en.wikipedia.org/wiki/Richardson_extrapolation
): obtain the result in single precision, and then in double
precision. If the result is the same, we might be assured that
we're not losing accuracy. If they are NOT the same, we could
conclude that the double precision is the right one, or, if we're
in an exacting mood, we could actually estimate the result for
infinite precision.
Unfortunately, math is a harsh mistress and this doesn't always
work. Here's an example of a simple polynomial (you don't need any
mutihour iterative calculations) where the accuracy loss is so
severe that we're getting a wrong result for both single and
double precision.
Doubly unfortunately, the wrong result is THE SAME for both, so it
turns out that you can't even rely onĀ comparing multiple
precisions.
here's where to evaluate it, and here's the right answer...
float(rp(77617, 33096)); evaluates to 0.82739605994682
But changing the floating point precision gives different answers...
for fpprec:10 step 5 thru 50 do print (rp(77617.0b0, 33096.0b0));
1.17260394b0
1.17260394005318b0
1.8014398509481985173b16
1.172603940053178631858835b0
1.17260394005317863185883490452b0
1.1726039400531786318588349045201837b0
8.27396059946821368141165095479816291999b1
8.27396059946821368141165095479816291999033116b1
8.2739605994682136814116509547981629199903311578439b1
google will find you more info about Rump's polynomial..
using interval arithmetic with doublefloats tells you that the correct
answer is
in [ 2.7363628282260744E+22, 2.458240415520526E+22]
which is why it might make sense to use more precision.
