[lmi] A use case for long double


From: Greg Chicares
Subject: [lmi] A use case for long double
Date: Sat, 30 Apr 2022 17:46:05 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

In preparation for migrating lmi releases from 32- to 64-bit binaries,
I've been reconsidering lmi's use of type 'long double'. I postulate
that 'long double' should not be used in place of 'double' without a
convincing rationale, because it's less common in practice and because
it's presumably slower for x86_64.

[We already decided not to consider '-mfpmath=sse+387':
    https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
  "Use this option with care, as it is still experimental"
and instead just to use the x86_64 default, which is 'sse'. AIUI,
the x87 hardware is still used when we perform long double
calculations, but in no other case.]

I reimplemented certain compound-interest calculations in terms of
'double', because they don't need 'long double'. The reason is that
we use expm1() and log1p(), which give almost the same results for
binary64 as for binary80. BTW, an implementation in terms of pow()
or of repeated multiplication (as given in unit tests) is much more
accurate if it uses extended precision, but expm1() and log1p() make
that irrelevant here.
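
To make the expm1()/log1p() point concrete, here is a minimal sketch
(not lmi's actual code; the function names are purely illustrative) of
converting an annual rate i to its monthly equivalent, (1+i)^(1/12) - 1,
entirely in 'double'. The naive pow() form loses several digits to
cancellation in the final subtraction when i is small; the
expm1(log1p(i)/12) form does not, which is why extended precision
buys little here:

  #include <cmath>
  #include <cstdio>

  // Naive form: the final subtraction cancels when i is small.
  double annual_to_monthly_naive(double i)
  {
      return std::pow(1.0 + i, 1.0 / 12.0) - 1.0;
  }

  // expm1()/log1p() form: accurate in plain 'double' even for small i.
  double annual_to_monthly_robust(double i)
  {
      return std::expm1(std::log1p(i) / 12.0);
  }

  int main()
  {
      double const i = 0.0001; // one basis point
      std::printf("naive : %.17e\n", annual_to_monthly_naive(i));
      std::printf("robust: %.17e\n", annual_to_monthly_robust(i));
  }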

I had anticipated that IRR calculations would be faster, though
somewhat (but perhaps tolerably) less accurate, using binary64.
However, see:
  https://git.savannah.nongnu.org/cgit/lmi.git/commit/?h=odd/eraseme_long_double_irr
It looks like we should keep the existing binary80 IRR code, because
it's no slower and it achieves two extra digits of precision in a
not-implausible test case.

The apparent lack of a speed penalty came as a surprise to me,
but we follow the evidence wherever it may lead.

It would be conceivable to do IRR calculations using expm1() and
log1p(), but that doesn't seem attractive. The principal part of
the calculation is evaluation of NPV, the inner product of a stream
of values ("cash flows") and a vector of powers (1+i)^n, n = 0, 1, 2, ...,
for which Horner's rule takes n multiplications and n additions.
We could generate the power vector with n F2XM1 and n FYL2XP1
instructions, in addition to which we'd still need all the
operations of Horner's rule; but even if those transcendental
instructions took only one cycle apiece (not even close!), that
still wouldn't be a win. Here, binary80 seems best.
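
For concreteness, here is a sketch (again, not lmi's actual
implementation) of that NPV evaluation by Horner's rule. Here x stands
for (1+i) as described above, or for its reciprocal if the cash flows
are to be discounted; either way there are only n multiplications and
n additions, and no transcendental operations at all:

  #include <vector>

  // Evaluate the inner product of a cash-flow stream with the powers
  // x^n, n = 0, 1, 2, ..., by Horner's rule:
  //   ((c[N] * x + c[N-1]) * x + ...) * x + c[0]
  // i.e., one multiplication and one addition per cash flow.
  long double npv_horner
      (std::vector<long double> const& cash_flows
      ,long double                     x
      )
  {
      long double acc = 0.0L;
      for(auto it = cash_flows.rbegin(); it != cash_flows.rend(); ++it)
      {
          acc = acc * x + *it;
      }
      return acc;
  }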

