[lmi] Transcendentals faster on linux than msw (wine)?
From: Greg Chicares
Subject: [lmi] Transcendentals faster on linux than msw (wine)?
Date: Tue, 6 Oct 2020 14:16:19 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0
Vadim--Consider lmi's 'i_from_i_upper_n_over_n', implemented thus:
// naively: (1+i)^n - 1
// substitute: (1+i)^n - 1 <-> std::expm1(std::log1p(i) * n)
long double z = std::expm1l(std::log1pl(i) * n);
That seems to run much faster on GNU/Linux than on msw as "emulated"
by 'wine'. But how can that be, since this algorithm should just map
onto a small number of machine instructions and msw or 'wine' shouldn't
matter at all? Conversely, the unit test's 'i_upper_n_over_n_from_i_naive':
T operator()(T const& i) const
{return T(-1) + std::pow((T(1) + i), T(1) / n);}
using pow() seems three times as fast for both i686-w64-mingw32 and
x86_64-w64-mingw32 as for x86_64-pc-linux-gnu.
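To make the comparison concrete, here is a minimal sketch of both techniques for double arguments. The function names are hypothetical, chosen for illustration only; they are not lmi's own. The first two compute (1+i)^(1/n) - 1, the direction exercised by the unit test; the third is the inverse, (1+i)^n - 1, matching the production line quoted above.

```cpp
#include <cmath>

// Hypothetical names, for illustration only (not lmi's).

// (1+i)^(1/n) - 1 via expm1 and log1p: never forms 1+i explicitly,
// so no low-order digits of a small i are lost.
inline double periodic_rate_expm1(double i, int n)
{
    return std::expm1(std::log1p(i) / n);
}

// The naive form: pow() returns a value near 1, and subtracting 1
// cancels most of its leading digits.
inline double periodic_rate_pow(double i, int n)
{
    return -1.0 + std::pow(1.0 + i, 1.0 / n);
}

// The inverse direction: (1+i)^n - 1.
inline double compound_rate_expm1(double i, int n)
{
    return std::expm1(std::log1p(i) * n);
}
```

With i = 0.01 and n = 365, both periodic functions should return roughly 2.726155e-05; they are expected to part ways only in the later digits, where the pow() form loses accuracy to cancellation.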
Here's the commit message for lmi master 09c4f0c3. I couldn't decide
how to divide it between the mailing list and the commit message, so
I've used the same text in both places [truncated to 72 columns for
git, but not here]:
Augment a unit test; reformat its output [master 09c4f0c3]
Reformatted output of {expm1, log1p} vs. pow results, to make it easier
to compare accuracy. For x87 (i686 only), the {expm1, log1p} results
differ only in the seventeenth (in)significant digit for double vs.
extended precision, so extended precision offers no material advantage
here--as expected, because the technique involves only a few floating-
point instructions.
Added tests comparing speed of {expm1, log1p} to pow for long double vs.
double arguments. For both i686-w64-mingw32 and x86_64-w64-mingw32, pow
is slightly slower, and of course far less accurate. However, for
x86_64-pc-linux-gnu, {expm1, log1p} runs about five and a half times faster
than pow; and the {expm1, log1p} technique runs two and a half times
faster for double than for long double arguments.
Thus, it's okay to change from long double to double arguments and from
extended to double precision, though there's no benefit with MinGW-w64;
but for x86_64-pc-linux-gnu, double arguments work much faster than long
double, without sacrificing accuracy (and extended precision is at best
discouraged for x86_64 anyway).
See the concurrent discussion on the mailing list.
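For anyone who wants to reproduce the double vs. long double comparison outside lmi, something like the following sketch might do. It is not lmi's timing code: 'daily_rate' and 'mean_seconds' are hypothetical names, and the loop is a crude mean over a fixed number of calls, assuming std::chrono::steady_clock is fine-grained enough on the platform at hand.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical stand-in for the "i365" tests: the daily rate
// equivalent to annual rate i, in precision T.
template<typename T>
T daily_rate(T i)
{
    return std::expm1(std::log1p(i) / T(365));
}

// Mean wall-clock seconds per call of daily_rate<T>, over 'reps' calls.
template<typename T>
double mean_seconds(int reps)
{
    using clock = std::chrono::steady_clock;
    // volatile discourages the optimizer from hoisting or eliding
    // the call; it is a blunt instrument, not a guarantee.
    volatile T input = T(0.01);
    volatile T sink = T(0);
    auto const start = clock::now();
    for(int k = 0; k < reps; ++k)
    {
        sink = daily_rate<T>(input);
    }
    auto const stop = clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count() / reps;
}
```

Comparing mean_seconds<double>(100000) with mean_seconds<long double>(100000) should show whether a given platform exhibits the same gap. The figures will depend on the compiler, the optimization flags, and the math library in use.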
Raw data: i686-w64-mingw32-gcc-8.3-win32
Speed tests:
std::pow 1.463e-06 s mean; 1 us least of 6836 runs
std::expm1 1.239e-06 s mean; 1 us least of 8073 runs
double i365 7.609e-07 s mean; 1 us least of 13144 runs
long double i365 7.520e-07 s mean; 1 us least of 13299 runs
Daily rate corresponding to 1% annual interest, by various methods:
      000000000111111111122
      123456789012345678901
0.0000272615520089941669031 method in production
0.0000272615520089941669031 long double precision, std::expm1 and std::log1p
0.0000272615520089941739887 long double precision, std::pow
0.0000272615520089941672124 double precision, std::expm1 and std::log1p
0.0000272615520089392049385 double precision, std::pow
x86_64-w64-mingw32-gcc-8.3 raw data:
Speed tests:
std::pow 1.602e-06 s mean; 2 us least of 6245 runs
std::expm1 1.304e-06 s mean; 1 us least of 7667 runs
double i365 7.516e-07 s mean; 1 us least of 13306 runs
long double i365 7.630e-07 s mean; 1 us least of 13108 runs
Daily rate corresponding to 1% annual interest, by various methods:
      000000000111111111122
      123456789012345678901
0.0000272615520089941669031 method in production
0.0000272615520089941672124 double precision, std::expm1 and std::log1p
0.0000272615520089392049385 double precision, std::pow
pc-linux-gnu gcc-9 raw data:
Speed tests:
std::pow 4.565e-06 s mean; 2 us least of 2191 runs
std::expm1 8.252e-07 s mean; 0 us least of 12119 runs
double i365 7.233e-08 s mean; 0 us least of 138246 runs
long double i365 1.721e-07 s mean; 0 us least of 58098 runs
Daily rate corresponding to 1% annual interest, by various methods:
      000000000111111111122
      123456789012345678901
0.0000272615520089941669014 method in production
0.0000272615520089941672124 double precision, std::expm1 and std::log1p
0.0000272615520089392049385 double precision, std::pow
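The two-line ruler in the tables above counts significant digits, which is how the "seventeenth digit" claim in the commit message can be checked by eye. A sketch of how such output could be produced follows. The helper names are hypothetical, and the six-space indent assumes a rate printed as "0.0000" followed by its significant digits, as with the daily rates shown here.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helpers, for illustration only (not lmi's).

// Print a rate in fixed notation with 25 decimal places, matching
// the width of the values in the tables above.
std::string format_rate(double rate)
{
    std::ostringstream oss;
    oss << std::fixed << std::setprecision(25) << rate;
    return oss.str();
}

// Two-line ruler numbering significant digits 1 through 21. The six
// leading spaces skip "0.0000", so this lines up only for values
// with exactly four zeros after the decimal point.
std::string digit_ruler()
{
    return
        "      000000000111111111122\n"
        "      123456789012345678901\n"
        ;
}
```

Streaming digit_ruler() and then format_rate() of each result, one per line, gives the layout used in the raw data.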