[lmi] Transcendentals faster on linux than msw (wine)?

From:	Greg Chicares
Subject:	[lmi] Transcendentals faster on linux than msw (wine)?
Date:	Tue, 6 Oct 2020 14:16:19 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0

Vadim--Consider lmi's 'i_from_i_upper_n_over_n', implemented thus:
    // naively:    (1+i)^n - 1
    // substitute: (1+i)^n - 1 <-> std::expm1(std::log1p(i) * n)
    long double z = std::expm1l(std::log1pl(i) * n);
That seems to run much faster on GNU/Linux than on msw as "emulated"
by 'wine'. But how can that be, since this algorithm should just map
onto a small number of machine instructions and msw or 'wine' shouldn't
matter at all? Conversely, the unit test's 'i_upper_n_over_n_from_i_naive':
    T operator()(T const& i) const
        {return T(-1) + std::pow((T(1) + i), T(1) / n);}
using pow() seems three times as fast for both i686-w64-mingw32 and
x86_64-w64-mingw32 as for x86_64-pc-linux-gnu.
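
For readers who want to try the two formulations outside lmi, here is a
minimal sketch in plain double precision. The function names are
hypothetical (lmi's own versions are class templates with 'n' as a
template parameter); only the std:: calls are taken from the code quoted
above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical free functions for illustration; not lmi's actual interface.

// Periodic rate from annual rate, naively: (1+i)^(1/n) - 1
inline double periodic_rate_naive(double i, int n)
{
    assert(-1.0 < i && 0 < n);
    return std::pow(1.0 + i, 1.0 / n) - 1.0;
}

// Same quantity via the substitution: expm1(log1p(i) / n)
inline double periodic_rate_stable(double i, int n)
{
    assert(-1.0 < i && 0 < n);
    return std::expm1(std::log1p(i) / n);
}

// Inverse direction, (1+i)^n - 1, via expm1(log1p(i) * n)
inline double annual_rate_stable(double i, int n)
{
    assert(-1.0 < i && 0 < n);
    return std::expm1(std::log1p(i) * n);
}
```

Calling periodic_rate_stable(0.01, 365) and periodic_rate_naive(0.01, 365)
gives the two "double precision" figures to compare, and feeding the first
result to annual_rate_stable(..., 365) should recover 0.01 to within
rounding.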

Here's the commit message for lmi master 09c4f0c3. I couldn't decide
how to divide it between the mailing list and the commit message, so
I've used the same text in both places [truncated to 72 columns for
git, but not here]:

Augment a unit test; reformat its output [master 09c4f0c3]

Reformatted output of {expm1, log1p} vs. pow results, to make it easier
to compare accuracy. For x87 (i686 only), the {expm1, log1p} results
differ only in the seventeenth (in)significant digit for double vs.
extended precision, so extended precision offers no material advantage
here--as expected, because the technique involves only a few floating-
point instructions.

Added tests comparing speed of {expm1, log1p} to pow for long double vs.
double arguments. For both i686-w64-mingw32 and x86_64-w64-mingw32, pow
is slightly slower, and of course far less accurate. However, for
x86_64-pc-linux-gnu, {expm1, log1p} runs three and a half times faster
than pow; and the {expm1, log1p} technique runs two and a half times
faster for double than for long double arguments.

Thus, it's okay to change from long double to double arguments and from
extended to double precision, though there's no benefit with MinGW-w64;
but for x86_64-pc-linux-gnu, double arguments work much faster than long
double, without sacrificing accuracy (and extended precision is at best
discouraged for x86_64 anyway).

See the concurrent discussion on the mailing list.
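
A rough way to repeat the speed comparison described in the commit message
is sketched below. This is not lmi's timing code: the helper
'mean_seconds_per_call' and the two candidate functions are hypothetical
names, and results will vary with compiler flags and the C library in use.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical helper: mean wall-clock seconds per call of f over 'reps' calls.
template<typename T, typename F>
double mean_seconds_per_call(F f, int reps)
{
    assert(0 < reps);
    volatile T sink = T(0); // volatile store keeps the calls from being optimized away
    auto const t0 = std::chrono::steady_clock::now();
    for(int j = 0; j < reps; ++j)
        {
        // Vary the argument slightly so the result cannot be hoisted out of the loop.
        T const i = T(0.01) + T(j) * T(1.0e-9);
        sink = f(i);
        }
    auto const t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count() / reps;
}

// Daily rate from annual rate i, by each method, for any floating type T.
template<typename T>
T daily_via_expm1(T i) {return std::expm1(std::log1p(i) / T(365));}

template<typename T>
T daily_via_pow(T i) {return std::pow(T(1) + i, T(1) / T(365)) - T(1);}
```

For example, mean_seconds_per_call<double>(daily_via_expm1<double>, 1000000)
and mean_seconds_per_call<long double>(daily_via_expm1<long double>, 1000000)
can be compared to see whether double arguments are faster on a given
platform.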

i686-w64-mingw32-gcc-8.3-win32 raw data:

Speed tests:
  std::pow         1.463e-06 s mean;          1 us least of 6836 runs
  std::expm1       1.239e-06 s mean;          1 us least of 8073 runs
  double      i365 7.609e-07 s mean;          1 us least of 13144 runs
  long double i365 7.520e-07 s mean;          1 us least of 13299 runs

Daily rate corresponding to 1% annual interest, by various methods:
        000000000111111111122
        123456789012345678901
  0.0000272615520089941669031  method in production
  0.0000272615520089941669031  long double precision, std::expm1 and std::log1p
  0.0000272615520089941739887  long double precision, std::pow
  0.0000272615520089941672124  double precision, std::expm1 and std::log1p
  0.0000272615520089392049385  double precision, std::pow

x86_64-w64-mingw32-gcc-8.3 raw data:

Speed tests:
  std::pow         1.602e-06 s mean;          2 us least of 6245 runs
  std::expm1       1.304e-06 s mean;          1 us least of 7667 runs
  double      i365 7.516e-07 s mean;          1 us least of 13306 runs
  long double i365 7.630e-07 s mean;          1 us least of 13108 runs

Daily rate corresponding to 1% annual interest, by various methods:
        000000000111111111122
        123456789012345678901
  0.0000272615520089941669031  method in production
  0.0000272615520089941672124  double precision, std::expm1 and std::log1p
  0.0000272615520089392049385  double precision, std::pow

x86_64-pc-linux-gnu gcc-9 raw data:

Speed tests:
  std::pow         4.565e-06 s mean;          2 us least of 2191 runs
  std::expm1       8.252e-07 s mean;          0 us least of 12119 runs
  double      i365 7.233e-08 s mean;          0 us least of 138246 runs
  long double i365 1.721e-07 s mean;          0 us least of 58098 runs

Daily rate corresponding to 1% annual interest, by various methods:
        000000000111111111122
        123456789012345678901
  0.0000272615520089941669014  method in production
  0.0000272615520089941672124  double precision, std::expm1 and std::log1p
  0.0000272615520089392049385  double precision, std::pow
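
To put a number on "far less accurate": in the tables above, the double
precision std::pow result differs from the production value by about
5.5e-17, a relative error near 2e-12, so roughly 11.7 significant digits
agree. That is consistent with cancellation when 1 is subtracted from a
result close to 1, where adjacent doubles are about 2.2e-16 apart. The
sketch below computes that digit count; 'matching_digits' is a
hypothetical helper, not part of lmi.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: approximate number of significant decimal digits
// on which 'value' agrees with 'reference'.
inline long double matching_digits(long double value, long double reference)
{
    assert(0.0L != reference);
    if(value == reference)
        {
        // Indistinguishable at this precision: report the type's decimal capacity.
        return std::numeric_limits<long double>::digits10;
        }
    return -std::log10(std::fabs((value - reference) / reference));
}
```

Passing the tabulated values 0.0000272615520089392049385L and
0.0000272615520089941669031L as 'value' and 'reference' gives the figure
quoted above. The double precision expm1/log1p result differs from the
reference by less than one unit in the last place of a double, so
comparing it meaningfully requires a long double wider than double.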
