lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] MinGW-w64 anomaly?


From: Greg Chicares
Subject: Re: [lmi] MinGW-w64 anomaly?
Date: Wed, 21 Dec 2016 02:09:09 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.4.0

On 2016-12-20 16:24, Vadim Zeitlin wrote:
> On Tue, 20 Dec 2016 02:41:20 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> On 2016-12-19 00:38, Vadim Zeitlin wrote:
> GC> > On Sun, 18 Dec 2016 22:18:07 +0000 Greg Chicares <address@hidden> wrote:
> GC> [...]
> GC> > GC> Yet I really do want long doubles.
> GC> > 
> GC> >  This is the part which worries me because it means that either lmi will
> GC> > need to continue to use x87 forever or will need to using something
> GC> > non-standard like __float128. Shouldn't 52 bits of precision be enough 
> for,
> GC> > well, basically anything?
> GC> 
> GC> Extended precision does have its advantages, of course, and the x87
> GC> stuff is still in hardware, so why not use it where appropriate?
> 
>  They may well be both in the hardware, but from what I understand SSE is
> much faster for typical code. And, AFAIK, you only get (potentially huge,
> like orders of magnitude) gains from auto-vectorization when using SSE.

I'd like to measure the effect on our 32-bit msw production release.
How might I do that? I'm guessing I'd rebuild lmi's own code (but not
necessarily any library) adding some option to CFLAGS and CXXFLAGS, like
  -msse2 -mfpmath=sse
which might not work with lmi's x87 code, or
  -msse2 -mfpmath=sse,387
which is still "experimental". What would you recommend?

> GC> Committed ae22927..83d8eb3 . Please let me know if you find any
> GC> contemporary compiler that doesn't pass the round_to unit test.
> 
>  Unfortunately it does seem to have broken things. First of all,
> round_test.cpp doesn't compile any longer because it uses
> detail::perform_fabs() expunged in 46d02c56f8d0bb5483c771a36b936d495cf8a898

Thanks, I had blithely assumed that it was unnecessary to run any other
unit test than 'round_to_test' for this set of changes, but its derivative
'round_test' was also affected.

BTW, I thought I'd be able to find the faulty commit with a command like:
  /opt/lmi/src/lmi[0]$git log -S='perform_fabs'
or at least with:
  /opt/lmi/src/lmi[0]$git log -G='perform_fabs'
but those returned nothing. What am I missing? This is my principal
working directory, and it's up to date:

  /opt/lmi/src/lmi[0]$git status
  On branch master
  Your branch is up-to-date with 'origin/master'.
...
  /opt/lmi/src/lmi[0]$git log -1
  commit 83d8eb3d37efcde2fa9e49bbd017cac85743320d

Oh...wait...no "=" is wanted after a one-hyphen "short" option:
  git log -G='perform_fabs'   <-- fails
  git log -G'perform_fabs'    <-- works

> The following trivial change is needed to fix it:
[...]
> -        rel_error = detail::perform_fabs
> +        rel_error = std::abs

Thanks, but I really prefer to use std::fabs() for concinnity with existing
lmi code. I wonder why the C++ committee made abs() a synonym: maybe just
because they could? But it creates a gratuitous incompatibility with C, and
if I unlearn the C distinction, then it'll be harder for me to write C.

>  But after fixing its compilation, this test does pass, while

[It still fails with MinGW-w64's RTL, but I have a pending fix for that.]

> round_to_test.cpp compiles just fine, but fails with 126 errors under 64
> bit Linux, which were not present before. It's not the latest commit that
> broke it, however, rather it was 54d2503 ("Presume std::rint() is available
> [424]").

I remember removing the X86 conditional recently and thinking that
(defined __GNUC__ && defined LMI_X86) had been intended as some sort
of synonym for __MINGW32__. Then, upon first reading your report, I
was struck with fear that the X86 condition had been added at your
behest, but it was there in the original CVS checkin on 2005-01-14,
so at least I didn't break something that you had fixed, and somehow
that makes me almost feel better.

Still, AFAICT this is the relevant change:

----------------------------------8<----------------------------------
-#else  // !(defined __GNUC__ && defined LMI_X86)
-
-// The round_X functions below work with any real_type-to-integer_type.
-// Compilers that provide rint() may have optimized it (or you can
-// provide a fast implementation yourself).
-
-template<typename RealType>
-inline RealType perform_rint(RealType r)
-{
-#if defined LMI_HAVE_RINT
-    return rint(r);
-#else  // !defined LMI_HAVE_RINT
-    throw std::logic_error("rint() not defined.");
-#endif // !defined LMI_HAVE_RINT
-}
-
-#endif // !(defined __GNUC__ && defined LMI_X86)
---------------------------------->8----------------------------------

and I'm not quite sure how replacing a cover function that simply
delegated to C99 rint() with direct calls to std::rint() could
make a difference, though of course I believe the evidence of your
eyes. I'd just like to be able to play with it myself; I'm hoping
that I'll be able to reproduce the problem if I build the 32-bit
msw system with SSE.

> It looks like the problem is that the rounding mode is (still) set
> using x87-specific instructions, but std::rint() uses SSE in 64 bits and so
> is not affected by it. I tried to build lmi with LMI_IEC_559 to see if it
> would fix the problem, but currently turning it on results in many
> compilation errors. I've "fixed" them by ripping out all x87-specific code
> and just using fesetround() instead and now the test indeed passes again.
> 
>  So I think the logical next step is to use fesetround() instead of
> manipulating x87 control word directly in fenv_rounding(), notwithstanding
> the comment in the beginning of fenv_lmi.cpp claiming that we shouldn't do
> this for consistency. Of course, nowadays we should use C++11 <cfenv>
> instead of C99 <fenv.h>, but this is a minor detail.

This seems to be a good time to revisit some of this twenty-year-old
low-level stuff. I think I have an IEC_559 patch somewhere that you
proposed about half that many years ago.

>  Otherwise you'd either need to revert the std::rint() change or we'll have
> to live with completely broken 64 bit (and MSVC, which also uses SSE)
> versions which is IMHO not ideal.

I'd rather not live with that much breakage. I wonder whether there's
some temporary fix that would be less drastic than reverting the rint()
patch mentioned above. Wait--is the whole lmi build broken for those
toolchains, or only these two unit tests?

> GC> Having committed this, I'm not sure it's really preferable to
> GC> the code between #if 0 ... #endif, especially if the newer code
> GC> works with fewer compilers.
> 
>  I think it's preferable if only because we've seen that the compiler will
> unroll the loop inside std::pow() efficiently while it's unlikely to do it
> for the loop in the old perform_pow().

Funny, I thought you'd say the opposite: that calculating integral
powers of "integers" manually is certain to do what we want, whereas
relying on an optional optimization in pow() (especially in light of
Howard Hinnant's example showing how it can be harmful) is dubious.

We could of course copy sgi's 'power()' source into lmi. It uses the
"Russian peasant algorithm", with only O(log n) multiplications:
  https://www.sgi.com/tech/stl/power.html#1
and it's particularly efficient for a base of ten (which, despite
the striving for generality, is the only base we actually use)
because there are only two 1's in ten's binary representation.

I think I'll do that for the #if 0 ... #endif block, and then see
which alternative we like better. BTW, g++ already has a version of
this sgi extension, under GPL3, in include/c++/ext/numeric , but
other compilers may not, so I'll use the sgi original:
  https://www.sgi.com/tech/stl/stl_numeric.h




reply via email to

[Prev in Thread] Current Thread [Next in Thread]