lmi
Re: [lmi] MinGW-w64 anomaly?


From: Vadim Zeitlin
Subject: Re: [lmi] MinGW-w64 anomaly?
Date: Wed, 21 Dec 2016 15:23:11 +0100

On Wed, 21 Dec 2016 02:09:09 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2016-12-20 16:24, Vadim Zeitlin wrote:
GC> > On Tue, 20 Dec 2016 02:41:20 +0000 Greg Chicares <address@hidden> wrote:
GC> > 
GC> > GC> On 2016-12-19 00:38, Vadim Zeitlin wrote:
GC> > GC> > On Sun, 18 Dec 2016 22:18:07 +0000 Greg Chicares <address@hidden> wrote:
GC> > GC> [...]
GC> > GC> > GC> Yet I really do want long doubles.
GC> > GC> > 
GC> > GC> >  This is the part which worries me because it means that either lmi will
GC> > GC> > need to continue to use x87 forever or will need to use something
GC> > GC> > non-standard like __float128. Shouldn't 52 bits of precision be enough for,
GC> > GC> > well, basically anything?
GC> > GC> 
GC> > GC> Extended precision does have its advantages, of course, and the x87
GC> > GC> stuff is still in hardware, so why not use it where appropriate?
GC> > 
GC> >  They may well be both in the hardware, but from what I understand SSE is
GC> > much faster for typical code. And, AFAIK, you only get (potentially huge,
GC> > like orders of magnitude) gains from auto-vectorization when using SSE.
GC> 
GC> I'd like to measure the effect on our 32-bit msw production release.
GC> How might I do that?

 I think the first step would be for me to finish my changes making things
work with SSE (or, rather, with any standard-compliant C++11
implementation) by getting rid of all x87-specific code "properly". This
would require some effort, however, because currently things definitely do
not work in this case, as witnessed by the test failure.

GC> I'm guessing I'd rebuild lmi's own code (but not
GC> necessarily any library) adding some option to CFLAGS and CXXFLAGS, like
GC>   -msse2 -mfpmath=sse
GC> which might not work with lmi's x87 code, or
GC>   -msse2 -mfpmath=sse,387
GC> which is still "experimental". What would you recommend?

 The former would be the right thing to do, except that it doesn't work
right now. I haven't tried the latter option (I admit I am a bit afraid of
experimental compiler switches), so I could be wrong, but I don't see how
it could work either: if any SSE instructions at all are used, they would
give wrong results, because the "current" (x87) rounding mode doesn't
apply to them.
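
 One way to check which evaluation mode a given set of flags actually
produces is the standard FLT_EVAL_METHOD macro from <cfloat>. The sketch
below is only an illustration, not lmi code: describe_fp_evaluation() is a
hypothetical name, and the mapping to -mfpmath settings given in the
comments is what gcc on x86 typically reports, which is an assumption for
any particular toolchain (and especially for the experimental "sse,387"
combination).

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: report how the compiler evaluates floating-point
// expressions, according to the standard FLT_EVAL_METHOD macro.
std::string describe_fp_evaluation()
{
    switch(FLT_EVAL_METHOD)
        {
        // Typical for gcc on x86 with -mfpmath=sse (assumption).
        case 0: return "each type is evaluated in its own precision";
        case 1: return "float and double are evaluated as double";
        // Typical for gcc on x86 with -mfpmath=387 (assumption).
        case 2: return "everything is evaluated as long double";
        // Negative values mean indeterminate; others are
        // implementation-defined.
        default: return "indeterminate or implementation-defined";
        }
}
```

 Printing the returned string from builds made with different flags would
at least confirm that the flags changed what was intended before any
timings are compared.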

GC> Oh...wait...no "=" is wanted after a one-hyphen "short" option:
GC>   git log -G='perform_fabs'   <-- fails
GC>   git log -G'perform_fabs'    <-- works

 Yes, this is annoying :-( I got bitten by it already too. I don't have
any good advice here, sorry.

GC> > The following trivial change is needed to fix it:
GC> [...]
GC> > -        rel_error = detail::perform_fabs
GC> > +        rel_error = std::abs
GC> 
GC> Thanks, but I really prefer to use std::fabs() for concinnity with existing
GC> lmi code. I wonder why the C++ committee made abs() a synonym: maybe just
GC> because they could? But it creates a gratuitous incompatibility with C, and
GC> if I unlearn the C distinction, then it'll be harder for me to write C.

 This is interesting to read after having read the following comment in
fenv_lmi.hpp just recently:

        /// Because this is C++, not C, no 'get-' and 'set-' lexemes are
        /// needed to simulate overloading

Isn't this exactly the same principle? In C++, "f" is not needed to
simulate overloading, so it shouldn't be used. IMO fabs() is a C-ism and
shouldn't be used in C++ code.

GC> Still, AFAICT this is the relevant change:
GC> 
GC> ----------------------------------8<----------------------------------
GC> -#else  // !(defined __GNUC__ && defined LMI_X86)
GC> -
GC> -// The round_X functions below work with any real_type-to-integer_type.
GC> -// Compilers that provide rint() may have optimized it (or you can
GC> -// provide a fast implementation yourself).
GC> -
GC> -template<typename RealType>
GC> -inline RealType perform_rint(RealType r)
GC> -{
GC> -#if defined LMI_HAVE_RINT
GC> -    return rint(r);
GC> -#else  // !defined LMI_HAVE_RINT
GC> -    throw std::logic_error("rint() not defined.");
GC> -#endif // !defined LMI_HAVE_RINT
GC> -}
GC> -
GC> -#endif // !(defined __GNUC__ && defined LMI_X86)
GC> ---------------------------------->8----------------------------------
GC> 
GC> and I'm not quite sure how replacing a cover function that simply
GC> delegated to C99 rint() with direct calls to std::rint() could
GC> make a difference

 No, sorry, this is not the right change. The change in the commit
54d250300e367b33af5719c40087e5536640fa1f which really broke it was to
remove

        template<typename RealType>
        inline RealType perform_rint(RealType r)
        {
            __asm__ ("frndint" : "=t" (r) : "0" (r));
            return r;
        }

and replace the calls to it with std::rint(). The old version worked
because it used an x87 instruction, which is governed by the x87 rounding
mode. The new version doesn't work because it uses an SSE instruction
(some _mm_cvtsd_xxx I think), but the rounding mode is not set for SSE
(this would require modifying the MXCSR register instead of the x87
control word).
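
 For comparison, here is a sketch of the standard C++11 way of doing this,
using std::fesetround() from <cfenv> instead of touching either control
register directly. It assumes that the implementation's fesetround()
updates whatever state std::rint() actually consults (on x86 this is
expected to cover both the x87 control word and MXCSR, but that is a
property of the C library, not something the code below can guarantee).
The function name rint_with_mode() is hypothetical.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: round 'value' to an integral value using the
// given rounding mode (FE_TONEAREST, FE_UPWARD, FE_DOWNWARD or
// FE_TOWARDZERO), then restore the previous mode.
double rint_with_mode(int rounding_mode, double value)
{
    int const saved_mode = std::fegetround();
    std::fesetround(rounding_mode);

    // 'volatile' discourages the compiler from folding the call at
    // compile time or moving it across the fesetround() calls.
    volatile double input = value;
    volatile double result = std::rint(input);

    std::fesetround(saved_mode);
    return result;
}
```

 With this, rint_with_mode(FE_UPWARD, 2.1) is expected to give 3.0 and
rint_with_mode(FE_DOWNWARD, 2.9) to give 2.0, independently of whether the
compiler generates x87 or SSE code.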

GC> This seems to be a good time to revisit some of this twenty-year-old
GC> low-level stuff. I think I have an IEC_559 patch somewhere that you
GC> proposed about half that many years ago.

 Yes, and I still have its branch in the (old, mirrored from svn) git
repository. However it could only be used as a blueprint by now and not
applied as is, because it was using C99 functions while now we can, and
should, use C++11 ones.

GC> >  Otherwise you'd either need to revert the std::rint() change or we'll have
GC> > to live with completely broken 64 bit (and MSVC, which also uses SSE)
GC> > versions which is IMHO not ideal.
GC> 
GC> I'd rather not live with that much breakage. I wonder whether there's
GC> some temporary fix that would be less drastic than reverting the rint()
GC> patch mentioned above. Wait--is the whole lmi build broken for those
GC> toolchains, or only these two unit tests?

 I don't see any halfway fix, unfortunately. Either we use SSE and then we
can't use any x87-specific code, or we don't.

 And while the build, per se, is not broken, lmi is broken in the sense
that the computations it performs presumably give incorrect results. Unless
the rounding mode is never changed while doing them? But this is probably
not the case, otherwise why would we have all the code dealing with it in
the first place...
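
 To make the concern concrete: the rounding mode affects ordinary
arithmetic, not just explicit rounding calls, so a mode that silently
fails to apply changes the last bit of results. The sketch below is an
assumption-laden illustration and not lmi code; rounding_mode_guard and
quotient_with_mode() are hypothetical names, and it relies on
std::fesetround() affecting the instructions the compiler actually emits.

```cpp
#include <cfenv>

// Hypothetical RAII helper: set a rounding mode for the current scope
// and restore the previous one on exit.
class rounding_mode_guard
{
  public:
    explicit rounding_mode_guard(int mode)
        :saved_mode_(std::fegetround())
    {
        std::fesetround(mode);
    }

    ~rounding_mode_guard() {std::fesetround(saved_mode_);}

    rounding_mode_guard(rounding_mode_guard const&) = delete;
    rounding_mode_guard& operator=(rounding_mode_guard const&) = delete;

  private:
    int saved_mode_;
};

// Hypothetical helper: divide under the given rounding mode. The
// 'volatile' variables keep the division from being folded at compile
// time or moved outside the guarded scope.
double quotient_with_mode(int mode, double numerator, double denominator)
{
    rounding_mode_guard guard(mode);
    volatile double n = numerator;
    volatile double d = denominator;
    volatile double q = n / d;
    return q;
}
```

 Since 1/3 is not exactly representable, quotient_with_mode(FE_UPWARD,
1.0, 3.0) should be strictly greater than quotient_with_mode(FE_DOWNWARD,
1.0, 3.0) when the mode is honored; if both calls return the same value,
the mode is not reaching the code being executed.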


 Anyhow, for me the important question is whether you'd like me to produce
a reasonable patch (right now I just have a dirty hack) allowing the
rounding tests to pass when not using x87 by replacing x87-specific code
with the standard functions (based on/inspired by my ~10 year old IEC 559
patch) or if it's not worth doing it, either because it won't get done at
all (which would be sad) or because you prefer to do it yourself?

 Thanks in advance,
VZ

