Re: [lmi] MinGW-w64 anomaly?


From: Vadim Zeitlin
Subject: Re: [lmi] MinGW-w64 anomaly?
Date: Tue, 27 Dec 2016 19:24:41 +0100

On Tue, 27 Dec 2016 17:22:17 +0000 Greg Chicares <address@hidden> wrote:

GC> Now, however, we're using C++11, which says [26.3.1/1] that <cfenv>
GC> "defines the macros", seemingly without qualification, except that
GC> [26.3.1/2] it is "the same as Clause 7.6 of the C standard", which seems
GC> to bring in the C99 "if and only if" qualification (depending on the
GC> meaning of the word "same").
GC> 
GC> Of course, I'd prefer to define
GC>  enum e_ieee754_rounding
GC> -    {fe_tonearest  = 0x00
GC> +    {fe_tonearest  = FE_TONEAREST
GC> and so on, but I guess that's not guaranteed to work, whereas the way
GC> you wrote it is reliably correct.

 To be honest, I've just copied the existing code from fenv_rounding()
itself, but planned to do exactly the above in the future. Thinking about
it again now, I realize that we indeed don't know the type of these values.
But I think

        enum e_ieee754_rounding : decltype(FE_TONEAREST)
                {fe_tonearest = FE_TONEAREST
                ...

should always work, shouldn't it?
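
A minimal sketch of this idea follows. It assumes that all four FE_* rounding macros are defined on the target platform, which C99 does not guarantee, and the enumerator names other than fe_tonearest are guesses following the same pattern. The unary plus is an addition of this sketch and not part of the suggestion above: if an implementation defined the macro as the name of an enumerator, decltype(FE_TONEAREST) would be an enumeration type, which is not a valid underlying type, whereas decltype(+FE_TONEAREST) is always the promoted integral type. current_rounding() is a hypothetical helper, not an lmi function.

```cpp
#include <cassert>
#include <cfenv>
#include <type_traits>

// Sketch only: enumerator names other than fe_tonearest are assumed.
// Unary plus forces integral promotion, so the underlying type is
// integral even if the macro expands to an enumerator.
enum e_ieee754_rounding : decltype(+FE_TONEAREST)
    {fe_tonearest  = FE_TONEAREST
    ,fe_downward   = FE_DOWNWARD
    ,fe_upward     = FE_UPWARD
    ,fe_towardzero = FE_TOWARDZERO
    };

static_assert
    (std::is_integral<std::underlying_type<e_ieee754_rounding>::type>::value
    ,"the underlying type must be integral"
    );

// Hypothetical helper: report the current rounding direction as the enum.
inline e_ieee754_rounding current_rounding()
{
    int const mode = std::fegetround();
    assert(0 <= mode); // fegetround() returns a negative value on failure
    return static_cast<e_ieee754_rounding>(mode);
}
```

With this definition the enumerators can be passed to std::fesetround() after a static_cast to int, without any translation table.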

GC> > Again, my idea is to reuse the existing LMI_IEC_559 code and always enable
GC> > it for platforms not using x87, where "using x87" means i386 code for
GC> > which the symbol indicating the use of SSE is not defined, i.e. enable
GC> > it for all the other platforms.
GC> 
GC> I guess that sounds right.

 Thanks for the confirmation!
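
For illustration only, the platform test described above might look like the following sketch. It assumes that the symbol in question is gcc's predefined __SSE2_MATH__ macro, which is defined when double arithmetic is done in SSE registers; the message does not name the symbol. The names uses_x87 and use_standard_fenv are hypothetical and are not lmi identifiers.

```cpp
// Sketch: decide at compile time whether double arithmetic goes
// through the x87 unit.
// Assumption: __SSE2_MATH__ is the relevant gcc predefined macro.
#if defined __SSE2_MATH__
    // x86 or x86_64 with SSE2 scalar math: x87 is not used.
    constexpr bool uses_x87 = false;
#elif defined __i386__ || defined __x86_64__
    // x86 family without SSE2 math: arithmetic is done by x87.
    constexpr bool uses_x87 = true;
#else
    // Any other architecture has no x87 unit at all.
    constexpr bool uses_x87 = false;
#endif

// Hypothetical switch: use the standard <cfenv> code path
// everywhere except on x87.
constexpr bool use_standard_fenv = !uses_x87;
```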

GC> The questions I have are:
GC>
GC> - For x86_64, does gcc allow no other option than SSE?

 It does allow using -mfpmath=387, but, a bit surprisingly, it still uses
at least some SSE instructions even then. E.g. consider this simplest
possible example program:

---------------------------------- >8 --------------------------------------
#include <cmath>

int main(int argc, char* argv[]) {
    double d = 1.0;
    return std::rint(d*argc);
}
---------------------------------- >8 --------------------------------------

I get the following assembly output by default (amazingly, I recognize the
huge decimal number below by heart now: it's 0x3ff0000000000000 or floating
point 1.0, as expected):

        movabsq $4607182418800017408, %rax
        movq    %rax, -8(%rbp)
        pxor    %xmm0, %xmm0
        cvtsi2sd        -20(%rbp), %xmm0
        mulsd   -8(%rbp), %xmm0
        call    rint
        cvttsd2si       %xmm0, %eax

where all of "pxor", "mulsd" and "cvttsd2si" are SSE instructions. And with
-mfpmath=387 I get

        movabsq $4607182418800017408, %rax
        movq    %rax, -8(%rbp)
        fildl   -20(%rbp)
        fmull   -8(%rbp)
        fstpl   -40(%rbp)
        movsd   -40(%rbp), %xmm0
        call    rint
        cvttsd2si       %xmm0, %eax

so the conversion and the multiplication were replaced by x87 instructions,
but cvttsd2si still remains (as does a movsd to load the argument of rint()
into %xmm0)... Of course, this is not a problem as cvttsd2si always
truncates, whatever the rounding mode, and it is present on all 64-bit
CPUs, so it's perfectly fine to use here. But it's still a bit strange to
have such a mix of x87 and SSE instructions. I guess the compiler knows
better than me what it's doing though.
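
The point about the final conversion can be checked with a small sketch like the one below: in C++ a conversion from double to int discards the fractional part whatever the current rounding direction is. truncate_under_mode() is a hypothetical helper written for this illustration, and the volatile variables are only there to discourage the compiler from folding the conversion at compile time.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: convert x to int while the given rounding
// direction is in effect, then restore the previous direction.
inline int truncate_under_mode(double x, int mode)
{
    int const saved = std::fegetround();
    int const rc = std::fesetround(mode);
    assert(0 == rc); // fesetround() returns zero on success
    (void)rc;        // avoid an unused-variable warning with NDEBUG

    volatile double input = x;                      // read at run time
    volatile int result = static_cast<int>(input);  // always truncates

    std::fesetround(saved);
    return result;
}
```

For example, truncate_under_mode(2.7, FE_UPWARD) yields 2 and truncate_under_mode(-2.7, FE_DOWNWARD) yields -2.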

 What I can definitely say is that using x87 in 64-bit code is very rare
and I'd be surprised if doing it didn't result in at least some
interoperability problems with other libraries (maybe including libm).

GC> - Whenever we set the SSE rounding direction, should we also set the
GC> x87 rounding direction compatibly?

 I don't think so. If we're just calling fesetround() instead of playing
games with hardware registers directly, the standard library must ensure
that everything, including trigonometric calculations, works correctly and
I have no reason to believe that libm has any bugs in this area.

 Moreover, I'm almost certain that libm doesn't use the x87 instructions
such as fsin, which could be affected by the x87 rounding mode, in the
first place, but rather C implementations which are supposed to be more
accurate and maybe even faster than the built-in instructions. And I
strongly suspect, although I'm not quite sure, that fsin and friends are
not affected by the current rounding mode anyhow. So, to summarize, I don't
think we should bother doing anything else than calling fesetround().
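
As a sketch of what relying on fesetround() alone looks like, the code below sets the rounding direction for the duration of a scope and lets std::rint() honour it, without touching any hardware register directly. rounding_guard and round_in_mode() are hypothetical names invented for this example and are not part of lmi; the volatile variables only discourage the compiler from evaluating rint() at compile time or moving it across the fesetround() calls.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical RAII helper: set the rounding direction on construction
// and restore the previous one on destruction.
class rounding_guard
{
  public:
    explicit rounding_guard(int mode)
        :saved_(std::fegetround())
    {
        std::fesetround(mode);
    }
    ~rounding_guard() {std::fesetround(saved_);}

    rounding_guard(rounding_guard const&) = delete;
    rounding_guard& operator=(rounding_guard const&) = delete;

  private:
    int saved_;
};

// Round x to an integral value in the given direction using only the
// standard <cfenv> and <cmath> facilities.
inline double round_in_mode(double x, int mode)
{
    rounding_guard guard(mode);
    volatile double input = x;
    volatile double output = std::rint(input);
    return output;
}
```

For instance, round_in_mode(2.5, FE_UPWARD) gives 3.0 while round_in_mode(2.5, FE_TONEAREST) gives 2.0, and the caller's rounding direction is unchanged afterwards.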

GC> >  Should I do it, test it and submit it as a proper patch/pull request?
GC> 
GC> Yes, please.

 Will do once you commit your outstanding changes,
VZ

