Re: [lmi] MinGW-w64 anomaly?


From: Greg Chicares
Subject: Re: [lmi] MinGW-w64 anomaly?
Date: Sun, 25 Dec 2016 11:59:28 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.4.0

On 2016-12-22 14:31, Vadim Zeitlin wrote:
> On Thu, 22 Dec 2016 01:24:58 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> On 2016-12-21 23:43, Vadim Zeitlin wrote:
> GC> > On Wed, 21 Dec 2016 22:49:50 +0000 Greg Chicares <address@hidden> wrote:
> ...
> GC> > GC> Testing some actual computation or computations instead seems like
> GC> > GC> a strange and roundabout way of performing the same test.
> GC> > 
> GC> >  Really? It seems like much more direct way to me: after all, we're not
> GC> > interested in preserving x87 control word, we just want to have the correct
> GC> > results. And for the code compiled to use SSE instead of x87 instructions
> GC> > these are not at all the same thing, which is the source of the problem.
> GC> 
> GC> Yet the underlying problem is avoided completely, AIUI, with SSE.
> 
>  To play the devil's advocate, in principle, it's possible for a rogue DLL
> to change the rounding mode and/or error handling bits in MXCSR register
> too, why not? So I still planned to have checks for these parts (which we
> can do using standard functions) even in non-x87 builds. But, of course, I
> wouldn't be terribly upset if we omitted them either, because in practice I
> have never heard about this happening (and FWIW I do remember hearing about
> problems due to unexpected x87 control word changes even before starting to
> work on lmi -- but this was a long, long time ago).

It's also possible in principle for another process to destroy the stack.
Only an OS that's broken by design would allow that, of course, but it's
possible.

> [...discussing how to test the changes...]
> GC> Oh. I was thinking that if we didn't do (1), then we wouldn't need (2)
> GC> to test it. Yet of course any change must be tested. Okay, I would say
> GC> that if you found no problems with the (all public) unit tests, and I
> GC> found none with the (proprietary) regression tests, that would be
> GC> strong evidence.
> 
>  I was afraid that the existing unit tests might not catch the problems due
> to lack of accuracy in the calculations, for example, so I thought that
> maybe I ought to run some illustration and check the results before and
> after the changes. But if the unit tests are sufficient, then all the
> better.
> 
>  However I still have the question about how I could measure performance of
> the code in a convenient and non-interactive way. Is there any test doing
> this? If not, maybe one should be added?

You could run 'sample.cns' and time that, e.g.:

time wine ./lmi_cli_shared --file=/opt/lmi/src/lmi/sample.cns --accept \
  --ash_nazg --data_path=/opt/lmi/data

Especially with 'wine', that might measure overhead more than performance,
so a larger census would probably be better. You could make a copy of
'sample.cns' and insert a couple hundred cells, for example. That's about
as good a test of performance as the regression test suite we normally run
(which requires proprietary data and tests more paths through the code).

> GC> What if we do this:
> GC> 
> GC> - Use IEC_559 for all purposes except fenv_validate(): i.e., for
> GC>   'round_to_test' as well as any future work that needs anything in
> GC>   the scope of IEC_559, such as twiddling the rounding direction.
> GC> 
> GC> - Retain the present code for fenv_validate() purposes only: it's
> GC>   proven code, so not touching it means introducing no possible
> GC>   error; and what it does is sadly outside the scope of IEC_559.
> GC>   Conditionalize all of the 'fenv_lmi*' code, and everything that
> GC>   uses it, on LMI_MSW and !LMI_SSE in addition to all conditionals
> GC>   already used--then, instead of being an integral part of lmi, it
> GC>   becomes merely support code for class MswDllPreloader, visible
> GC>   only when building a 32-bit lmi for msw with x87.
> GC> 
> GC> Would that make both of us happy?
> 
>  I think so, thank you. But I'm not sure we really need to use LMI_MSW
> preprocessor checks: even if this code is not really needed under other
> platforms (although, again, in principle nothing prevents a GTK+ theme or
> macOS printer driver or whichever else shared library happens to be loaded
> in the process from wreaking havoc with MXCSR), there is no harm in keeping
> it there either, and the fewer preprocessor checks the better IMO. The
> important thing is that it will keep testing x87 control word in the builds
> using x87 under MSW.

The problem with at least some known versions of msw is that they
failed to virtualize the x87 control word. Even though lmi runs in
its own process, loading another process could affect the CW in the
lmi process. I doubt there's any other example of an OS that fails
so insanely and goes years without being fixed. It should not be
lmi's job to protect itself against OS insanity, except in this one
known case where there was no alternative. I feel strongly that this
extraordinary fix for such a bizarre problem should be restricted to
the limited situation in which the problem is known to occur, namely,
32-bit msw only. I just find it hard to imagine that any application
checks the validity of OS context switches in idle time in any case
except 32-bit msw.
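
To make the intended restriction concrete, here is a rough sketch of the conditionalization, assuming LMI_MSW and LMI_SSE are simply defined or not defined. The macro LMI_X87_CW_GUARD and the function x87_guard_enabled() are hypothetical names used only for illustration; they are not existing lmi code.

```cpp
#include <cassert>

// LMI_MSW and LMI_SSE are the macro names used in this thread; how they
// are actually defined is an assumption here. LMI_X87_CW_GUARD is a
// hypothetical name invented for this sketch.
#if defined LMI_MSW && !defined LMI_SSE
#   define LMI_X87_CW_GUARD 1 // msw build using x87: keep the CW check
#else
#   define LMI_X87_CW_GUARD 0 // every other build: compile it out
#endif

// Hypothetical query: true only where the x87 control-word check
// would be compiled in.
inline bool x87_guard_enabled()
{
    return 0 != LMI_X87_CW_GUARD;
}
```

With something like this, any idle-time check could be wrapped in a single test of x87_guard_enabled(), and it would vanish on every platform other than msw with x87.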

The problem you hypothesize is different: it would occur in lmi's
own process. If we ever observe that, then the solution will be to
load the offending theme or printer driver during lmi initialization,
and then establish whatever post-initialization invariants we like.
The observed msw problem is of an altogether different order of
gravity: it's a defect in the OS's context switching.
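
For the in-process case, a post-initialization invariant of the kind mentioned above could probably be established and checked with standard <cfenv> functions alone. The following is only a sketch: both helper names are hypothetical, this is not the 'fenv_lmi*' code, and it covers only the rounding direction and the exception flags, not x87 precision control, which <cfenv> does not address.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: true if the rounding direction reported by the
// standard library is round-to-nearest.
inline bool rounding_is_to_nearest()
{
    return FE_TONEAREST == std::fegetround();
}

// Hypothetical helper: clear all floating-point exception flags and
// set round-to-nearest. Returns true if the rounding direction was
// set successfully (std::fesetround() returns zero on success).
inline bool reset_fenv_to_default()
{
    std::feclearexcept(FE_ALL_EXCEPT);
    return 0 == std::fesetround(FE_TONEAREST);
}
```

One way to use it would be to call reset_fenv_to_default() once, after every library of interest has been loaded, and then test rounding_is_to_nearest() wherever the invariant matters.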



