bug#44462: Problem with get_multilibs on macOS
From: Fred Wright
Subject: bug#44462: Problem with get_multilibs on macOS
Date: Sun, 8 Nov 2020 16:07:35 -0800 (PST)
User-agent: Alpine 2.23 (OSX 453 2020-06-18)
On Thu, 5 Nov 2020, Jacob Bachmeyer wrote:
Fred Wright wrote:
While investigating some issues in the libffi tests, I came across some
cases where errors are caused by DejaGnu's get_multilibs function.
First, thank you for reporting this issue. Mac OS X is beyond the set of
systems where we can run direct tests, so our awareness of problems with
DejaGnu on Mac OS X depends on bug reports from users.
For some unclear reason, this only seems to get invoked for C++ tests, but
when it is, there can be a problem, because under some circumstances it
invokes the compiler with the -dumpspecs option, which is a gcc-only option
that clang chokes on.
The irregular usage with only C++ tests seems very odd to me, and I will look
further into this as time permits. This may be a bug in the libffi testsuite
or a bug in DejaGnu.
Even when it's using gcc, it bloats the logfile with the verbose -dumpspecs
output, in order to determine something of highly questionable value.
That value may be less questionable for some targets, or for tests that need
to be run across each multilib target a GCC instance supports.
Even in the useful multilib cases, making that behavior unconditional
seems like a bad idea, since the client code may have its own iteration
across architectures, and doing both would either not work or explode the
multi-architecture handling from O(N) to O(N^2).
I traced the results of get_multilibs (as used by the libffi tests) in many
macOS versions, and even the "successful" cases seem to have questionably
useful results:
[...]
The "/usr/." result returned on 10.4, 10.5, and 32-bit 10.6 is the same as
what I see on Ubuntu 14.04, CentOS 7, and Fedora 25, though it's not clear
to me what it's supposed to represent.
I have a suspicion that this feature is designed to support testing with an
"experimental" compiler build that is not installed on the system and may be
useless with system compilers generally, or with Apple's compilers
specifically, if Apple does not use multilib.
Apple supports the concept of multi-architecture binaries, but not in the
same way that multilib does it (AFAIK). Macs can have "universal"
binaries, which are archives combining multiple per-architecture slices.
This is applicable to object files, shared libraries, and executables. If
the build setup allows it, it can be as simple as including multiple
architecture options in the compile command. E.g.:
cc -arch x86_64 -arch i386 -o hello hello.c
Under the hood, the compiler driver runs a separate compile/assemble for
each architecture, and then combines the object files. The linker
supports universal binaries directly.
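To make the per-architecture flow concrete, here is a minimal sketch of the manual equivalent of the one-line command above. The `universal_cmds` helper and the intermediate file names are hypothetical, invented for illustration. It only prints the commands instead of running them, and it assumes Apple's `lipo` tool is used to merge the thin files:

```shell
# Print (not run) a manual equivalent of a multi-arch compile.
# universal_cmds is a hypothetical helper: universal_cmds OUT SRC ARCH...
universal_cmds() {
    out=$1
    src=$2
    shift 2
    slices=
    for arch in "$@"; do
        # One ordinary single-architecture build per -arch option.
        echo "cc -arch $arch -o $out.$arch $src"
        slices="$slices $out.$arch"
    done
    # lipo merges the thin per-architecture files into one universal file.
    echo "lipo -create$slices -output $out"
}

universal_cmds hello hello.c x86_64 i386
```

On a Mac, running the printed commands and then `lipo -info hello` should list both architectures.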
With this arrangement, architecture-related conditionals in the source
code work just fine, but what *doesn't* work is having
architecture-related parameters in a configure script, which is
unfortunately not as uncommon as it ought to be.
Also note that when I build and test a 32-bit version of libffi on 10.9,
the get_multilibs result is still "./x86_64".
As I do not have access to Mac OS X, I cannot directly explain or verify
this, but I will look into get_multilibs, in part to document it in a future
release.
Since get_multilibs already has code to return an empty string in the
"remote" case (where it assumes this function won't work), I just added
code to unix.exp to set multitop to "" for all "darwin" targets, thereby
short-circuiting almost all of get_multilibs. That certainly fixes the
problem with the libffi tests, and doesn't change any non-Mac behavior,
though I don't know if that's the ideal fix. The whole get_multilibs
function looks pretty ugly anyway, and it's generally recognized that
relying on -dumpspecs is a bad idea.
It is most certainly not ideal. A better solution is probably to add a test
to get_multilibs to return an empty string if the compiler is not GCC. Of
course, if another compiler pretends to be GCC enough to pass that check, but
does not actually implement -dumpspecs, that is not our bug.
Limiting it to gcc would avoid actual failures, but wouldn't avoid
bloating the logfile with the humongous -dumpspecs output in the many
cases where the multilibs action isn't even wanted.
The meaning of "pretends to be gcc" isn't well-defined. It's not uncommon
to have a compiler named "gcc" which is really clang, largely because
there are so many projects that think that all compilers of interest are
named "gcc". And of course, clang tries to be highly gcc-compatible, to
facilitate switching to it, but not to the extent of implementing
-dumpspecs, which is derived purely from gcc's internal implementation
details, and was never intended to be used in this fashion.
The libffi test suite comes up with a "compiler_vendor" variable which
seems to be able to distinguish clang from gcc, though I haven't looked at
the details.
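One way such a check could work is to look at the compiler's `--version` text rather than its name. The sketch below is only an illustration and is not taken from libffi or DejaGnu. The `compiler_kind` helper is hypothetical, and it assumes that clang-based compilers mention "clang" in that output while real gcc prints its Free Software Foundation copyright notice:

```shell
# Classify a compiler from the text of its "--version" output.
# compiler_kind is a hypothetical helper, not part of DejaGnu or libffi.
compiler_kind() {
    case $1 in
        *clang*)                      echo clang ;;   # also matches "Apple clang"
        *"Free Software Foundation"*) echo gcc ;;
        *)                            echo unknown ;;
    esac
}

# Classify whatever "cc" is on this system, whatever it happens to be named.
compiler_kind "$(${CC:-cc} --version 2>/dev/null)"
```

A caller could then skip the -dumpspecs probe entirely unless the result is "gcc".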
Fixing get_multilibs properly would probably mean making it both highly
platform-specific and optional.
This issue on Mac OS X will probably be a known bug in 1.6.3 and fixed in
1.6.4.
I primarily tested my patch against the 1.6.2 release, since the current
master won't install from a non-git directory, and also has multiple
failures in its own tests (even on Linux). The patch is nearly identical
between the 1.6.2 and master cases, anyway.
Are we looking at the same current master? I have commit
3d62df24deedfb3c7c3e396a31b8ce431138eb49 here and all of the tests pass.
****These other problems are potential release blockers for 1.6.3.****
Can you file another bug report with the test failures and more information
about these issues?
I looked into this more closely and it's probably related to the non-git
issue. When running from a non-git directory, the configure script
reports a "fatal" error, but then goes on to complete with a zero exit
status and a more or less buildable setup, so you have to be paying close
attention to the output to notice.
If this is a typical hack to provide git-based extra information in
between-release version strings, it should have a fallback for the non-git
case. Consider the case of pushing all the git-tracked files to a test
system with git ls-files and rsync.
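A minimal sketch of such a fallback follows. It is an assumption about how a version string could be derived in general, not a description of what DejaGnu's configure script does. The `version_string` helper and the fallback value passed to it are hypothetical:

```shell
# Derive a version string, falling back when there is no git checkout.
# version_string is a hypothetical helper: version_string DIR FALLBACK
version_string() {
    dir=$1
    fallback=$2
    # "git describe" fails outside a repository (or if git is missing),
    # in which case the static fallback is used instead.
    if v=$(git -C "$dir" describe --tags --always 2>/dev/null) && [ -n "$v" ]; then
        echo "$v"
    else
        echo "$fallback"
    fi
}

version_string . 1.6.2
```

With this shape, a tree copied over without its .git directory still gets a usable version and no error.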
I can send the current patch, either as a bare email or as an attachment.
AFAIK, Savannah doesn't have the pull request / merge request concept.
This will need to be fixed in libgloss.exp, not unix.exp. I am putting my
foot down on fixing bugs in DejaGnu's own tree directly instead of hacking
around them like that.
Well, OK, but there seem to be other similar hacks in unix.exp, and if the
idea is that get_multilibs is completely useless on the Mac (which appears
to be the case with the current implementation, anyway), then disabling it
in the target-related code doesn't seem unreasonable.
Fred Wright