lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Mystic fix for so-attributes linker error


From: Greg Chicares
Subject: Re: [lmi] Mystic fix for so-attributes linker error
Date: Thu, 9 Mar 2017 14:59:45 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.6.0

On 2017-03-09 13:59, Vadim Zeitlin wrote:
> On Thu, 9 Mar 2017 02:46:59 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> That is most helpful. I had forgotten that, unlike the mingw.org 
> toolchain,
> GC> MinGW-w64 pretty much forces us to enable that kludge, as your attempt to
> GC> turn it off (below) demonstrates. These LMI_SO attributes should make
> GC> everything work in the absence of that kludge, but then we can't use
> GC> MinGW-w64's libstdc++.
> 
>  After some more tests, I realized that we could use -static-libstdc++ gcc
> option to avoid this problem.

Great--I didn't realize that worked with MinGW-w64.

> However disabling auto import still doesn't
> work, with
> ---------------------------------- >8 --------------------------------------
> diff --git a/workhorse.make b/workhorse.make
> index d4f28a4..5878aa3 100644
> --- a/workhorse.make
> +++ b/workhorse.make
> @@ -675,6 +675,7 @@ endif
>  
>  ifneq (,$(USE_SO_ATTRIBUTES))
>    actually_used_lmi_so_attributes = -DLMI_USE_SO_ATTRIBUTES 
> $(lmi_so_attributes)
> +  LDFLAGS += -Wl,--disable-auto-import -static-libstdc++
>  endif
>  
>  REQUIRED_CPPFLAGS = \
> ---------------------------------- >8 --------------------------------------
> I get the following errors:
> 
> main_wx_test.o: In function `~DocManagerEx':
> /opt/lmi/src/git/docmanager_ex.hpp:47: undefined reference to `vtable for 
> DocManagerEx'
> /opt/lmi/src/git/docmanager_ex.hpp:47: undefined reference to `vtable for 
> DocManagerEx'
> main_wx_test.o: In function `~Skeleton':
> /opt/lmi/src/git/skeleton.hpp:57: undefined reference to `vtable for Skeleton'
> /opt/lmi/src/git/skeleton.hpp:57: undefined reference to `vtable for Skeleton'
> /opt/lmi/src/git/skeleton.hpp:57: undefined reference to `vtable for Skeleton'
> /opt/lmi/src/git/skeleton.hpp:57: undefined reference to `vtable for Skeleton'
> /opt/lmi/src/git/skeleton.hpp:57: undefined reference to `vtable for Skeleton'
> main_wx_test.o:/opt/lmi/src/git/skeleton.hpp:57: more undefined references to 
> `vtable for Skeleton' follow
> collect2.exe: error: ld returned 1 exit status
> /opt/lmi/src/git/workhorse.make:791: recipe for target 'wx_test.exe' failed
> make[1]: *** [wx_test.exe] Error 1
> make[1]: Target 'all' not remade because of errors.

If the only errors pertain to 'main_wx_test', then the production
system is okay and only the GUI test is affected--so this is progress.

> On one hand, this is somewhat surprising because lmi_wx_shared.exe does get
> linked successfully and I don't see any important differences between the
> command used for linking it and for linking wx_test.exe. But OTOH it looks
> perfectly normal because both of these vtables are inside skeleton.dll and
> are not DLL-exported from it, so using them should fail without
> auto-import. I.e. the surprising part is actually that linking
> lmi_wx_shared.exe succeeds, not that linking wx_test.exe fails.

Okay. It really would seem that class Skeleton should be made visible
outside its shared library (i.e., in the msw-specific case, exported
from its DLL).

> GC> That is, of course, guaranteed to work. Probably we could just
> GC>   s/class/class LMI_SO/
> GC> globally
> 
>  I don't think this can work because there are also classes which are not
> part of liblmi.dll but are compiled into skeleton.dll, e.g. DocManagerEx
> appearing in the error messages above. Using LMI_SO with this class would
> be a mistake as it would end up being DLL-imported on use and even on
> definition (because skeleton.dll is compiled with -DLMI_USE_SO), which is
> definitely wrong.
> 
>  In fact, it's really not clear to me how can a single LMI_SO work with
> more than one DLL. I believe we really need per-DLL macros, i.e. add
> LMI_SKELETON_SO governed by LMI_SKELETON_{BUILD,USE}_SO, just as the
> existing LMI_SO value is determined by LMI_{BUILD,USE}_SO. Or am I missing
> something here?

I suspect you're right. The existing structure predates the cleavage
of the skeleton shared library from the main program of which it
originally formed a part.

> GC> I used to have various versions of two non-free compilers, but they
> GC> wouldn't work with C++11. Do you have other compilers with which
> GC> you could test this, without MinGW-w64's compulsory 'auto-import'?
> 
>  I use only static libraries in my MSVC build precisely in order not to
> have to deal with these DLL complications. But I could change/add a DLL
> build which could be used to check this, of course. This would be a good
> test, I think, because MSVC doesn't do any auto-linking and is perfectly
> picky about mixed up DLL import/export declarations. Should I do this?

That's a useful idea to keep in mind, but this one may be even better,
if only because I can do it too (without accepting an unacceptable EULA
that IIRC forbids using that proprietary compiler with free software).

>  But I also thought of something else: we could also change the default
> visibility to hidden for the Unix builds. This should expose all the
> problems due to missing/wrong LMI_SO usage too because ELF linker doesn't
> do (nor even could do) any auto-linking neither. Should I do this instead
> (or in addition to)?

I think that's a great idea.

>  Of course, please keep in mind that testing either of those approaches
> will almost certainly result in errors right now, that I will want to fix
> and submit patches fixing them to you. So there is also a question of
> knowing whether you want to have to deal with discussing/applying them
> right now. I do think that the danger of breaking anything is very small
> here, i.e. if it still compiles and links, it really should work in exactly
> the same way as before too.

I can make time for this now. Changing the definition of LMI_SO for
GNU/Linux builds does not risk perturbing the production system.

> GC> > make command line, hoping that it would build faster. As often, my 
> attempts
> GC> > to save time turned out to waste a lot more of it however as I got
> GC> > 
> GC> > 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x10):
>  undefined reference to `mc_enum<rounding_style>::read(std::istream&)'
> GC> > 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x14):
>  undefined reference to `mc_enum<rounding_style>::write(std::ostream&) const'
> GC> > 
> C:/opt/lmi/MinGW-4_9_1/bin/../lib/gcc/i686-w64-mingw32/4.9.1/../../../../i686-w64-mingw32/bin/ld.exe:
>  rounding_document.o: bad reloc address 0x14 in section 
> `.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]'
> GC> > collect2.exe: error: ld returned 1 exit status
> GC> > /opt/lmi/src/git/workhorse.make:794: recipe for target 'skeleton.dll' 
> failed
> GC> 
> GC> Trying that here, I get:
> GC> 
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x10):
>  undefined reference to `mc_enum<rounding_style>::read(std::istream&)'
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x14):
>  undefined reference to `mc_enum<rounding_style>::write(std::ostream&) const'
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x18):
>  undefined reference to `mc_enum<rounding_style>::all_strings() const'
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x1c):
>  undefined reference to `mc_enum<rounding_style>::cardinality() const'
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x20):
>  undefined reference to `mc_enum<rounding_style>::enforce_proscription()'
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x24):
>  undefined reference to `mc_enum<rounding_style>::ordinal() const'
> GC> 
> rounding_document.o:rounding_document.cpp:(.rdata$_ZTV7mc_enumI14rounding_styleE[__ZTV7mc_enumI14rounding_styleE]+0x28):
>  undefined reference to `mc_enum<rounding_style>::str(int) const'
> 
>  It's really strange that we're not getting the same results. I have no
> idea why I saw only the first 2 symbols and this mysterious "bad reloc"
> error afterwards which, apparently, prevented me from seeing all the other
> ones.

I don't know why we see different diagnostics. But it cannot work with
present HEAD, so I'm not gravely worried about this. It would be a real
problem if we make it work on one machine, and then it doesn't work on
another.

> GC> but what that suggests to me is that mc_enum is appropriately hidden,
> GC> but rounding_document.o accesses mc_enum's internals inappropriately.
> GC> In that case, breaking encapsulation globally would be a shame.
> 
>  AFAICS the problem is that mc_enum_base is DLL-exported but mc_enum<>
> itself is not. If we want to encapsulate all the methods above in
> liblmi.dll, then we shouldn't export mc_enum_base from it neither. But
> currently virtual methods such as all_strings(), declared with LMI_SO in
> the base class, but defined without it in the derived one, seem to create
> problems.

That's deliberate, and I tested it successfully with mingw.org's gcc
(probably most versions from 2.95 through 3.4.5) and also with comeau
and borland.

The idea is that (virtual) base::foo() is externally visible, while
(overriding) derived::foo() is not: then the external world can call
into the shared library through the base class, and the shared library
dispatches the call internally to the override.

> GC> Perhaps I should finally start doing routine GTK builds here, with ELF
> GC> visibility attributes.
> 
>  Notice that currently this wouldn't help because we don't use
> -fvisibility=hidden yet.

AIUI, we'd simply need to define LMI_SO appropriately for ELF.
I'm not sure whether the existing code is appropriate:

#       if defined LMI_BUILD_SO
#           define LMI_SO __attribute__((visibility("default")))
#       else  // !defined LMI_BUILD_SO
#           define LMI_SO
#       endif // !defined LMI_BUILD_SO

I wrote that after reading the first announcement of "visibility",
but never tested it.

> GC> At the moment, using MinGW-w64 only, I'm trying to maintain
> GC> encapsulation that I can no longer test. I still think it's
> GC> a good idea to maintain the strongest possible physical separation at
> GC> shared-library barriers. I've put considerable effort into designing
> GC> lmi this way, and I don't want to throw it overboard piece by piece.
> 
>  FWIW I agree, but with an important caveat: if it's important, it should
> be actually used, preferably by default. I.e. I think that we should either
> make USE_SO_ATTRIBUTES=1 build the default or drop it completely (the
> latter would be undesirable but IMO still better than having half-tested
> almost never used build mode which is just confusing).
> 
>  Why don't we use USE_SO_ATTRIBUTES=1 (and, ideally, --disable-auto-import)
> by default?

IIRC, the reason for this decision was that the mingw.org toolchain
worked much better this way around the turn of the century. There
should be an old message from me to the MinGW mailing list making
this case with quantitative measurements; I'm confident I could
find it if necessary. It certainly didn't have anything to do with
the auto-export misfeature; rather, it concerned "direct" linking
to an msw DLL (as opposed to linking through an import library).
IIRC, the crux was that an object file compiled without msw DLL
attributes could be linked both statically and dynamically, so
we didn't need distinct "export" and "import" versions of each
object.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]