bug-gnu-utils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: gettext patches for cygwin #4: accessing fields of exported structs/


From: Charles Wilson
Subject: Re: gettext patches for cygwin #4: accessing fields of exported structs/arrays
Date: Thu, 24 Nov 2005 10:39:33 -0500
User-agent: Thunderbird 1.5 (Windows/20051025)

Bruno Haible wrote:
Charles Wilson wrote:
A-HA! THERE's the compile-time error you've been seeing and I haven't! It's because I've been relying on auto-import, and you've been using declspec!

Yes. And actually I don't exactly recall which experiments I did with
the MSVC6 compiler and which with a gcc two or three years ago. The
option of using MSVC6 forced me to use __declspec, since the MSVC linker
doesn't have auto-import.

Right. The decision, then, is whether to "activate" the __declspec machinery, already added for msvc support, when building with mingw or cygwin. You've decided "yes", for a variety of reasons. Fair enough.

auto-import hides the compile-time gcc error AND causes both gcc and g++ to generate broken apps in these circumstances.

And furthermore the GNU ld doc says:
   "Use of the 'auto-import' extension will cause the text section of the
    image file to be made writable. This does not conform to the
    PE-COFF format specification published by Microsoft."
Which means that executables and DLLs compiled with this option will
consume much more memory.

Not THAT much more. As of gcc-3.3.1 on cygwin/mingw, a lot of const "variables" are placed in .rdata, which is NOT made writable. So the only extra cost of the auto-import "make .text writable" is multiple copies of *that section* (1 per process). .rdata is still shared across all processes, if I am not mistaken.

Either of these two facts is sufficient motivation for me to turn off
the auto-import.

Well, IMO that's an unwise position to take in general -- although it's probably okay for *gettext* since it only depends on libiconv and you maintain that as well. But if any of the tools depended on a third library, say zlib or libgdbm, ...

Let's pretend you don't maintain libiconv, and that libiconv as built on cygwin has DATA exports that are not declspec'ed --

Hmmm...from iconv.h:
extern int _libiconv_version;


That is, libiconv expects clients to use auto-import (which *should* be fine; after all, --enable-auto-import has been supported AND the default for over four years [see below].) Further, for the purposes of this discussion let's assume that none of the DATA exports from libiconv are "complex", so we don't muddy the water with --[en|dis]able-pseudo-runtime-reloc, etc.

So you (as gettext) disable auto-import because you're worried that in some cases it'll silently break. Cases, by the way, that are entirely under your control: e.g.

gettext code hypothetically contains:

#include <iconv.h>
int * const local_pointer_to_libiconv_version = &_libiconv_version;
  vs.
#include <iconv.h>
/* don't use 'int * const' because it breaks on Woe32 */
int * local_pointer_to_libiconv_version = &_libiconv_version;

Now, by enforcing --diable-auto-import, you've forced anybody who wants to compile gettext on cygwin to go and muck with their system's libiconv and recompile it with declspec'ing -- EVEN IF there's no real reason: gettext never ACTUALLY tickles the const/.rdata bug with respect to data imported from libiconv, because the current code doesn't, actually, try to stick &_libiconv_version into a local const variable.

IOW, in this hypothetical you will have broken gettext on cygwin, requiring the system integrator or the libiconv maintainer to change libiconv's code to suit yours (a client dictating its dependency's API?). And you'll have done so *for no good reason* -- only fear of what MIGHT happen (but doesn't with gettext's present code that interfaces with libiconv), and because of your aesthetic disapproval.

Not a good policy, in general. And frankly, it's not YOUR place to make *this* particular decision. Because using this flag can possibly force ripples across a platform, it's the platform integrator that should make this call. Imagine if the maintainers of gtk, over at gtk.org, decided to enforce --disable-auto-import for the cygwin platform. Here's the list of dependencies:

C:/cygwin/bin/cyggtk-x11-2.0-0.dll
  C:/cygwin/bin\cyggdk_pixbuf-2.0-0.dll
    C:/cygwin/bin\cygwin1.dll
      [snip]
    C:/cygwin/bin\cygglib-2.0-0.dll
      C:/cygwin/bin\cygiconv-2.dll
      C:/cygwin/bin\cygintl-3.dll
    C:/cygwin/bin\cyggmodule-2.0-0.dll
    C:/cygwin/bin\cyggobject-2.0-0.dll
  C:/cygwin/bin\cyggdk-x11-2.0-0.dll
    C:\cygwin\usr\X11R6\bin\cygX11-6.dll
    C:\cygwin\usr\X11R6\bin\cygXcursor-1.dll
      C:\cygwin\usr\X11R6\bin\cygXrender-1.dll
    C:\cygwin\usr\X11R6\bin\cygXext-6.dll
    C:\cygwin\usr\X11R6\bin\cygXfixes-3.dll
    C:\cygwin\usr\X11R6\bin\cygXft-2.dll
      C:/cygwin/bin\cygfontconfig-1.dll
        C:/cygwin/bin\cygexpat-0.dll
        C:/cygwin/bin\cygfreetype-6.dll
          C:/cygwin/bin\cygz.dll
    C:\cygwin\usr\X11R6\bin\cygXrandr-2.dll
    C:/cygwin/bin\cygpango-1.0-0.dll
    C:/cygwin/bin\cygpangoxft-1.0-0.dll
      C:/cygwin/bin\cygpangoft2-1.0-0.dll
  C:/cygwin/bin\cygatk-1.0-0.dll

So a decision over at gtk.org would ripple to the cygwin maintainers of ALL of these packages, which would then ripple to the their 'upstream' official maintainers. At the very least, to those packages that export DATA. Again, clients dictating the API/ABI of their dependencies?



However, since you maintain both gettext and libiconv, and I maintain the official cygwin packages for both gettext and libiconv, and none of the gettext tools have any additional dependencies, it's not such a big deal. You can coordinate the "real" gettext and libiconv packages, and I can coordinate the cygwin gettext and libiconv releases. So it's not an issue in this _specific_ case.



Oops, I spoke too soon: xgettext depends on expat, and expat only uses declspec if:

#ifdef XML_USE_MSC_EXTENSIONS
#define XMLIMPORT __declspec(dllimport)
#endif

It's certainly not the case that XML_USE_MSC_EXTENSIONS should be turned "on" for cygwin or mingw; they don't support __int64 or "#pragma warning" or the other things that COULD be included in the definition of Microsoft C Extensions. In fact, expat_external.h says

#if defined(_MSC_EXTENSIONS) && !defined(__BEOS__) && !defined(__CYGWIN__)
#define XML_USE_MSC_EXTENSIONS 1
#endif

Now, as it happens, expat has a PURE functional interface; it exports no variables. Thus, you can probably get away with --disable-auto-import even for xgettext. But if expat DID export DATA, then you'd be forcing the expat maintainer to change this header (for instance) to

#if defined( XML_USE_MSC_EXTENSIONS ) || defined( __CYGWIN__ )
#define XMLIMPORT __declspec(dllimport)
#endif

(Or technically, you'd force ME to try to pressure the *cygwin* expat maintainer to patch his code and release a new version)

And: gnu.gettext.GetURL.exe depends on zlib (which exposes are pure functional interface, so you get lucky there, too).



But, just think: you want to disable auto-import, so we have to go looking at the API of zlib and expat and libiconv to see if they export DATA, and if they get compiled with declspec or not -- all so that you can completely disable a feature *you're not even using in your current code*. This kind of ripple feels very very wrong to me.


============================================================================
# Add --disable-auto-import to the LDFLAGS if the linker supports it.
# GNU ld has an --enable-auto-import option, and it is the default on Cygwin
# since July 2005. But it two fatal drawbacks:

Not entirely true. "--enable-pseudo-runtime-reloc" has been the default since 7.2005. "--enable-auto-import" has been the default for much, much, longer than that. 7.2001?

#   - It produces executables and shared libraries with relocations in the
#     .text segment, defeating the principles of virtual memory.

Huh? I don't think this is phrased accurately. Maybe "reducing the memory consumption advantages to be gained by shared libraries"

#   - For some constructs such as
#         extern int var;
#         int * const b = &var;
#     it creates an executable that will give an error at runtime, rather than
#     either a compile-time or link-time error or a working executable.
#     (This is with both gcc and g++.) Whereas this code, not relying on auto-
#     import:
#         extern __declspec (dllimport) int var;
#         int * const b = &var;
#     gives a compile-time error with gcc and works with g++.
AC_DEFUN([gl_WOE32_DLL],
[
  AC_REQUIRE([AC_CANONICAL_HOST])
  case "$host_os" in
    mingw* | cygwin*)
      AC_MSG_CHECKING([for auto-import of symbols])
      AC_CACHE_VAL([gl_cv_ld_autoimport], [
        gl_save_LDFLAGS="$LDFLAGS"
        LDFLAGS="$LDFLAGS -Wl,--disable-auto-import"
        AC_TRY_LINK([], [], [gl_cv_ld_autoimport=yes], [gl_cv_ld_autoimport=no])
        LDFLAGS="$gl_save_LDFLAGS"])
      AC_MSG_RESULT([$gl_cv_ld_autoimport])
      if test $gl_cv_ld_autoimport = yes; then
        LDFLAGS="$LDFLAGS -Wl,--disable-auto-import"
      fi
      ;;
  esac
])
============================================================================

As long as the auto-import feature has these two drawbacks, it's a nice
spam-stopper for the cygwin mailing lists, against all the newbie questions
on the topic, but I cannot recommend it to serious programmers.

I think you're worrying too much about the second item (memory reqmts). But the first issue, silently broken applications, is a problem. IMO, gcc should be "taught" to take "const" variables that hold references to relocated addresses, and put THOSE into (writable) .text, not .rdata. Regular const "variables" can stay in .rdata, of course. (or, put all vars-holding-values-which-require-updates into .data, not .text nor .rdata; see above).

I agree that the goal of --enable-auto-import is the right one (because
it's precisely the limitations of Microsoft's original linker that led
to all this trouble, and modern languages need more dynamic linkers too).
But as long as the linker doesn't give an error message when something will
not work, I don't consider the implementation of --enable-auto-import
finished.

Yes.  See above.

The ld documentation lists four ways around this problem:

(1) -enable-runtime-pseudo-reloc linker switch

As I understand, -enable-runtime-pseudo-reloc only fixes one particular
problem that --enable-auto-import had in the past. The other two problems
mentioned above are still present.

True. (I say "one problem, and one disadvantage", not "two problems", but...)

(2) play games with volatility and constness; use explicit pointers, etc. This is truly ugly stuff.

Too intrusive.

Oh yeah.  And aesthetically?  Please.

(4) use a functional interface instead of trying to access variables directly. This is what I did.

Too intrusive as well.

Depends on how extensively variables are used as part of the API. On the other hand, my experience in the Bad Old Days porting readline showed me that sprinkling __declspec in front of every exported variable is rather intrusive as well -- especially as Chet rejected the patches 'cause win32 shlib support wasn't high on his list of priorities, and putting READLINE_IMPEXP in every vars' declaration was "too ugly for words" -- so I had to keep re-applying the declspec patches myself. Then along came auto-import; oh joyful, joyful day!

(3) declspec(__dll[im|ex]port) the enclosing variable. This is what you did.

It remains the only workable and recommendable solution for the moment.

No other solutions are workable? Clearly not; my patches *work*. You just don't like them. <g>

declspec is the only recommendable solution? That's an opinion (backed up with reasonable arguments, in this case, but still an opinion.) However, it's YOUR opinion, and since you da maintainer, your opinion is more relevant than, say, mine.

Okay, it appears that the g++ workaround ONLY works if the imported variable is declspec'ed. It does NOT help if you're using auto-import.

Yes, I agree.

Now about the other topic, functional interfaces:

1) The existing code is ANSI C. According to the principle "fix the
   broken code, not the correct code", it is the wrong place to
   modify po-lex.[h|c] because of problem that is in Woe32.
Ok, but my point was that (as of 0.14.5) you WERE 'fixing the correct code' in po-lex for certain platforms, with regards to po_gram_error[_at_line]:

#if (some compilers)
use 'valid ANSI C code' in a macro -- which had the effect of inducing clients to directly accesss blah blah blah
#else (other compilers)
    use function calls (also 'valid ANSI C code').

That was *existing* code in 0.14.5, not something I added. ...
I simply followed the precedent you set with the po_gram_error accesses

That was about ISO C 99 code, not ANSI C. ISO C is only 6 years old. It's
acceptable to make workarounds for compilers that don't implement ISO C 99
features (such as __VA_ARGS__ or <stdint.h>). But it's not acceptable for
me to change source code for compilers that don't implement the 15-year old
ANSI C.

Actually, I think that current cygwin/mingw gcc will handle the C99isms (e.g. in the po_gram_error* macros) just fine. In the past *I believe* that the macros induced direct access to "complex" exported vars (and some of my statements in this thread were made with that assumption) -- and THAT's why I originally coded cygwin to use the functions instead of the macros. However, in 0.14.5, it appears that the macros do NOT do that. So once again, out-of-date patches and out-of-date arguments. Sigh. I really do appreciate your patience with me.

So, okay -- po-lex.[h|c] macro-to-function change is probably not necessary for cygwin. But, we already said that: you've already decided that the macro-to-function change was _desirable_ for all platforms (incl. cygwin). As I said before, this particular part of this patch is overtaken-by-events.

But the 'access to "complex" exported variables' issue still remains with plural_tables[]. And, if one wanted to turn on --enable-pseudo-runtime-reloc (which now is the default, so you don't even need to 'turn it on'), even THIS issue would simply go away, with no code changes or declspec'ing required.

However, declspec'ing the variable (option 3 from the ld docu) should work too -- and obviously it does in your testing. *I* prefer using the auto-import features of gcc on cygwin and mingw (when they work); I find the whole

#if woe32-related && !building_static
#if building_dll
#define SOMETHING declspec(__dllexport)
#else
#define SOMETHING declspec(__dllimport)
#endif
#else
#define SOMETHING
#endif

thing ugly

Yes. Especially the "building_dll" flag is evil, because it goes against the
goals of gnulib. In gnulib, we have source code files which can be part of
many different packages. The "building_dll" could therefore be a
"building_gettextlib" in one case, "building_coreutilslib" in another,
"building_greplib" in a third one etc.

Those are naming details. I'm calling the whole structure ugly, no matter what the macros and flags are 'named'.

Functionally, the BUILDING_GETTEXTLIB doesn't bother me; clients don't need to know about it. It's the OTHER part of the structure that's awkward: clients have to know about GETTEXT_STATIC: a client must compile with -DGETTEXT_STATIC if it's going to link with -Bstatic -lintl. Ick.

With chains of dependent libraries it's even worse: proliferation of -D*_STATIC flags (and what of dependencies like this:

 foo app <---- bar lib <---- baz lib
  |
  +<---- baz lib

If foo links bar statically, then it must care to link/compile against baz in the same way that bar compiled against baz. MORE information-mingling.

With auto-import, this all goes away.

Fortunately declspec(__dllexport) is actually optional, and the following
does it as well:

#if woe32-related && !building_static
#define SOMETHING declspec(__dllimport)
#else
#define SOMETHING
#endif

Better, but it still won't win any beauty pageants; still has the GETTEXT_STATIC problem. With regards to non-gnu compilers, I don't think it can be done any more cleanly; you always will be mixing information/reqmts between compile- and link- phase.

Well, unless you want to use msvc's #pragmas. Double ick. Non portable. evil.

PS: I hope someone will also fix the other problem with --auto-imports,
namely that it makes the text segment writable. IMO the way to fix it is
to actually change the instructions that contain the references, not just
the references themselves. Such as:

    extern int a; int foo () { return a * a; }            [in the .c file]
    -->
    movl a, %eax                                          [in the .obj file]
    -->
    movl __imp_a, %eax; movl (%eax), %eax                 [in the .exe file]

Of course the instructions become longer this way; you need to rely on the
"linker relaxation" that - fortunately - is already implemented in GNU ld.


This is a much harder problem to solve, really, than simply making gcc smarter about which section it puts "const" variables into. The latter I could work on, the former is above my pay grade. :-) Plus, the latter solves an actual BUG -- currently gcc/ld on cygwin can produce non-functional exe under the appropriate conditions. The former is just an annoyance; "too many" copies of .text loaded into memory.

Plus, if you're going to modify how gcc figures out which section to put variables into, you might as well go all the way: marking .text writable was a shortcut. Why not put all variables holding values that must be updated by relocation machinery, and are currently placed in .text, into .data instead? Ditto vars in .rdata meeting that specification? Then both .text and .rdata can stay non-writable.

You'll dup storage for each process, for all vars-holding-address-needing-reloc, but that is exactly what you should do: all processes need their own copy because the actual relocation address (of the var exported by the DLL) will be different for each process -- because the virtual memory address of the DLL may be different in each process.

--
Chuck




reply via email to

[Prev in Thread] Current Thread [Next in Thread]