Re: [lmi] Compiling takes longer with gcc-4.9.2
From: Greg Chicares
Subject: Re: [lmi] Compiling takes longer with gcc-4.9.2
Date: Mon, 04 Jan 2016 02:59:23 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.3.0
On 2016-01-03 22:13, Vadim Zeitlin wrote:
> On Thu, 31 Dec 2015 04:57:31 +0000 Greg Chicares <address@hidden> wrote:
>
> GC> On 2015-12-22 00:28, Greg Chicares wrote:
> GC> > My local tree contains makefile changes to use MinGW-w64 gcc-4.9.2,
> GC> > with '-std=c++11', some new warning flags, and various other
> GC> > adjustments.
> GC>
> GC> Hardware change:
> GC> old: dual E5520 ; sata 6Gbps, SSD = samsung 850 pro
> GC> new: dual E5-2630 v3; sata 3Gbps, HDD = wd caviar black WD2003FZEX
>
> Do you plan to move the SSD from the old to the new machine later? It
> would be interesting to see if it affects the results.
I stated the specifications incorrectly. Actually they are:
old: dual E5520 ; sata 3Gbps, HDD = wd caviar black WD2003FZEX
new: dual E5-2630 v3; sata 6Gbps, SSD = samsung 850 pro
> GC> > First I compile wx and wxPdfDoc:
[...]
> I've redone the benchmarks on a Linux machine (this is a notebook with i7
> 4712HQ with 16GiB RAM and 1TB SSD) to compare the relative speed of
> compiling inside the VM and cross-compiling.
Is your VM...a msw-7 guest, in a vmware host?
> wxMSW build           Time (s)   Size (MB)   CPU use (%)
> =========================================================================
> MinGW 3.4.5                341          82
> Cygwin 4.9.2               424         463           637
> MinGW-w64 4.9.1            429         462
> Debian 4.9.1               322         567           629
>
> The expected thing here is that cross-compiling is significantly faster
> than building in the VM, as I was hoping. In fact, for me it is faster than
> using 3.4.5 inside the VM, so it's already a gain. However clearly the
> situation is not the same for you and me as I don't see nearly a 3 times
> slowdown for the in-VM builds in the first place.
Perhaps due to different VMs: mine is a msw-xp guest in a kvm-qemu host.
> A half-surprise is that
> the native compiler is not faster than the Cygwin one on this machine,
> unlike in my previous tests, and I'm not sure why that is.
That's so strange that I wonder whether the native and Cygwin compilers
were built with similar options. A wild guess: maybe the native one
targets a more conservative architecture, like i586.
> But this is just the beginning, not the end, of our benchmarking story.
> As you remember, I was surprised by the lack of usefulness of precompiled
> headers when cross-compiling lmi. So let's see if they help when building
> wxWidgets itself, i.e. configure it with the extra --disable-precomp-headers
> option. Here are the numbers:
>
> wxMSW no PCH          Time (s)   Size (MB)   CPU use (%)
> =========================================================================
> Cygwin 4.9.2               396         134           711
> MinGW-w64 4.9.1            355         134
> Debian 4.9.1               269         136           737
>
> This was a huge surprise to me as I didn't expect to have such big gains
> from _disabling_ the PCH. But we gain ~7%, 17% and 16% respectively from
> just doing this. So, for me, simply disabling the precompiled headers
> brings 4.9.1 almost in line with 3.4.5 (there is just a 4% difference which
> is really not much especially considering that we're speaking of -O2 builds
> and that 4.9 should optimize much, much better than 3.4.5 -- and it would
> be interesting to run lmi benchmarks to check how much exactly later) when
> building inside the VM and cross-compiling is more than 20% faster than
> using the old compiler. The only regression is in the size of the build
> files, but notice that the size of the DLLs produced is roughly the same,
> so it's not really a problem.
By "The only regression", I take it that you mean:
MinGW 3.4.5 341s 82MB
Debian 4.9.1 269s 136MB <-- 54 MB more
but, as you say, that's not a problem at all: I'd gladly trade 54MB for
a one-minute improvement in build time.
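The percentages quoted above can be checked with a quick sketch. Python is an arbitrary choice here and `percent_faster` is a hypothetical helper; the times are copied from the two wxMSW tables.

```python
def percent_faster(before_s, after_s):
    """Percentage reduction in build time going from before_s to after_s."""
    return 100.0 * (before_s - after_s) / before_s

# Build times in seconds, copied from the wxMSW tables quoted above.
with_pch = {"Cygwin 4.9.2": 424, "MinGW-w64 4.9.1": 429, "Debian 4.9.1": 322}
no_pch = {"Cygwin 4.9.2": 396, "MinGW-w64 4.9.1": 355, "Debian 4.9.1": 269}
mingw_345_s = 341  # MinGW 3.4.5 inside the VM, with PCH

for name, before in with_pch.items():
    gain = percent_faster(before, no_pch[name])
    print(f"{name}: {gain:.1f}% faster without PCH")

# Cross-compiling without PCH, compared with the old compiler in the VM.
cross_gain = percent_faster(mingw_345_s, no_pch["Debian 4.9.1"])
print(f"Debian 4.9.1 vs MinGW 3.4.5: {cross_gain:.1f}% faster")
```

A negative result from `percent_faster` would mean the second configuration is slower, which is how the 4.9.1-versus-3.4.5 in-VM comparison comes out.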
> Still, the most shocking discovery was that PCH had such a huge negative
> effect in the first place. At this stage I was seriously doubting my
> sanity, so I decided to rebuild wxGTK to check if it's affected by the PCH
> in the same way. And the answer was emphatically not:
>
> wxGTK                 Time (s)   Size (MB)   CPU use (%)
> =========================================================================
> default                    121         902           759
> no PCH                     184          73           771
>
> There is certainly a huge space penalty for using PCH, but it is also 33%
> faster (which is still less than I thought but at least is positive). So,
I anticipate that I'll be building in tmpfs, so I guess I don't need to
care too much about the size of the build directory. But still, it's more
than ten times as big with PCH. Does that suggest that gcc's implementation
of PCH isn't very good, at least with C++? Does msvc have such enormous
disk overhead with PCH?
> after thinking about this for a bit, I realized that the difference could
> be due to using --enable-monolithic for lmi but not for the default builds,
> so I decided to try without it, using 4.9.1 cross-compiler (as it's the
> fastest):
>
> wxMSW multilib        Time (s)   Size (MB)   CPU use (%)
> =========================================================================
> default                    210        1434           729
> no PCH                     240         100           772
>
> As you can see, there is still a huge space difference, but at least now
> the PCH build is ~12% faster. Perhaps more importantly, the multilib build
> is also ~11% faster than the monolithic one without PCH. So if you want
> faster builds, it could be worth no longer using the monolithic library.
IIRC, we chose 'monolithic' to solve some awful problem. Let's see...here's
the earliest reference on the mailing list:
http://lists.nongnu.org/archive/html/lmi/2005-08/msg00008.html
It had to do with passing exceptions across dll boundaries.
I guess libstdc++ is a shared library now, so perhaps that problem no longer
exists. However:
- '--enable-monolithic' is a deep and ancient habit and I hesitate to
break it;
- I might do an incremental 'make' of lmi dozens of times a day, but
I don't build wx as often as once a month;
so I'm not inclined to change this.
> Notice that it would also be worth using the --without-opengl configure
> option in either case as lmi doesn't use wxGLCanvas, but this is a small gain.
But that's a small, safe change, so it's probably worthwhile.
> However notice that all these numbers are for the default compiler C++
> dialect which is still C++03 for g++ 4.9 (it has changed to C++11 since
> 5.0). Adding CXXFLAGS=-std=c++11 changes the numbers for all the builds,
> and not for the better. To give an idea of it, here are the results for
> wxGTK:
>
> wxGTK C++11           Time (s)   Size (MB)   CPU use (%)
> =========================================================================
> default                    176        1210           769
> no PCH                     267          73           774
That accords with my findings: apparently C++11 is a much more expensive
language to compile than C++03, at least with gcc.
And perhaps gcc's PCH implementation is tuned for C, not C++? I see that
it's noticeably faster, but it takes...seventeen times as much space?
I'm leery of this. What if it needs a hundred times as much space with
the next C++ dialect? Anyway, I think the statistics you posted sometime
in the last month suggest that PCH doesn't help when building lmi.
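To put rough numbers on both points, here is a small sketch over the two wxGTK tables quoted above. The dictionary layout and the helper names `size_ratio` and `time_ratio` are made up for illustration; only the figures come from the tables.

```python
# (build time in s, build directory size in MB), from the wxGTK tables above.
wxgtk = {
    "C++03": {"pch": (121, 902), "no_pch": (184, 73)},
    "C++11": {"pch": (176, 1210), "no_pch": (267, 73)},
}

def size_ratio(dialect):
    """How many times larger the build directory is with PCH than without."""
    return wxgtk[dialect]["pch"][1] / wxgtk[dialect]["no_pch"][1]

def time_ratio(kind):
    """C++11 build time divided by C++03 build time; kind is 'pch' or 'no_pch'."""
    return wxgtk["C++11"][kind][0] / wxgtk["C++03"][kind][0]

for dialect in wxgtk:
    print(f"{dialect}: PCH build directory is {size_ratio(dialect):.1f}x larger")
for kind in ("pch", "no_pch"):
    print(f"{kind}: C++11 takes {time_ratio(kind):.2f}x as long as C++03")
```

On these figures the C++11 time penalty is about the same with or without PCH, while the space overhead of PCH grows with the newer dialect.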
> So C++11 support doesn't come for free, in build time terms. I'd still
> like to enable it because it brings important benefits in terms of
> development time, i.e. productivity.
I agree. C++98 was slower to compile than C99, too. Computers keep getting
faster; I don't.
> And at least in relative terms, using
> 4.9.1 as C++11 cross-compiler is only 15s slower than using 3.4.5 inside
> the VM, so I'm pretty confident it will still be faster for you on your new
> machine than on the old one and hence I hope that we can still start using
> it.
Yes, let's focus on that.
> GC> > Now I do a complete rebuild of lmi, which I very recently measured
> GC> > with mingw.org's native gcc-3.4.5 as follows:
[gcc-4.9.2 with c++11 slows that down more than new hardware speeds it up]
> I think cross-compiling should be even more beneficial for you because you
> should be able to use -j16 without problems then.
I hope it'll scale well to thirty-two. If each gcc instance uses 300MB RAM,
that's 10GB, leaving plenty for a RAM disk.
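As a back-of-envelope sketch of that estimate: the 300MB-per-instance figure is only my guess, not a measurement, and `total_ram_gb` is a hypothetical helper.

```python
def total_ram_gb(jobs, mb_per_job=300):
    """Estimated RAM in GB (decimal) for `jobs` concurrent gcc instances.

    mb_per_job defaults to the guessed 300MB per instance; it is an
    assumption, not a measured value.
    """
    return jobs * mb_per_job / 1000.0

for jobs in (4, 16, 32):
    print(f"--jobs={jobs}: about {total_ram_gb(jobs):.1f} GB")
```

Measured peak memory per compiler process can be substituted for the default to see how much would really be left over for a RAM disk.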
> GC> > I can't figure out why the best result comes from '--jobs=4'. If the
> GC> > number of CPUs isn't the bottleneck, what is? disk? RAM? CPU cache?
> GC>
> GC> The hardware comparison suggests to me that it's not disk-bound or
> GC> CPU-bound. I guess it's just the 32-bit guest OS.
>
> FWIW for me it is CPU-bound. All 8 (logical) CPUs are pegged during the
> compilation and 1 of them remains at 100% use throughout the configure and
> linking steps too.
Perhaps I can eventually use '-fuse-ld=gold' and do most of my testing
with a wxGTK build. And sometime in the future we might try to find a
way to make msw builds link faster.
> The slightly worse news is that cross-compiling all the dependencies is a
> bit tricky and I ran into several problems doing it. It's supposed to be as
> simple as just using --host=i686-w64-mingw32 but in practice there are some
> bugs preventing this from working and I'm going to post some notes about
> what I did to avoid them a bit later (unless you tell me you don't need
> them because you've already figured all this out while I was running the
> endless build benchmarks).
>
> Please let me know how you would like to proceed and what else you
> think would be interesting to do.
I haven't even tried cross-compiling lmi yet. It would be enormously helpful
to me if you could figure out how to make that work.