Re: [lmi] gcc -flto
From: Greg Chicares
Subject: Re: [lmi] gcc -flto
Date: Sat, 24 Dec 2016 22:41:02 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.4.0
On 2016-12-24 18:37, Vadim Zeitlin wrote:
> On Sat, 24 Dec 2016 14:14:31 +0000 Greg Chicares <address@hidden> wrote:
>
> GC> On 2016-12-19 15:10, Vadim Zeitlin wrote:
> GC> [...]
> GC> > BTW, another thing that I thought about while discussing this: the -flto
> GC> > option also came up and I wrote that it indeed allowed the compiler to
> GC> > compute the result at compile-time in this simple example, but that this
> GC> > wouldn't work in the real program. However now I'm not so sure: if you're
> GC> > using pow() just to build the cache of the powers of 10 for not too many
> GC> > exponents, wouldn't gcc be indeed smart enough to precompute all of them at
> GC> > compile-time? Of course, lmi doesn't use LTO currently, but perhaps it
> GC> > could be worth testing turning it on and checking how it affects the
> GC> > performance? We can clearly see that it allows for impressive optimizations
> GC> > in simple examples and while nothing guarantees that it would be also the
> GC> > case in real code, it might be worth trying it out.
> GC>
> GC> It seemed simple enough to try.
> ... snip ...
> GC> and without any "$coefficiency" parallelism, with '-flto' we get:
> GC>
> GC> /opt/lmi/src/lmi[0]$time make system_test
> GC> System test:
> GC> make system_test 119.40s user 17.83s system 93% cpu 2:27.30 total
> GC>
> GC> while without '-flto' it's:
> GC>
> GC> /opt/lmi/src/lmi[0]$time make system_test
> GC> System test:
> GC> make system_test 120.00s user 17.86s system 93% cpu 2:28.21 total
> GC>
> GC> Improvement: (148.21 - 147.30) / 148.21 = six tenths of a percent, which
> GC> doesn't justify significantly slower builds and giving up '-ggdb'. Alas:
> GC> I really hoped to put those idle cores to good use when linking.
>
> Yes, thanks for doing this but the results are very underwhelming. Being
> optimistic, this could indicate that lmi code is already modularized so
> well that there is nothing to be gained by using LTO.
>
> If you're experimenting with these options, I wonder if it might be useful
> to build with -fprofile-generate and then use "make system_test" to
> generate the data to be used with -fprofile-use. Could this perhaps give
> some at least slightly more exciting results?
>
> Probably not, but who knows...
I haven't yet reverted the experimental makefile changes from earlier
in this thread, so it's easy to try this.
/opt/lmi/src/lmi[0]$make clean
rm --force --recursive /opt/lmi/src/lmi/../build/lmi/Linux/gcc/ship
/opt/lmi/src/lmi[0]$make debug_flag= gprof_flag="-fprofile-generate" $coefficiency install check_physical_closure >../log 2>&1
/opt/lmi/src/lmi[0]$make system_test
System test:
Now I manually remove everything in the build directory except the
116 '.gcda' files that total 2.7 MB, and...
/opt/lmi/src/lmi[0]$time make gprof_flag="-fprofile-use" $coefficiency install check_physical_closure >../log 2>&1
make gprof_flag="-fprofile-use" $coefficiency install check_physical_closure 1165.13s user 60.12s system 2309% cpu 53.050 total
/opt/lmi/src/lmi[0]$time make system_test
System test:
make system_test 107.10s user 16.91s system 92% cpu 2:14.13 total
Compared to the result above without any novel optimization:
> GC> make system_test 120.00s user 17.86s system 93% cpu 2:28.21 total
(148.21 - 134.13) / 148.21 = 9.5% faster
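
For anyone who wants to recheck the arithmetic, here is a minimal Python sketch that converts the "M:SS.ss" totals printed by 'time' into seconds and recomputes both comparisons against the 2:28.21 baseline. It is not part of lmi; the helper names to_seconds and improvement are made up for illustration, and the timings are simply the ones quoted in this thread.

```python
def to_seconds(elapsed: str) -> float:
    """Convert an elapsed time such as '2:28.21' (or plain '53.050') to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        # Each colon-separated field is worth 60 times the one after it.
        seconds = seconds * 60 + float(part)
    return seconds


def improvement(baseline: str, candidate: str) -> float:
    """Fraction of the baseline wall-clock time saved by the candidate build."""
    base = to_seconds(baseline)
    return (base - to_seconds(candidate)) / base


if __name__ == "__main__":
    baseline = "2:28.21"  # 'make system_test' without any novel optimization
    runs = [("-flto", "2:27.30"), ("-fprofile-use", "2:14.13")]
    for label, total in runs:
        print(f"{label}: {improvement(baseline, total):.1%} faster")
```

With the timings above this reports about 0.6% for '-flto' and about 9.5% for '-fprofile-use', matching the figures worked out by hand.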
That seems worthwhile. I'll try to work out a way to use this for
regular distribution.