Re: [lmi] Benchmarking: gcc-8 beats gcc-10 soundly?
From: Greg Chicares
Subject: Re: [lmi] Benchmarking: gcc-8 beats gcc-10 soundly?
Date: Sat, 19 Sep 2020 23:41:53 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0
On 2020-09-19 20:37, Greg Chicares wrote:
> On 2020-09-19 15:48, Vadim Zeitlin wrote:
>> On Sat, 19 Sep 2020 15:15:48 +0000 Greg Chicares <gchicares@sbcglobal.net>
>> wrote:
>>
>> GC> It looks like gcc-10 gives us slower lmi binaries. Picking
>> GC> the third '--selftest' scenario as an index of performance
>> GC> (results in microseconds--less is better):
>> GC>
>> GC>   gcc-10   gcc-8   ratio
>> GC>   ------   -----   -----
>> GC>   102659   84947   1.21   32-bit
>> GC>    50121   37410   1.34   64-bit
>> GC>
>> GC> The fourth scenario is even worse:
>> GC>
>> GC>    33250   20654   1.61   32-bit
>> GC>    24616   13009   1.89   64-bit
>
> With -O3, the 64-bit build performs thus on those two scenarios:
> naic, ee prem solve : 5.001e-02 s mean; 49710 us least of 20 runs
> finra, no solve : 2.483e-02 s mean; 24580 us least of 41 runs
> Thus, the -O3 to -O2 speed ratio is
> 49710 / 50121 = .992
> 24580 / 24616 = .999
> which isn't worth the extra build time (82.89 vs 72.76 seconds).
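
The ratios quoted above can be rechecked with a few lines of Python. This is only a sketch: the dictionary and function names are made up for illustration, and the figures are simply copied from this thread.

```python
# Timings in microseconds ("least of N runs") for the 64-bit gcc-10 build,
# copied from the figures quoted in this thread.
o2_us = {"naic, ee prem solve": 50121, "finra, no solve": 24616}
o3_us = {"naic, ee prem solve": 49710, "finra, no solve": 24580}


def speed_ratios(new, old):
    """Return new/old for each scenario; values below 1.0 mean 'new' is faster."""
    return {name: new[name] / old[name] for name in old}


ratios = speed_ratios(o3_us, o2_us)

# Build times in seconds, also as quoted: 82.89 (-O3) vs 72.76 (-O2).
build_time_ratio = 82.89 / 72.76

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
print(f"build time: {build_time_ratio:.3f}")
```

On these numbers the run-time gain is under one percent, while the build takes roughly fourteen percent longer.
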
'-O3 -march=native' actually seems worse than '-O3', at least
for 32-bit binaries:

  -march   above
  103175 vs 102659 [worse]
   33517 vs  33250 [worse]

Of course that seems counterintuitive: detecting my CPU and
generating optimized code for it has to be better--but only
if the promised improvements are for real. There's something
really wrong here.
Here's how I got the '-O3 -march=native' numbers:
/opt/lmi/src/lmi[0]$git checkout -- workhorse.make
/opt/lmi/src/lmi[0]$sed -i workhorse.make -e's/O2/O3 -march=native/'
/opt/lmi/src/lmi[0]$grep 'O[1-3]' workhorse.make
optimization_flag := -O3 -march=native -fno-omit-frame-pointer
/opt/lmi/src/lmi[0]$env |grep LMI_
LMI_COMPILER=gcc
LMI_TRIPLET=i686-w64-mingw32
/opt/lmi/src/lmi[0]$make clean
rm --force --recursive /opt/lmi/gcc_i686-w64-mingw32/build/ship
/opt/lmi/src/lmi[0]$time make $coefficiency --output-sync=recurse install check_physical_closure 2>&1 | tee eraseme | less -SN
make $coefficiency --output-sync=recurse install check_physical_closure 2>&1
1814.65s user 80.32s system 41% cpu 1:16:32.36 total
tee eraseme 0.01s user 0.00s system 0% cpu 1:16:33.49 total
less -SN 0.07s user 0.00s system 0% cpu 1:16:44.26 total
/opt/lmi/src/lmi[0]$wine /opt/lmi/bin/lmi_cli_shared.exe --accept --data_path=/opt/lmi/data --selftest
Test speed:
naic, no solve : 6.830e-02 s mean; 66448 us least of 15 runs
naic, specamt solve : 1.127e-01 s mean; 111507 us least of 9 runs
naic, ee prem solve : 1.041e-01 s mean; 103175 us least of 10 runs
finra, no solve : 3.472e-02 s mean; 33517 us least of 29 runs
finra, specamt solve: 7.767e-02 s mean; 73799 us least of 13 runs
finra, ee prem solve: 7.001e-02 s mean; 69129 us least of 15 runs
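
To avoid copying figures by hand, the '--selftest' output can be parsed mechanically. The following is a rough sketch, not part of lmi: the regular expression assumes the line layout shown above, and `parse_selftest` and `baseline_us` are hypothetical names. The baseline holds the two 32-bit gcc-10 figures quoted earlier in this thread.

```python
import re

# '--selftest' output as pasted above (32-bit build, '-O3 -march=native').
SELFTEST_OUTPUT = """\
naic, no solve : 6.830e-02 s mean; 66448 us least of 15 runs
naic, specamt solve : 1.127e-01 s mean; 111507 us least of 9 runs
naic, ee prem solve : 1.041e-01 s mean; 103175 us least of 10 runs
finra, no solve : 3.472e-02 s mean; 33517 us least of 29 runs
finra, specamt solve: 7.767e-02 s mean; 73799 us least of 13 runs
finra, ee prem solve: 7.001e-02 s mean; 69129 us least of 15 runs
"""

# Assumed layout: "<scenario> : <mean> s mean; <least> us least of <runs> runs"
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*(?P<mean>\S+) s mean;"
    r"\s*(?P<least>\d+) us least of\s*(?P<runs>\d+) runs\s*$"
)


def parse_selftest(text):
    """Map each scenario name to its least time in microseconds."""
    least_us = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the assumed layout are skipped
            least_us[match.group("name")] = int(match.group("least"))
    return least_us


# Hypothetical baseline: the 32-bit gcc-10 figures quoted earlier in the thread.
baseline_us = {"naic, ee prem solve": 102659, "finra, no solve": 33250}

least_us = parse_selftest(SELFTEST_OUTPUT)
slowdown = {name: least_us[name] / base for name, base in baseline_us.items()}
for name, ratio in slowdown.items():
    print(f"{name}: {least_us[name]} vs {baseline_us[name]} ({ratio:.3f})")
```

Both ratios come out slightly above 1, i.e. marginally slower than the baseline, though a difference of under one percent may well be within measurement noise.
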
>> I've already seen performance regressions in newer g++ versions, but I
>> don't think I've seen anything nearly like 89% slowdown, so it's indeed
>> very astonishing.
>
> I had the thought that perhaps this is a MinGW-w64 snafu, which
> would explain why they haven't officially released anything
> beyond 8.x yet. Yet the bugzilla report doesn't seem to specify
> a platform, while the phoronix link in that report specifies:
> | Ubuntu 20.04 with the Linux 5.8 kernel
>
> I guess I'd better try the flags phoronix tested:
> | "-O3 -march=native", and "-O3 -march=native -flto"
> Right now, lmi looks like the "SciMark" benchmark here:
>
> https://www.phoronix.com/scan.php?page=article&item=gcc-10900k-compiler&num=2
> so maybe this will resolve the anomaly.
Nope.
With LTO, the 'product_files' binary fails, but that's not too
surprising given its historical problems documented in
'workhorse.make'. Here's the first and last of several errors
for the record (but ignore them and skip to the next section):
i686-w64-mingw32-g++ -o product_files.exe alert_cli.o generate_product_files.o
main_common.o main_common_non_wx.o my_db.o my_fund.o my_prod.o my_proem.o
my_rnd.o my_tier.o liblmi.dll -L . -L /opt/lmi/local/gcc_i686-w64-mingw32/lib
-L /opt/lmi/local/gcc_i686-w64-mingw32/bin -lexslt -lxslt -lxml2
-Wl,-Map,product_files.exe.map
/usr/bin/i686-w64-mingw32-ld:
/tmp/product_files.exe.f0GLip.ltrans12.ltrans.o:<artificial>:(.text+0x17dc):
undefined reference to `std::pair<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
>::pair(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >, std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > >&&) [clone .lto_priv.0]'
/usr/bin/i686-w64-mingw32-ld:
/tmp/product_files.exe.f0GLip.ltrans12.ltrans.o:<artificial>:(.text+0x2b64):
undefined reference to
`std::vector<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char
const*, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > > >,
std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
> > >
>::vector(std::vector<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char
const*, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > > >,
std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char
const*, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > > > > > const&) [clone .lto_priv.0]'
collect2: error: ld returned 1 exit status
[Next section] However, 'skeleton.dll' also fails to build with LTO,
and that's where lmi's calculations reside so it's crucial:
i686-w64-mingw32-g++ -o skeleton.dll -shared about_dialog.o alert_wx.o
census_document.o census_view.o database_document.o database_view.o
database_view_editor.o default_view.o docmanager_ex.o file_command_wx.o
gpt_document.o gpt_view.o group_quote_pdf_gen_wx.o icon_monger.o
illustration_document.o illustration_view.o input_sequence_entry.o
main_common.o mec_document.o mec_view.o msw_workarounds.o multidimgrid_any.o
multidimgrid_tools.o mvc_controller.o mvc_view.o pdf_command_wx.o
pdf_writer_wx.o policy_document.o policy_view.o preferences_view.o
previewframe_ex.o product_editor.o progress_meter_wx.o rounding_document.o
rounding_view.o rounding_view_editor.o
single_choice_popup_menu.o skeleton.o system_command_wx.o text_doc.o
text_view.o tier_document.o tier_view.o tier_view_editor.o transferor.o
view_ex.o wx_checks.o wx_table_generator.o wx_utility.o liblmi.dll wx_new.dll
-L . -L /opt/lmi/local/gcc_i686-w64-mingw32/lib -L
/opt/lmi/local/gcc_i686-w64-mingw32/bin -lwxcode_mswu_pdfdoc-3.1 -L
/opt/lmi/local/gcc_i686-w64-mingw32/lib -L
/opt/lmi/local/gcc_i686-w64-mingw32/lib -lwx_mswu-3.1-i686-w64-mingw32
-mwindows -lexslt -lxslt -lxml2 -Wl,-Map,skeleton.dll.map
/usr/bin/i686-w64-mingw32-ld: input_sequence_entry.o (symbol from
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn868_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
multiple definition of `wxTextCtrlBase::SetValue(wxString const&)';
census_view.o (symbol from
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn424_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
first defined here
/usr/bin/i686-w64-mingw32-ld: input_sequence_entry.o (symbol from
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn868_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
multiple definition of `non-virtual thunk to wxTextCtrlBase::SetValue(wxString
const&)'; census_view.o (symbol from
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn424_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
first defined here
/usr/bin/i686-w64-mingw32-ld: input_sequence_entry.o (symbol from
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn868_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
multiple definition of `non-virtual thunk to wxTextCtrlBase::SetValue(wxString
const&)'; census_view.o (symbol from
plugin):(.gnu.linkonce.t._ZN14wxTextCtrlBase8SetValueERK8wxString[__ZThn424_N14wxTextCtrlBase8SetValueERK8wxString]+0x0):
first defined here
collect2: error: ld returned 1 exit status
make[1]: *** [/opt/lmi/src/lmi/workhorse.make:931: skeleton.dll] Error 1
It looks like gcc's LTO is brittle.