Re: [lmi] Horrible std::regex performance
From: Greg Chicares
Subject: Re: [lmi] Horrible std::regex performance
Date: Fri, 15 Jul 2016 23:41:03 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.8.0

On 2016-07-15 21:54, Vadim Zeitlin wrote:
> On Mon, 11 Jul 2016 22:18:23 +0000 Greg Chicares <address@hidden> wrote:
[...]
> GC> Even if this makefile line:
> GC> @-$(TEST_CODING_RULES) *
> GC> is a bottleneck, perhaps it could be parallelized.
>
> Before doing this at the C++ level, note that without any changes
> to the code, GNU parallel can be used to parallelize the execution at the
> shell level, e.g.
>
> % git ls-files|fgrep -v /|parallel ./test_coding_rules > /dev/null
>
> (fgrep is used to avoid checking the files in subdirectories as "*" above
> doesn't do it and ">/dev/null" is used to avoid tons of summary output).
> Doing this brings down the time to 2.1s for the current version and 9.0s
> for the std::regex one under Linux. Unsurprisingly, it doesn't really help
> with the MSW boost::regex version running under WINE as it's already
> blazingly fast and the extra process startup overhead makes it only slower.
> It does help with the std::regex version under WINE, but it's still much
> slower at ~30s. Unfortunately I don't see a comparable speedup when using
> parallel under Cygwin: the best I can get is a 10% improvement, which is
> not really that significant.
Okay, that idea--running 'test_coding_rules' in parallel--doesn't seem
good enough. Thirty seconds under WINE is better than ninety-three, but
much worse than one point two:
> Platform   Compiler   boost::regex   std::regex   PCRE
> ----------------------------------------------------------------------------
> Linux      gcc 4.9         5.9          27.7       1.9
> WINE       gcc 4.9         1.2          93.5
I really have to stop using this Cygwin VM and move everything to Debian.
I'm not sure whether I'll want to run this program in WINE or in native
GNU/Linux, but either way the performance with std::regex is too poor.
Besides, I'm not sure GNU parallel will do the right thing. I think
that would require...
http://lists.nongnu.org/archive/html/lmi/2016-07/msg00011.html
| restructuring 'test_coding_rules.cpp', notably because it summarizes
| statistics for all files, so it might be more attractive to parallelize
| that file itself, e.g. by threading.
But I can't readily see that for myself, because I don't have GNU
parallel installed in Cygwin here, and I really don't want to upgrade
Cygwin at this moment. Please tell me if I'm missing something and
GNU parallel would magically get the summary statistics right.
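
To make the threading alternative concrete, here is a minimal sketch of
checking files concurrently inside one process while still producing a
single summary. It is not taken from 'test_coding_rules.cpp': the 'stats'
struct, check_one(), and check_all() are hypothetical names, the "check"
is a stand-in that counts lines and treats any tab as a defect, and file
contents are passed as in-memory strings to keep the example small.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical per-file statistics; the real program keeps its own set.
struct stats
{
    std::size_t files   = 0;
    std::size_t lines   = 0;
    std::size_t defects = 0;

    // Merging is plain addition, so results can be combined in any order.
    stats& operator+=(stats const& z)
    {
        files   += z.files;
        lines   += z.lines;
        defects += z.defects;
        return *this;
    }
};

// Stand-in for the real checks (assumption): count newlines, and
// treat every tab character as one defect.
stats check_one(std::string const& contents)
{
    stats z;
    z.files = 1;
    for(char c : contents)
        {
        if('\n' == c) {++z.lines;}
        if('\t' == c) {++z.defects;}
        }
    return z;
}

// Check each file in its own asynchronous task, then merge the
// per-file results in the calling thread, so that no statistic is
// ever written by two threads at once.
stats check_all(std::vector<std::string> const& all_contents)
{
    std::vector<std::future<stats>> pending;
    pending.reserve(all_contents.size());
    for(auto const& s : all_contents)
        {
        // 's' refers to an element of 'all_contents', which outlives
        // every task because all futures are collected below.
        pending.push_back
            (std::async(std::launch::async, [&s] {return check_one(s);})
            );
        }

    stats total;
    for(auto& f : pending)
        {
        total += f.get();
        }
    return total;
}
```

Calling check_all() on the contents of every file yields one 'stats'
object from which a single summary can be printed. By contrast, separate
processes started by GNU parallel would each compute totals only for the
files they were given. The sketch starts one task per file for simplicity;
with hundreds of files, limiting the number of concurrent tasks would
probably be preferable.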
What does the PCRE measurement above mean? Did you translate
'test_coding_rules.cpp' to use the PCRE C API instead of C++ regex?