Re: [lmi] Results of comparison of {boost, std}::regex, CTRE and PCRE
From: Greg Chicares
Subject: Re: [lmi] Results of comparison of {boost, std}::regex, CTRE and PCRE
Date: Wed, 16 Jun 2021 00:25:36 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0
On 6/10/21 9:36 PM, Vadim Zeitlin wrote:
>
> I think I finally know enough to suggest the best replacement for
> Boost.Regex in test_coding_rules.cpp, so even though I didn't fully finish
> everything yet, I'd like to already summarize my results so far to see if
> you agree with my choice.
TL;DR: Yes, I agree with your choice.
> 0. Original f9276c05f (Rework minimum bounds for solves, 2021-05-27).
> 1. Using std::regex (https://github.com/vadz/lmi/commit/2026e7227).
> 2. Using CTRE (https://github.com/vadz/lmi/commit/67b7cf33c).
> 3. Using PCRE (https://github.com/vadz/lmi/commit/f1a35217b).
>
> +------------------+------+------+------+------+
> | Compiler\Version |    0 |    1 |    2 |    3 |
> |------------------+------+------+------+------+
> | gcc 11           | 10.5 | 17.8 |  1.3 |  1.6 |
> | clang 12         |  6.1 |  238 |   24 |  1.7 |
> +------------------+------+------+------+------+
>
> (result for (1) for clang is _not_ a typo).
(clang, 1): They simply lag behind the major improvement seen
between gcc-10 and gcc-11.
(clang, 0) vs. (clang, 2): It looks like clang optimizes the old
boost code better than gcc does, but the CTRE code not nearly as well.
(*, 3): Just out of curiosity--did you build PCRE separately
with each compiler, or do both measurements use the same PCRE
binary built with a single compiler?
> As previously mentioned, all the results can be improved by a factor of
> 2-3 by using GNU parallel, here are the timings
[...snip demonstration that you have well characterized...]
> And the last table I'd like to show is the one for compilation time of
> test_coding_rules.o:
>
> +------------------+------+------+------+------+
> | Compiler\Version |    0 |    1 |    2 |    3 |
> |------------------+------+------+------+------+
> | gcc 11           |  4.8 |  8.3 | 42.9 |  4.2 |
> | clang 12         |  4.3 |  5.6 | 33.0 |  3.3 |
> +------------------+------+------+------+------+
I'm sure CTRE will continue to evolve, but overcoming the
tenfold build-time advantage of PCRE is a daunting challenge.
[...even though CTRE runs fastest, with gcc at least...]
> However CTRE has a number of disadvantages too:
[...snip six meaty bullet points...]
> So the only reasonable choices, IMO, are either sticking with std::regex
> and living with its bad performance, maybe by partially compensating it
> with GNU parallel, or switching to PCRE.
I agree: the evidence shows that we should switch to PCRE.
> The latter has a number of advantages:
[...three meaty bullet points...]
> Of course, it does have 2 disadvantages too:
>
> 1. It requires PCRE library.
> 2. It needs a C++ wrapper as using PCRE C API directly is too painful.
>
> For (2), I've written a ~500 line (including a lot of blank lines and
> comments) pcre_regex.hpp header
[...]
> I don't know if you'd like to make this header part of lmi or handle it as
> an external dependency, but in any case I'm ready to maintain it.
I tend to think that, for now at least, we should include it in lmi
directly (with your name on the GPL copyright). Someday you might
want to treat it just like xmlwrapp; in that case, lmi could treat
it as a library, but that doesn't have to be done right now.
> For (1), we can rely on the system package under Linux, but we'd need to
> compile PCRE ourselves under MSW. This is not difficult to do, but, as I
> mentioned before, I'd like to also switch to using PCRE for wxRegEx, and if
> I do this, we could get the compiled PCRE library as a byproduct of
> building wx without doing anything extra in lmi itself.
This is all good. It is probably most expeditious to cause the lmi
makefiles to use only a pc-linux-gnu build of 'test_coding_rules'.
At least on this continent, we're all running only GNU/Linux, and
have no need for an msw binary.
> Solving both of these problems shouldn't take that long, but I'd still
> like to ask for your agreement before continuing to work on it.
I agree.
> To summarize, the current state is:
>
> - I have a branch with ready to be submitted changes with some minor
> improvements to the current version, that could (and should) be merged
> in any case.
Okay, I'm ready for that whenever you like.
> - I have a commit replacing boost::regex with std::regex that is also
> (almost) ready to be submitted -- I just need to test it a bit more.
Well, I believe I did ask for that, but now your data have persuaded
me that we should forsake that line of inquiry and just use PCRE.
> - I also have not quite finished, but already working, changes using PCRE
> that I need to finalize before submitting them.
Sounds great. Thanks.