Re: [lmi] Continuing deboostification with removing dependency on Boost.Regex


From: Vadim Zeitlin
Subject: Re: [lmi] Continuing deboostification with removing dependency on Boost.Regex
Date: Sat, 29 May 2021 01:36:11 +0200

On Fri, 28 May 2021 20:37:39 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:

GC> On 5/28/21 4:20 PM, Vadim Zeitlin wrote:
GC> > On Fri, 28 May 2021 14:14:16 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:
[...]
GC> > GC> Does 'test_coding_rules' become fast enough if we rewrite
GC> > GC> at least some of its expressions using this library?
GC> > 
GC> >  It almost certainly will, yes.
GC> 
GC> If you would like to carry out the p1433r0-ization, I'll certainly
GC> prioritize reviewing it.

 I will, thanks for confirming this.
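
 (For concreteness, a p1433r0/CTRE rewrite of one such expression might look
something like the sketch below, assuming C++20 and the single-header
<ctre.hpp>; the pattern and the function name are invented for illustration
and are not taken from test_coding_rules.)

    #include <ctre.hpp>
    #include <string_view>

    // Illustrative only: flag lines ending in blanks. The pattern is
    // compiled at compile time, so there is no run-time regex construction.
    inline bool has_trailing_blank(std::string_view line)
    {
        return static_cast<bool>(ctre::search<"[ \t]+$">(line));
    }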

GC> >  Does this mean that we shouldn't even attempt parallelizing the checks?
GC> > It should be an almost guaranteed order-of-magnitude win (because
GC> > modern systems have O(10) CPUs), so I'd strongly consider doing it
GC> > independently of anything else. Do you object to this?
GC> 
GC> Interesting question. In principle, of course it's a great idea,
GC> and I compile code in parallel all the time. In practice, the
GC> difficulty [or so I thought] is that "external" parallelization:
GC>   test_coding_rules$(EXEEXT) src/*
GC> would lose the summary statistics that the program gathers and
GC> prints, which I'd much rather preserve; so it would need to be
GC> parallelized "internally", e.g., via threads. But this is the
GC> twenty-first century, and we have threads in the standard library.

 Yes, and this looks like a task well suited to the limited thread support
that the standard library provides, as we never need to cancel threads or
wait for anything other than all of them terminating.
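
 (A minimal sketch of that, with hypothetical names: 'statistics' and
'check_one_file' below stand in for whatever test_coding_rules actually
computes per file, and launching one task per file is only for brevity; a
real implementation would bound the number of concurrent tasks.)

    #include <future>
    #include <string>
    #include <vector>

    struct statistics {long lines = 0; long defects = 0;};

    // Stub: the real function would run all the coding-rule checks.
    statistics check_one_file(std::string const& /* filename */)
    {
        return statistics {};
    }

    statistics check_all(std::vector<std::string> const& files)
    {
        std::vector<std::future<statistics>> futures;
        for(auto const& f : files)
            futures.push_back(std::async(std::launch::async, check_one_file, f));
        // The only synchronization needed: wait for every task to finish,
        // folding each per-file result into the overall summary.
        statistics total;
        for(auto& fut : futures)
            {
            statistics s = fut.get();
            total.lines   += s.lines;
            total.defects += s.defects;
            }
        return total;
    }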

GC> My only objection would be that this takes effort to code. However...
GC> 
GC> ...stepping back and taking a fresh look, those statistics are:
GC>       692 source files
GC>    196647 source lines
GC>       277 marked defects
GC> which could be generated by a tiny regex-free C++ program (or a
GC> couple of shell commands),

 Yes, I really wouldn't bother with writing a separate program for this, as
grep can give us all we need here.
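
 (For the record, the "tiny regex-free C++ program" could be as small as the
sketch below; the "??" defect marker is only an assumption for illustration,
so substitute whatever token the sources actually use.)

    #include <fstream>
    #include <iostream>
    #include <string>

    // Count files, lines, and marked defects in the files named on the
    // command line, e.g.:  count_statistics src/*
    int main(int argc, char* argv[])
    {
        long files   = 0;
        long lines   = 0;
        long defects = 0;
        for(int i = 1; i < argc; ++i)
            {
            std::ifstream ifs(argv[i]);
            if(!ifs) continue;
            ++files;
            for(std::string line; std::getline(ifs, line);)
                {
                ++lines;
                if(line.find("??") != std::string::npos) ++defects;
                }
            }
        std::cout << files   << " source files\n"
                  << lines   << " source lines\n"
                  << defects << " marked defects\n";
    }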

GC> so maybe I was mistaken, and we could
GC> use the builtin parallelism of 'gnu make -jN' to run a new
GC> 'test_coding_rules_without_summary_statistics' against an arbitrary
GC> list of files independently, without threading.

 But how would you share the list of files among all the copies of this
program? I don't know how to do it from make, and I don't think it can be
done optimally from outside the program (inside the program we'd have a
simple work queue that would keep however many threads we have as fully
occupied as possible).
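
 (A sketch of such a work queue, again with hypothetical names: a fixed pool
of threads, one per core, pulls file indices from a shared atomic counter,
so every core stays busy until the list is exhausted, and the main thread
has nothing to do but join them all at the end.)

    #include <algorithm>
    #include <atomic>
    #include <cstddef>
    #include <string>
    #include <thread>
    #include <vector>

    // Stub: the real function would run the checks for one file.
    void process_one_file(std::string const& /* filename */) {}

    void run_checks(std::vector<std::string> const& files)
    {
        std::atomic<std::size_t> next {0};
        auto worker = [&]
            {
            for(;;)
                {
                std::size_t i = next.fetch_add(1);
                if(files.size() <= i) return;
                process_one_file(files[i]);
                }
            };
        unsigned int n = std::max(1u, std::thread::hardware_concurrency());
        std::vector<std::thread> pool;
        for(unsigned int t = 0; t != n; ++t)
            pool.emplace_back(worker);
        for(auto& th : pool)
            th.join();
    }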

GC> So the only hesitation I had evaporates, and the suggestion becomes
GC> compelling.

 Please correct me if I'm wrong, but you still seem to be against actually
changing test_coding_rules to use threads, and I don't know how to
parallelize it otherwise, so ideally this still needs to be clarified.

 But any parallelization will only come later, i.e. my order of sub-tasks
here is:

1. Rebase std::regex patch on master and benchmark it.
2. Try using CTRE.
3. Parallelize test_coding_rules.

 I think this makes sense because, if you're interested in using CTRE
anyhow, there is no need to try (3) before (2) just to see whether things
become fast enough: we don't know by exactly how much, but we can be quite
sure that CTRE will still be faster, and it might even turn out to be fast
enough not to require (3) in practice. But please correct me if I'm wrong.

 Thanks,
VZ
