Re: [lmi] Request for review

lmi

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Request for review

From:	Greg Chicares
Subject:	Re: [lmi] Request for review
Date:	Thu, 12 Jan 2017 10:14:11 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.5.1

On 2017-01-11 23:32, Vadim Zeitlin wrote:
> On Wed, 11 Jan 2017 23:23:56 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> I'm trying to figure out a solid universal rule for this modernization.
> GC> What do you think about transforming:
> GC> 
> GC>   for(const_iterator i...)   -->   for(auto const& k: ...)
> GC>   for(      iterator i...)   -->   for(auto      & k: ...)
> GC> 
> GC> That seems simple and logical, to me at least.

Notably, both 'auto' variables are references; there's no plain 'auto' above.

>  Yes, it does to me too, and this is what I basically used for the changes of
> https://github.com/vadz/lmi/pull/52
> 
> GC> But it doesn't quite fit our practice so far: we have ranged for-loops
> GC> only in 'rate_table*', and...
> GC> 
> GC> $grep --no-filename 'for(.*: ' *.?pp |sed -e's/^ *//' |sort |less -S
> GC> for(auto const& i: index_)
> GC> for(auto const& j: table_names)
> GC> for(auto const& not_field: known_not_fields)
> GC> for(auto num: numbers)
> GC> for(auto num: numbers)
> GC> for(auto num: numbers)
> GC> for(auto v: values)
> GC> for(auto v: values)
> GC> for(auto v: values)
> GC> for(auto v: values_)
> GC> for(auto& e: index_by_number_)
> GC> for(auto& v: values_)
> GC> for(soa_field const& f: soa_fields)
> GC> 
> GC> Skipping the last line because it lacks 'auto'
> 
>  No idea why, to be honest. I think it should use "auto" too...

But consider the
  https://github.com/vadz/lmi/pull/52
changes to insert_spaces_between_words() in 'census_view.cpp':

[clang-tidy changes]
https://github.com/vadz/lmi/commit/d1fd3ae2e222c6d7a9edd1c9124ef2935e2918a5.patch
-    std::string::const_iterator i;
-    for(i = s.begin(); i != s.end(); ++i)
+    for(auto c: s)

[your refinements]
https://github.com/vadz/lmi/commit/6afb131f2d5d16aadb3b00dfc852d0cfba6949b6.patch
-    for(auto c: s)
+    for(char c: s)

There, clang-tidy used 'auto', as it does almost all the time. I've saved those
two patches as '52a.patch' and '52b.patch' respectively; analyzing the clang 
one:

/srv/chroot/cross-lmi/opt/lmi/src/lmi[0]$grep '[+] *for(' a00/patches/52a.patch 
|wc -l
118
/srv/chroot/cross-lmi/opt/lmi/src/lmi[0]$grep '[+] *for(' a00/patches/52a.patch 
|sed -e'/for(auto/d' |wc -l
6

we see that clang-tidy uses 'auto' in 112 of 118 cases, and builtin types
{char, int, double} in the other 6. For '52b.patch' [your refinements],
'auto' is used in 24/32 cases; in the other 8, you use {fs::path const&,
char&, char, int, double}. And 8/32 is much greater than 6/112.

Focusing on the insert_spaces_between_words() change:

     std::string::const_iterator i;         <-- original
     for(i = s.begin(); i != s.end(); ++i)  <-- original
-    for(auto c: s)  <-- clang-tidy
+    for(char c: s)  <-- refinement

I think your refinement may be preferable at least in the sense that
I don't have to wonder whether it's okay for it not to be a reference:

-    for(auto c: s)  <-- why not 'auto&' or 'auto const&'?
+    for(char c: s)  <-- obviously shouldn't be 'char&' or 'char const&'

Alternatively, I wouldn't have to wonder about that if it were written:

+    for(auto const& c: s)

IOW, perhaps that the variable should have a specific non-reference
builtin type when the correct type is obvious, and otherwise a reference
type 'auto' (const or non-const as the case may be).

Maybe that's an ill-chosen example because it's not directly obvious
that std::string::const_iterator is char, so consider this clang-tidy
change instead:

-    for
-        (std::vector<double>::const_iterator vi = v.begin()
-        ;vi != v.end()
-        ;++vi
-        )
+    for(double vi : v)

I support your whitespace change to it:

-    for(double vi : v)
+    for(double vi: v)

You didn't change double to auto, and maybe I wouldn't either.
OTOH, this conflicts with the general advice to write 'auto' wherever
possible, so let me propose a choice of two rules and ask which you
prefer:

Rule 1:
- for(std::vector<T>::iterator ...
+ for(T& ...
- for(std::vector<T>::const_iterator ...
+ for(T ...         <-- if T is a builtin type
+ for(T const& ...  <-- otherwise

Pro: at time of modernization, we often know 'T'
Pro: plain T is obviously optimal for builtin types
Con: not modern because 'auto' could be used
Con: brittle: if the type of the container changes from
     vector<T> to vector<XYZ>, we have to change the for-loop

Rule 2:
- for(std::vector<T>::iterator ...
+ for(auto& ...
- for(std::vector<T>::const_iterator ...
+ for(auto const& ...

Pro: simpler
Pro: maximal use of auto
Pro: if container changes from vector<T> to map<ABC,XYZ>,
     the auto references are guaranteed to remain correct:
     the ranged for-loop is self-maintaining
Con: 'auto const&' might not be optimized ideally when the
     underlying type is a builtin type

Having summarized it that way, I now think I might prefer Rule 2.

> GC> I count:
> GC>  - 3  auto const&
> GC>  - 2  auto&
> GC>  - 7  auto [neither const nor &]
> GC> What was your rule for choosing plain 'auto'? I'm guessing you used
> GC> it in cases where 'auto const&' would have been just as good, but
> GC> where you knew that using a copy would involve no more overhead than
> GC> using a const reference?
> 
>  Exactly, I used plain "auto" for "int" or "double" as it seemed weird to
> work with them by (const) reference. For the changes of the PR 52 I chose a
> slightly different tack and used "auto&" or "auto const&" in general and
> the primitive type (and not "auto") for this case.

I tend to think your "different tack" was an improvement. IOW:

Rule 3 (not recommended):
- for(std::vector<T>::const_iterator ...
+ for(auto ...         <-- if T is a builtin type

That rule says that if we know at time of migration that T is a builtin
type, then write 'for(auto x: ...' with a non-reference auto, which we
know is okay because of our knowledge of T...and then erase that knowledge
of T from the for-loop. If, later, we substitute a heavier type for T,
then we're unlikely to notice that the for-loop now makes unnecessary
copies, because a ranged for-loop with auto seems to say "the compiler
guarantees I'm always perfect and never need maintenance".

>  Reviewing the changes now I see that I wasn't 100% consistent,
> unfortunately, e.g. I used "auto" instead of "wxWindowID" (which is just an
> int) in one place and also instead of "unsigned char" in another one and
> instead of "wxWindow*" a few more times. I probably should fix this...

With an optimal rule applied uniformly, we'll never have to reconsider
anything. Otherwise, we'll forever be second-guessing ourselves and
making little tweaks, which never will converge to 100% consistency.

> GC> Let me ask that another way: if I were hypothetically to propose
> GC> changing all those naked auto's to 'auto const&', what objection
> GC> would you have, if any?
> 
>  No real objections, but it looks like a premature pessimization to me.
> It's quite possible that modern compilers will manage to optimize it and
> produce the same code even so (I could test this...), but just not using
> "const int&" when a plain "int" would do seems more natural to me.

Let me accept your offer to test this. If you find that 'gcc -O2'
(and any other good compilers you might choose to consider) optimize
that well enough, then that removes the only drawback to Rule 2,
making it the best choice AFAICS.

[Prev in Thread]

Current Thread

[Next in Thread]

[lmi] Request for review, Greg Chicares, 2017/01/11
- Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/11
  - Re: [lmi] Request for review, Greg Chicares, 2017/01/11
    - Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/11
    - Re: [lmi] Request for review, Greg Chicares <=
    - Re: [lmi] Request for review, Greg Chicares, 2017/01/12
    - Re: [lmi] Request for review, Greg Chicares, 2017/01/12
    - Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/12
    - Re: [lmi] Request for review, Greg Chicares, 2017/01/12
    - Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/12
    - Re: [lmi] Request for review, Greg Chicares, 2017/01/12
    - Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/13
- [lmi] Request for review, Greg Chicares, 2017/01/15
  - Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/15
    - Re: [lmi] Request for review, Greg Chicares, 2017/01/16

Prev by Date: Re: [lmi] Request for review
Next by Date: Re: [lmi] [PATCH] C++ m11n: range-based for loops
Previous by thread: Re: [lmi] Request for review
Next by thread: Re: [lmi] Request for review
Index(es):
- Date
- Thread