lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Request for review


From: Greg Chicares
Subject: Re: [lmi] Request for review
Date: Thu, 12 Jan 2017 10:14:11 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.5.1

On 2017-01-11 23:32, Vadim Zeitlin wrote:
> On Wed, 11 Jan 2017 23:23:56 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> I'm trying to figure out a solid universal rule for this modernization.
> GC> What do you think about transforming:
> GC> 
> GC>   for(const_iterator i...)   -->   for(auto const& k: ...)
> GC>   for(      iterator i...)   -->   for(auto      & k: ...)
> GC> 
> GC> That seems simple and logical, to me at least.

Notably, both 'auto' variables are references; there's no plain 'auto' above.

>  Yes, it does to me too, and this is what I basically used for the changes of
> https://github.com/vadz/lmi/pull/52
> 
> GC> But it doesn't quite fit our practice so far: we have ranged for-loops
> GC> only in 'rate_table*', and...
> GC> 
> GC> $grep --no-filename 'for(.*: ' *.?pp |sed -e's/^ *//' |sort |less -S
> GC> for(auto const& i: index_)
> GC> for(auto const& j: table_names)
> GC> for(auto const& not_field: known_not_fields)
> GC> for(auto num: numbers)
> GC> for(auto num: numbers)
> GC> for(auto num: numbers)
> GC> for(auto v: values)
> GC> for(auto v: values)
> GC> for(auto v: values)
> GC> for(auto v: values_)
> GC> for(auto& e: index_by_number_)
> GC> for(auto& v: values_)
> GC> for(soa_field const& f: soa_fields)
> GC> 
> GC> Skipping the last line because it lacks 'auto'
> 
>  No idea why, to be honest. I think it should use "auto" too...

But consider the
  https://github.com/vadz/lmi/pull/52
changes to insert_spaces_between_words() in 'census_view.cpp':

[clang-tidy changes]
https://github.com/vadz/lmi/commit/d1fd3ae2e222c6d7a9edd1c9124ef2935e2918a5.patch
-    std::string::const_iterator i;
-    for(i = s.begin(); i != s.end(); ++i)
+    for(auto c: s)

[your refinements]
https://github.com/vadz/lmi/commit/6afb131f2d5d16aadb3b00dfc852d0cfba6949b6.patch
-    for(auto c: s)
+    for(char c: s)

There, clang-tidy used 'auto', as it does almost all the time. I've saved those
two patches as '52a.patch' and '52b.patch' respectively; analyzing the clang 
one:

/srv/chroot/cross-lmi/opt/lmi/src/lmi[0]$grep '[+] *for(' a00/patches/52a.patch 
|wc -l
118
/srv/chroot/cross-lmi/opt/lmi/src/lmi[0]$grep '[+] *for(' a00/patches/52a.patch 
|sed -e'/for(auto/d' |wc -l
6

we see that clang-tidy uses 'auto' in 112 of 118 cases, and builtin types
{char, int, double} in the other 6. For '52b.patch' [your refinements],
'auto' is used in 24/32 cases; in the other 8, you use {fs::path const&,
char&, char, int, double}. And 8/32 is much greater than 6/112.

Focusing on the insert_spaces_between_words() change:

     std::string::const_iterator i;         <-- original
     for(i = s.begin(); i != s.end(); ++i)  <-- original
-    for(auto c: s)  <-- clang-tidy
+    for(char c: s)  <-- refinement

I think your refinement may be preferable at least in the sense that
I don't have to wonder whether it's okay for it not to be a reference:

-    for(auto c: s)  <-- why not 'auto&' or 'auto const&'?
+    for(char c: s)  <-- obviously shouldn't be 'char&' or 'char const&'

Alternatively, I wouldn't have to wonder about that if it were written:

+    for(auto const& c: s)

IOW, perhaps that the variable should have a specific non-reference
builtin type when the correct type is obvious, and otherwise a reference
type 'auto' (const or non-const as the case may be).

Maybe that's an ill-chosen example because it's not directly obvious
that std::string::const_iterator is char, so consider this clang-tidy
change instead:

-    for
-        (std::vector<double>::const_iterator vi = v.begin()
-        ;vi != v.end()
-        ;++vi
-        )
+    for(double vi : v)

I support your whitespace change to it:

-    for(double vi : v)
+    for(double vi: v)

You didn't change double to auto, and maybe I wouldn't either.
OTOH, this conflicts with the general advice to write 'auto' wherever
possible, so let me propose a choice of two rules and ask which you
prefer:

Rule 1:
- for(std::vector<T>::iterator ...
+ for(T& ...
- for(std::vector<T>::const_iterator ...
+ for(T ...         <-- if T is a builtin type
+ for(T const& ...  <-- otherwise

Pro: at time of modernization, we often know 'T'
Pro: plain T is obviously optimal for builtin types
Con: not modern because 'auto' could be used
Con: brittle: if the type of the container changes from
     vector<T> to vector<XYZ>, we have to change the for-loop

Rule 2:
- for(std::vector<T>::iterator ...
+ for(auto& ...
- for(std::vector<T>::const_iterator ...
+ for(auto const& ...

Pro: simpler
Pro: maximal use of auto
Pro: if container changes from vector<T> to map<ABC,XYZ>,
     the auto references are guaranteed to remain correct:
     the ranged for-loop is self-maintaining
Con: 'auto const&' might not be optimized ideally when the
     underlying type is a builtin type

Having summarized it that way, I now think I might prefer Rule 2.

> GC> I count:
> GC>  - 3  auto const&
> GC>  - 2  auto&
> GC>  - 7  auto [neither const nor &]
> GC> What was your rule for choosing plain 'auto'? I'm guessing you used
> GC> it in cases where 'auto const&' would have been just as good, but
> GC> where you knew that using a copy would involve no more overhead than
> GC> using a const reference?
> 
>  Exactly, I used plain "auto" for "int" or "double" as it seemed weird to
> work with them by (const) reference. For the changes of the PR 52 I chose a
> slightly different tack and used "auto&" or "auto const&" in general and
> the primitive type (and not "auto") for this case.

I tend to think your "different tack" was an improvement. IOW:

Rule 3 (not recommended):
- for(std::vector<T>::const_iterator ...
+ for(auto ...         <-- if T is a builtin type

That rule says that if we know at time of migration that T is a builtin
type, then write 'for(auto x: ...' with a non-reference auto, which we
know is okay because of our knowledge of T...and then erase that knowledge
of T from the for-loop. If, later, we substitute a heavier type for T,
then we're unlikely to notice that the for-loop now makes unnecessary
copies, because a ranged for-loop with auto seems to say "the compiler
guarantees I'm always perfect and never need maintenance".

>  Reviewing the changes now I see that I wasn't 100% consistent,
> unfortunately, e.g. I used "auto" instead of "wxWindowID" (which is just an
> int) in one place and also instead of "unsigned char" in another one and
> instead of "wxWindow*" a few more times. I probably should fix this...

With an optimal rule applied uniformly, we'll never have to reconsider
anything. Otherwise, we'll forever be second-guessing ourselves and
making little tweaks, which never will converge to 100% consistency.

> GC> Let me ask that another way: if I were hypothetically to propose
> GC> changing all those naked auto's to 'auto const&', what objection
> GC> would you have, if any?
> 
>  No real objections, but it looks like a premature pessimization to me.
> It's quite possible that modern compilers will manage to optimize it and
> produce the same code even so (I could test this...), but just not using
> "const int&" when a plain "int" would do seems more natural to me.

Let me accept your offer to test this. If you find that 'gcc -O2'
(and any other good compilers you might choose to consider) optimize
that well enough, then that removes the only drawback to Rule 2,
making it the best choice AFAICS.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]