[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [lmi] Request for review
From: |
Greg Chicares |
Subject: |
Re: [lmi] Request for review |
Date: |
Thu, 12 Jan 2017 10:14:11 +0000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.5.1 |
On 2017-01-11 23:32, Vadim Zeitlin wrote:
> On Wed, 11 Jan 2017 23:23:56 +0000 Greg Chicares <address@hidden> wrote:
>
> GC> I'm trying to figure out a solid universal rule for this modernization.
> GC> What do you think about transforming:
> GC>
> GC> for(const_iterator i...) --> for(auto const& k: ...)
> GC> for( iterator i...) --> for(auto & k: ...)
> GC>
> GC> That seems simple and logical, to me at least.
Notably, both 'auto' variables are references; there's no plain 'auto' above.
> Yes, it does to me too, and this is what I basically used for the changes of
> https://github.com/vadz/lmi/pull/52
>
> GC> But it doesn't quite fit our practice so far: we have ranged for-loops
> GC> only in 'rate_table*', and...
> GC>
> GC> $grep --no-filename 'for(.*: ' *.?pp |sed -e's/^ *//' |sort |less -S
> GC> for(auto const& i: index_)
> GC> for(auto const& j: table_names)
> GC> for(auto const& not_field: known_not_fields)
> GC> for(auto num: numbers)
> GC> for(auto num: numbers)
> GC> for(auto num: numbers)
> GC> for(auto v: values)
> GC> for(auto v: values)
> GC> for(auto v: values)
> GC> for(auto v: values_)
> GC> for(auto& e: index_by_number_)
> GC> for(auto& v: values_)
> GC> for(soa_field const& f: soa_fields)
> GC>
> GC> Skipping the last line because it lacks 'auto'
>
> No idea why, to be honest. I think it should use "auto" too...
But consider the
https://github.com/vadz/lmi/pull/52
changes to insert_spaces_between_words() in 'census_view.cpp':
[clang-tidy changes]
https://github.com/vadz/lmi/commit/d1fd3ae2e222c6d7a9edd1c9124ef2935e2918a5.patch
- std::string::const_iterator i;
- for(i = s.begin(); i != s.end(); ++i)
+ for(auto c: s)
[your refinements]
https://github.com/vadz/lmi/commit/6afb131f2d5d16aadb3b00dfc852d0cfba6949b6.patch
- for(auto c: s)
+ for(char c: s)
There, clang-tidy used 'auto', as it does almost all the time. I've saved those
two patches as '52a.patch' and '52b.patch' respectively; analyzing the clang
one:
/srv/chroot/cross-lmi/opt/lmi/src/lmi[0]$grep '[+] *for(' a00/patches/52a.patch
|wc -l
118
/srv/chroot/cross-lmi/opt/lmi/src/lmi[0]$grep '[+] *for(' a00/patches/52a.patch
|sed -e'/for(auto/d' |wc -l
6
we see that clang-tidy uses 'auto' in 112 of 118 cases, and builtin types
{char, int, double} in the other 6. For '52b.patch' [your refinements],
'auto' is used in 24/32 cases; in the other 8, you use {fs::path const&,
char&, char, int, double}. And 8/32 is much greater than 6/112.
Focusing on the insert_spaces_between_words() change:
std::string::const_iterator i; <-- original
for(i = s.begin(); i != s.end(); ++i) <-- original
- for(auto c: s) <-- clang-tidy
+ for(char c: s) <-- refinement
I think your refinement may be preferable at least in the sense that
I don't have to wonder whether it's okay for it not to be a reference:
- for(auto c: s) <-- why not 'auto&' or 'auto const&'?
+ for(char c: s) <-- obviously shouldn't be 'char&' or 'char const&'
Alternatively, I wouldn't have to wonder about that if it were written:
+ for(auto const& c: s)
IOW, perhaps that the variable should have a specific non-reference
builtin type when the correct type is obvious, and otherwise a reference
type 'auto' (const or non-const as the case may be).
Maybe that's an ill-chosen example because it's not directly obvious
that std::string::const_iterator is char, so consider this clang-tidy
change instead:
- for
- (std::vector<double>::const_iterator vi = v.begin()
- ;vi != v.end()
- ;++vi
- )
+ for(double vi : v)
I support your whitespace change to it:
- for(double vi : v)
+ for(double vi: v)
You didn't change double to auto, and maybe I wouldn't either.
OTOH, this conflicts with the general advice to write 'auto' wherever
possible, so let me propose a choice of two rules and ask which you
prefer:
Rule 1:
- for(std::vector<T>::iterator ...
+ for(T& ...
- for(std::vector<T>::const_iterator ...
+ for(T ... <-- if T is a builtin type
+ for(T const& ... <-- otherwise
Pro: at time of modernization, we often know 'T'
Pro: plain T is obviously optimal for builtin types
Con: not modern because 'auto' could be used
Con: brittle: if the type of the container changes from
vector<T> to vector<XYZ>, we have to change the for-loop
Rule 2:
- for(std::vector<T>::iterator ...
+ for(auto& ...
- for(std::vector<T>::const_iterator ...
+ for(auto const& ...
Pro: simpler
Pro: maximal use of auto
Pro: if container changes from vector<T> to map<ABC,XYZ>,
the auto references are guaranteed to remain correct:
the ranged for-loop is self-maintaining
Con: 'auto const&' might not be optimized ideally when the
underlying type is a builtin type
Having summarized it that way, I now think I might prefer Rule 2.
> GC> I count:
> GC> - 3 auto const&
> GC> - 2 auto&
> GC> - 7 auto [neither const nor &]
> GC> What was your rule for choosing plain 'auto'? I'm guessing you used
> GC> it in cases where 'auto const&' would have been just as good, but
> GC> where you knew that using a copy would involve no more overhead than
> GC> using a const reference?
>
> Exactly, I used plain "auto" for "int" or "double" as it seemed weird to
> work with them by (const) reference. For the changes of the PR 52 I chose a
> slightly different tack and used "auto&" or "auto const&" in general and
> the primitive type (and not "auto") for this case.
I tend to think your "different tack" was an improvement. IOW:
Rule 3 (not recommended):
- for(std::vector<T>::const_iterator ...
+ for(auto ... <-- if T is a builtin type
That rule says that if we know at time of migration that T is a builtin
type, then write 'for(auto x: ...' with a non-reference auto, which we
know is okay because of our knowledge of T...and then erase that knowledge
of T from the for-loop. If, later, we substitute a heavier type for T,
then we're unlikely to notice that the for-loop now makes unnecessary
copies, because a ranged for-loop with auto seems to say "the compiler
guarantees I'm always perfect and never need maintenance".
> Reviewing the changes now I see that I wasn't 100% consistent,
> unfortunately, e.g. I used "auto" instead of "wxWindowID" (which is just an
> int) in one place and also instead of "unsigned char" in another one and
> instead of "wxWindow*" a few more times. I probably should fix this...
With an optimal rule applied uniformly, we'll never have to reconsider
anything. Otherwise, we'll forever be second-guessing ourselves and
making little tweaks, which never will converge to 100% consistency.
> GC> Let me ask that another way: if I were hypothetically to propose
> GC> changing all those naked auto's to 'auto const&', what objection
> GC> would you have, if any?
>
> No real objections, but it looks like a premature pessimization to me.
> It's quite possible that modern compilers will manage to optimize it and
> produce the same code even so (I could test this...), but just not using
> "const int&" when a plain "int" would do seems more natural to me.
Let me accept your offer to test this. If you find that 'gcc -O2'
(and any other good compilers you might choose to consider) optimize
that well enough, then that removes the only drawback to Rule 2,
making it the best choice AFAICS.
- [lmi] Request for review, Greg Chicares, 2017/01/11
- Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/11
- Re: [lmi] Request for review, Greg Chicares, 2017/01/11
- Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/11
- Re: [lmi] Request for review,
Greg Chicares <=
- Re: [lmi] Request for review, Greg Chicares, 2017/01/12
- Re: [lmi] Request for review, Greg Chicares, 2017/01/12
- Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/12
- Re: [lmi] Request for review, Greg Chicares, 2017/01/12
- Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/12
- Re: [lmi] Request for review, Greg Chicares, 2017/01/12
- Re: [lmi] Request for review, Vadim Zeitlin, 2017/01/13