
Re: [lmi] static constexpr


From: Vadim Zeitlin
Subject: Re: [lmi] static constexpr
Date: Thu, 17 Jun 2021 13:24:51 +0200

On Wed, 16 Jun 2021 22:20:45 +0000 Greg Chicares <gchicares@sbcglobal.net> 
wrote:

GC> On 6/16/21 4:42 PM, Vadim Zeitlin wrote:
GC> > On Wed, 16 Jun 2021 14:49:24 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:
GC> [...]
GC> > rint() is not constexpr in C++20 (this might change in
GC> > C++23, but doesn't help right now) and I don't know how to replace it with
GC> > something that could be evaluated at compile-time.
GC> 
GC> Yup. I think we should just wait for it to become constexpr in a
GC> future version. In 'round_to.hpp', IIRC, we once had a simulated
GC> rint(), but I would be disinclined to disinter it even if it could
GC> be made constexpr (it probably couldn't, anyway).
GC> 
GC> IIRC, in class currency, I used std::rint() only because it achieves
GC> better speed than its close relatives, by not setting and unsetting
GC> the rounding-direction bits. It seems weird to constexpr-ify a
GC> function whose behavior depends on run-time state.

 Yes, this is one of the main reasons this wasn't done in C++20, AFAIK.

GC> However, when we use rint(), we generally don't intend to demand
GC> rounding according to the current state

 In fact, we don't need rint() or rounding here at all. We could just as
well use modf() or remainder(), which are not affected by the rounding
mode. But, surprisingly (at least to me), these functions are indeed much
slower than rint(); a simple example using Nanobench[*] outputs

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                4.01 |      249,306,106.26 |    0.0% |      0.00 | `rint`
|                4.01 |      249,260,493.50 |    0.0% |      0.00 | `nearbyint`
|               29.69 |       33,679,597.47 |    0.0% |      0.00 | `fmod`
|               10.43 |       95,885,282.11 |    0.0% |      0.00 | `modf`
|               38.99 |       25,646,794.15 |    0.0% |      0.00 | `remainder`
|                2.41 |      415,368,639.67 |    0.0% |      0.00 | `static_cast`

(using "static_cast<int>(d) == d" is still the fastest way to perform this
check, but of course it doesn't work for doubles outside of int's range).
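For concreteness, here is a sketch of the whole-number checks being compared; the helper names are invented for illustration and this is not lmi code:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// rint(d) == d: rint() rounds using the current rounding mode, but any
// whole number rounds to itself, so the comparison is correct
// regardless of that mode.
bool is_whole_rint(double d)
{
    return std::rint(d) == d;
}

// modf() splits d into integral and fractional parts; d is whole iff
// the fractional part is zero.
bool is_whole_modf(double d)
{
    double int_part;
    return std::modf(d, &int_part) == 0.0;
}

// The fastest check, but the double-to-int conversion is undefined
// behavior outside int's range, so it must be guarded.
bool is_whole_cast(double d)
{
    if(d < static_cast<double>(std::numeric_limits<int>::min())
       || d > static_cast<double>(std::numeric_limits<int>::max()))
    {
        return false; // out of range: defer to one of the checks above
    }
    return static_cast<int>(d) == d;
}
```

Note that the guarded cast gives up on large magnitudes: 1e300 is a whole number, but only the rint()/modf() variants report it as such.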

 But, interestingly enough, using constexpr arguments doesn't change
anything for fmod(), but does make evaluating rint() and nearbyint() even
faster and, most weirdly, somehow makes modf() and remainder() evaluate
completely at compile-time (and static_cast too, but that is no surprise),
even though these functions are not constexpr either:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                1.20 |      830,939,149.28 |    0.0% |      0.00 | `rint`
|                1.20 |      830,921,926.33 |    0.0% |      0.00 | `nearbyint`
|               29.69 |       33,680,672.27 |    0.0% |      0.00 | `fmod`
|                   - |                   - |       - |         - | :boom: `modf` (iterations overflow. Maybe your code got optimized away?)
|                   - |                   - |       - |         - | :boom: `remainder` (iterations overflow. Maybe your code got optimized away?)
|                   - |                   - |       - |         - | :boom: `static_cast` (iterations overflow. Maybe your code got optimized away?)

I don't understand why this happens, but I don't think it really helps
us anyhow, as we do need to perform this check at run-time (too), and
rint() and nearbyint() are still the clear winners there, so I didn't
spend time investigating it.

[*]: https://github.com/martinus/nanobench


GC> > GC>   view_ex.cpp://  static constexpr std::string unnameable{"Hastur"};
GC> [...]
GC> > GC> Maybe it'll work with a future gcc version.
GC> > 
GC> >  There are indeed plans to do it, AFAIR, but right now you have to write
GC> > your own constexpr string if you need it (as CTRE does).
GC> > 
GC> > GC> Until then, today's code:
GC> > GC>     static std::string const unnameable{"Hastur"};
GC> > GC> is plenty good enough.
GC> > 
GC> >  Yes, but it could still be replaced with constexpr string_view
GC> 
GC> I'd be disinclined to write the most optimal C++20 equivalent
GC> today, only to replace it with constexpr later.

 I think replacing "static string" with "constexpr string_view" here is
quite a reasonable change. It's really the same as using "Hastur" directly
in the statement below, except that it preserves the symbolic name.
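A minimal sketch of that replacement (the name is the one from the quoted view_ex.cpp snippet):

```cpp
#include <string_view>

// Today's code constructs a std::string at run-time:
//   static std::string const unnameable {"Hastur"};
// The proposed replacement keeps the symbolic name but needs no
// run-time construction at all:
constexpr std::string_view unnameable {"Hastur"};

// It compares like the literal itself would, at compile-time even:
static_assert(unnameable == "Hastur");
```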


GC> > GC>   /// as "T{}" rather than "T(0)" because the latter uses an explicit
GC> > GC>   /// integer argument, which may require a converting constructor
GC> > GC>   /// (for example, with class currency).
GC> > GC> 
GC> > GC> so 'static constexpr' seems appropriate: the constant may be
GC> > GC> expensive to construct, so we want to construct it OAOO and
GC> > GC> store its value.
GC> > 
GC> >  Sorry, this is the (only) one change that I disagree with. Making it
GC> > static doesn't help at all with constructing it once compared to
GC> > making it just constexpr: in the latter case it's constructed once at
GC> > compile-time; in the former case it's constructed at compile-time but also
GC> > appears in the executable during run-time, which is needless. I.e.
GC> > "constexpr" alone is sufficient to construct it OAOO at compile-time, and
GC> > while "static" would indeed be needed to store its value, we just don't
GC> > need to do this at all.
GC> 
GC> That's my question: is it truly constructed OAOO? [By the end of
GC> this post, I think I've constructed a valid argument that the
GC> initializing value is guaranteed to be computed at compile time,
GC> which isn't precisely the same thing, but is good enough for me.]

 I don't know how many times it's done (hopefully just once per type, but I
don't know how to check that), but it's done only at compile-time, not
run-time, in this case, and without "static" the variable just doesn't
exist at all during run-time, so its construction can't affect the
execution time.

GC> Consider the sample program in this post:
GC>   https://stackoverflow.com/a/58328980

 Note that taking the address of the variable changes things dramatically.
I don't think anything here applies to simple scalar constexpr variables
whose address is not taken.
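A small sketch of why the address matters (the helper names are invented): a static constexpr local has one fixed storage location, while a plain constexpr local need not exist at run-time at all unless its address is taken.

```cpp
#include <cassert>

// A static constexpr local has static storage duration: one location,
// hence the same address on every call.
int const* static_addr()
{
    static constexpr int value {42};
    return &value;
}

// A plain constexpr local has automatic storage duration; its address
// cannot be returned (it would dangle). When no address is taken, the
// compiler is free to fold the value in and never materialize the
// variable at all.
int constexpr_value()
{
    constexpr int value {42};
    return value;
}
```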

GC> which I've compiled with pc-linux-gnu gcc-10.2, obtaining the
GC> following output...
GC> 
GC> value \ address of const is               1 78e
GC> value \ address of const is               2 72e
GC> value \ address of const is               3 6ce
GC> 
GC> value \ address of static is              1 320
GC> value \ address of static is              1 320
GC> value \ address of static is              1 320
GC> 
GC> value \ address of static const is        1 310
GC> value \ address of static const is        1 310
GC> value \ address of static const is        1 310
GC> 
GC> value \ address of constexpr is           0 78e
GC> value \ address of constexpr is           0 72e
GC> value \ address of constexpr is           0 6ce
GC> 
GC> value \ address of static constexpr is    0 4a0
GC> value \ address of static constexpr is    0 4a0
GC> value \ address of static constexpr is    0 4a0
GC> 
GC> ...which seems to suggest that a function-local 'constexpr' variable
GC> is initialized every time the function is called, but if we make it
GC> 'static constexpr', then it's initialized OAOO.

 Yes, this is the case in this example because taking the address of the
variable forces the compiler to actually create it in the executable.

GC> With 'static constexpr', there's no "maybe": OAOO is guaranteed.

 Yes, but it also guarantees the existence of the variable in the
executable, which we don't really want.

GC> I think you're saying that 'constexpr' alone guarantees that the
GC> initialization happens OAOO (unless we force that not to happen by
GC> perverse use of operator&). But can that be proven?

 Sorry, not sure. I definitely believe it should be the case and all my
tests confirm it, but this doesn't prove anything, of course.

GC> Here's why I feel somewhat queasy:
GC> 
GC>   https://isocpp.org/blog/2013/12/constexpr
GC> | constexpr guarantees compile-time evaluation *is possible if*
GC> | operating on a compile-time value, and that compile-time
GC> | evaluation *will happen if* a compile-time result is needed.
GC> 
GC> so in the lmi code we're discussing:
GC>   constexpr T zero {};
GC> we're guaranteed that compile-time evaluation is possible
GC> (because it doesn't depend on any non-compile-time value),
GC> but are we guaranteed that compile-time evaluation will
GC> occur?
GC> 
GC> Is the answer that this:
GC>   constexpr T zero {};
GC> is a "constexpr context"? so that the "*will happen if*" clause
GC> above necessarily comes into effect? and so that, no matter what
GC> initializer follows
GC>   constexpr T zero ...
GC> , if it compiles, then the initialization is guaranteed to
GC> occur at compile time? Or, more properly speaking, that the
GC> *initializing value* must be computed at compile time, which
GC> would be enough to satisfy me?

 Again, I think so, but I can't prove it right now. For functions, C++20
consteval made things much clearer, as it forces their evaluation at
compile-time only (unlike constexpr, which may evaluate them at
compile-time if possible, but doesn't have to), but consteval can't be
applied to variables. I guess we could replace the "zero" local variable
with a zero<T> consteval function, but I'm not sure whether it would
really be a gain from a visibility point of view (OTOH it would allow
defining this function only once instead of doing it in each function).
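A sketch of that zero<T> idea, written with constexpr so it also compiles before C++20 (in C++20, consteval would additionally *force* compile-time evaluation); only the name "zero" comes from the discussion, the rest is an assumption:

```cpp
// "zero" as a function template instead of a per-function local
// variable. With constexpr, initializing a constexpr variable from the
// call still guarantees compile-time evaluation, because a constexpr
// variable's initializer must be a constant expression.
template<typename T>
constexpr T zero()
{
    return T{};  // "T{}" rather than "T(0)", as in the quoted comment,
                 // to avoid requiring a converting constructor
}

// Compile-time evaluation of the initializer is guaranteed here:
constexpr double zero_d {zero<double>()};
static_assert(zero<int>() == 0);
```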

GC> In that case, then, specifying 'static' with 'constexpr' has no
GC> benefit (the initializing value is computed OAOO without 'static'),
GC> and adding 'static' has only the harmful effect of inducing a
GC> reasonable compiler to allocate an address for the variable,
GC> which impedes optimization?

 This is what I think, yes. And it's relatively easy to test that this is
what happens in each concrete example.

 Regards,
VZ


