grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] Make grub_strtoul() "end" pointer have the right const param


From: Nicholas Vinson
Subject: Re: [PATCH] Make grub_strtoul() "end" pointer have the right const params.
Date: Tue, 18 Feb 2020 22:39:07 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

On 2/18/20 19:32, Peter Jones wrote:
On Tue, Feb 04, 2020 at 08:04:30PM -0500, Nicholas Vinson wrote:
On 2/4/20 16:02, Peter Jones wrote:
grub_strtoul() and grub_strtoull() don't make the /pointer/ to "end" be
const like normal implementations do, and as a result, at many places in

grub_strtoul() and grub_strtoull() appear to be patterned after the C
standard functions strtoul() and strtoull().

According to C18, the signatures for those functions are:

     unsigned long int strtoul(const char * restrict nptr,
                               char ** restrict endptr, int base);

     unsigned long long int strtoull(const char * restrict nptr,
                                     char ** restrict endptr, int base);

It's true that the standard does not mandate any const usage on the
endptr parameter, and it does have restrict.  I'm happy to add restrict,
though I'm going to ignore it for the rest of this email, because the
standard literally says it changes no semantics.  Having "const" on both the
object pointed to by **endptr and on the outer pointer is still
compatible with the standard, though - nothing that's allowed by the
standard isn't allowed with those qualifiers.  Semantically, the
behavior is described in a way that is compatible with making those
const as well: the value written to *endptr must point inside the object

I'm afraid that's not true. That interpretation ignores the fact that the caller is permitted to use both nptr (provided the first argument is not const char *) and endptr to modify the underlying object. For example, the standard allows:

    char *num = "12345_";
    char *end;
    unsigned long long l;

    l = strtoull(num, &end, 10); // returns 12,345. end points to '_'

    **endptr = '6';

    l = strtoull(num, &end, 10); // returns 123,456. end points to '\0'

With the proposed modifications all such use cases would require explicit casts to suppress the warning gcc would generate.

pointed to by nptr, and neither function is allowed to modify that
object. >
Adding the const requirement does give us benefits though:
- It means when someone is modifying grub_strtoul() or grub_strtoull(),
   as inevitably happens every once in a while, making an error which
   would violate those requirements will cause an error building that
   change, rather than making it in to the tree.  That is, "const char **"
   protects us from making an error that writes modifies the object nptr
   points at, and "char ** const" protects us from accidentally changing
   the local pointer instead of the caller's pointer

'const char **' also requires the caller to cast away the const or use strange pointer arithmetic to modify the underlying object.

'char ** const' is probably OK semantically.

- Likewise, if someone isn't thinking through what they're doing and
   takes the const out in order to "fix" some perceived problem, that
   should alert whomever is reviewing the code.  In both of these

agreed.

   regards, having the const qualifiers helps us trust that our
   implementation is correct.
- It lets us pass in pointers that we *want* to be const from the
   caller's point of view.  It means that if a function calling strtoul()
   has a "const char *" argument, it can safely pass that as nptr as well
   as passing its address as endptr; without making the object const (i.e

If you add the 'restrict' keyword, then endptr cannot point to any part of the object nptr points to. The standard C functions have never allowed this usage which is why 'restrict' was added to the declarations in C99.

It's syntactically valid to do this with the GRUB variants (just like it is with C89), but it's not logically valid.

   at least "const char ** endptr"), that'll get you a warning about
   discarding the type qualifier.  That applies equally to cases that
   want to wrap these functions with another function that has similar
   semantics; having const qualifiers allows that function to declare
   those things const or not.  Not having the qualifiers here means the
   calling function also can't, which is where we wind up having a lot of
   the ugly code.

The wrapping functions can have all const parameters and still use the strto* functions without casting. Example below:

    long foo(const char *nptr, const char ** const endptr) {
        char *tmp
        long l;

        l = strtoul(nptr, &tmp, 10);
        *endptr = tmp;
        return l;
    }

Are there functions in GRUB that attempt to do this?

- The same applies to cases where the object is (or should be) genuinely
   immutable - for instance, data that's in .rodata, in data provided
   by an option ROM, or firmware configuration table.  It is better to be
   able to keep the type qualifier throughout the whole stack.
- It allows the optimizer to make assumptions about how we're using it
   which may allow it to generate more efficient output without having to
   analyze the caller.

Therefore, I recommend the GRUB maintainers keep the existing signatures for
grub_strtoul() and grub_strtoull() as they more closely follow the
signatures of the standard functions.

Obviously, I don't agree.

I also skimmed through the rest of the changes and it seems to me most of
the misuse of these calls could be solved simply by doing some variation of:

     const char *ptr = ...;
     char *end;

     strtoul(ptr, &end, 10);
     ptr = end;

While this is valid in many of the existing cases, it's not as good in
several ways.  In addition to the last three points I made above, this
also creates an unnecessary alias into the character object, which the
compiler then has to track.  In this case, since it is both a qualified
type of the same type of object and a character object, it doesn't
violate the aliasing rules ("-fstrict-aliasing -Wstrict-aliasing" isn't
allowed to complain), but it can cause the compiler to both take longer
and generate more, slower code than if that were:

   const char *ptr = ...;
   unsigned long val;

   val = strtoul(ptr, &ptr, 10);


As I stated above this is not logically valid.

That's true for both the caller and in the implementation of strtoul()
itself - though I don't think with current GCC it *will* cause the
strtoul() output to be much different, because it's pretty simple, but
for a more complex function like sscanf() calling it, the chances are a
bit rougher.  Hypothetically, it can also reduce the chances that it
winds up getting inlined if we were to use -flto.

It's also not logically valid to use sscanf() that way either. The only standard C function I can think of where it is logically valid to have source and destination buffers overlap is memmove().


which is what I would personally recommend instead.  If the compiler
issues an error or warning with the above, then that's a bug the
compiler maintainers should fix.

That's true, but it puts limits on the caller's usage and the compiler's
output >



reply via email to

[Prev in Thread] Current Thread [Next in Thread]