Re: [PATCH v2] http: parse HTTP headers case-insensitive


From: Javier Moragon
Subject: Re: [PATCH v2] http: parse HTTP headers case-insensitive
Date: Mon, 17 Jan 2022 20:54:30 +0100

>> +is_header (char *ptr, const char* name)
> You might want to return a pointer to the first character in the header
> value instead. That way, there would be no need to look for the ':'
> character in is_header_value().

IMO that would feel strange given what the functions currently mean.
Maybe we could instead have a helper function that parses a header and
returns pointers to the name and the value, together with their lengths
(so that LWS can be ignored). It could be something like:

typedef struct http_header
{
  char *name;
  char *value;
} http_header_t;

static http_header_t *
parse_header (char *ptr)
{
  /* Put a \0 on the ':' to terminate the name and on the first LWS after
     the value to terminate the value; we already edit this memory
     earlier to remove the CRLF.
     Alternatively, store pointer + length for both strings in the
     http_header struct.
     If parsing the header fails, return a NULL pointer.  */
}

header = parse_header (ptr);
if (!header)
  return ...;

if (grub_strcasecmp (header->name, "Whatever") == 0)
{
    file->size = grub_strtoull (header->value, 0, 10);
    /* strtoull skips LWS until it reaches the digits and stops at the
       first non-digit, so trailing LWS is ignored too.  */
}
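
To make the "pointer + length" variant concrete, here is a minimal sketch
in standard C rather than GRUB code. The struct layout, the out-parameter
style and the helper header_name_is() are assumptions made for
illustration; inside GRUB the grub_* equivalents would be used instead of
the libc calls.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one header line; not an existing GRUB type. */
struct http_header
{
  const char *name;   /* start of the field name */
  size_t name_len;    /* length of the name, ':' excluded */
  const char *value;  /* first non-LWS character after ':' */
  size_t value_len;   /* length of the value, trailing LWS excluded */
};

/* Split "Name: value" without modifying the buffer.
   Returns 1 on success, 0 if there is no ':' or the name is empty. */
static int
parse_header (const char *line, struct http_header *out)
{
  const char *colon, *end;

  assert (line != NULL && out != NULL);

  colon = strchr (line, ':');
  if (colon == NULL || colon == line)
    return 0;

  out->name = line;
  out->name_len = (size_t) (colon - line);

  /* Skip leading LWS (space or tab) in front of the value. */
  out->value = colon + 1;
  while (*out->value == ' ' || *out->value == '\t')
    out->value++;

  /* Drop trailing LWS, including a CRLF that was not stripped yet. */
  end = out->value + strlen (out->value);
  while (end > out->value && isspace ((unsigned char) end[-1]))
    end--;
  out->value_len = (size_t) (end - out->value);
  return 1;
}

/* Case-insensitive comparison of the parsed name with a C string. */
static int
header_name_is (const struct http_header *h, const char *name)
{
  size_t i, n = strlen (name);

  if (n != h->name_len)
    return 0;
  for (i = 0; i < n; i++)
    if (tolower ((unsigned char) h->name[i])
        != tolower ((unsigned char) name[i]))
      return 0;
  return 1;
}
```

With this shape the caller never needs to search for the ':' again, and
nothing is allocated because the struct only points into the line buffer.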

> Watch out here. This won't skip the white space between the header name
> and value. All of these are valid cases:
>
> Content-length:123
> Content-Length: 234
> Content-Length:     678

According to the grub_strtoull implementation and my own OS's strtoull
implementation, against which I ran some unit tests, it skips leading
whitespace, then parses the digit characters and stops when it finds a
non-digit character or \0.
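
The three quoted cases can be checked with a small sketch. It uses the
standard C strtoull() as a stand-in, on the assumption that grub_strtoull
treats leading whitespace the same way; parse_length_value() is a
hypothetical helper name, not something in http.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse the number that follows "Content-Length:".  `after_colon`
   points just past the ':'.  If `rest` is not NULL it receives the
   position where parsing stopped. */
static unsigned long long
parse_length_value (const char *after_colon, const char **rest)
{
  char *end;
  unsigned long long n;

  assert (after_colon != NULL);

  /* strtoull skips leading whitespace, reads the digits and stops at
     the first character that is not part of the number. */
  n = strtoull (after_colon, &end, 10);
  if (rest != NULL)
    *rest = end;
  return n;
}
```

Called with "123", " 234" and "     678" it yields 123, 234 and 678, so
advancing ptr by sizeof ("Content-Length:") - 1 would be enough under
that assumption.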

> Also, what is the behavior if an invalid length like "Content-Length:
> WHATEVER" is present? Should the header be ignored, or should the HTTP
> request fail?
I'm not sure it is worth protecting against this case.
IMO, unless the server is malicious, the only variation we are likely to
find is header names in lower case or capitalized.
In the worst case, grub_strtoull will return 0ULL if it doesn't find any digits.
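
If we did want to tell "Content-Length: WHATEVER" apart from a real
length of 0, a stricter check could look like the sketch below. It is
written against the standard C strtoull(); parse_content_length() and
its return convention are assumptions for illustration, and whether the
header should then be ignored or the request should fail is still an
open question.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a Content-Length value.  `s` points just past the ':'.
   Returns 1 and stores the number in *out on success; returns 0 for
   values such as "WHATEVER", "12abc", "-5" or an out-of-range number,
   leaving *out untouched. */
static int
parse_content_length (const char *s, unsigned long long *out)
{
  char *end;
  unsigned long long n;

  assert (s != NULL && out != NULL);

  /* Skip leading LWS ourselves so we can insist on a digit next;
     strtoull alone would also accept a sign. */
  while (*s == ' ' || *s == '\t')
    s++;
  if (!isdigit ((unsigned char) *s))
    return 0;

  errno = 0;
  n = strtoull (s, &end, 10);
  if (errno == ERANGE)
    return 0;

  /* Only trailing whitespace may follow the digits. */
  while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
    end++;
  if (*end != '\0')
    return 0;

  *out = n;
  return 1;
}
```

The extra cost is a handful of comparisons per header line, so the
decision is more about policy than performance.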

> You could use a MACRO() here to pass the string constant and its length
> to avoid the runtime length calculation in the is_header() function. I
> am not familiar with GRUB guidelines for using the C pre-compiler, but
> something like this might work:
>
> #define STRING_AND_LENGTH(str) (str),(sizeof(str)-1)
I'm new to the GRUB code. What would be the cost of calculating the
string length at runtime, if doing so keeps the code more readable and
we only parse a few headers?
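
For comparison, both variants can sit side by side, as in this sketch.
It uses libc's strlen() and the POSIX strncasecmp() in place of the
grub_* functions, and is_header_n() is a hypothetical name for the
variant that takes an explicit length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Expands to two arguments: the literal and its length without the
   terminating NUL.  Only valid for string literals and arrays, not
   for pointers. */
#define STRING_AND_LENGTH(str) (str), (sizeof (str) - 1)

/* Variant with the length supplied by the caller at compile time. */
static int
is_header_n (const char *ptr, const char *name, size_t length)
{
  assert (ptr != NULL && name != NULL);
  return strncasecmp (name, ptr, length) == 0 && ptr[length] == ':';
}

/* Variant that computes the length at run time, as in the patch. */
static int
is_header (const char *ptr, const char *name)
{
  return is_header_n (ptr, name, strlen (name));
}
```

The two calls

    is_header_n (ptr, STRING_AND_LENGTH ("Content-Length"))
    is_header (ptr, "Content-Length")

give the same result; the difference is one strlen() over a short
literal per header line, which compilers can often fold away when the
call is inlined, though that is not guaranteed.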

I'm sorry for having so many questions and for proposing things that
may not fit; I want to understand the code well so that I can make
better contributions :).

Thank you!

On Mon, 17 Jan 2022 at 11:54, Stephen Balousek
(<stephen@balousek.net>) wrote:
>
> Hi Jamo,
>
> I like seeing these improvements to HTTP handling. I made a bunch of
> comments below. Hopefully one or two of them are helpful.
>
> - Steve
>
> On 1/16/2022 5:54 PM, Jamo wrote:
> > According to https://www.ietf.org/rfc/rfc2616.txt 4.2, header names
> > shall be case insensitive and we are now forced to read headers like
> > "Content-Length" capitalized.
> >
> > The problem with that is that when an HTTP server responds with a
> > "content-length" header in lowercase, GRUB gets stuck because the HTTP
> > module doesn't know the length of the transmission and the call never
> > ends.
> > ---
> > v2:
> >      compare header value ignoring lws
> >      content-length value parsing should start after 'Content-Length:'
> >      extract check header and its value in two functions
> >
> > First of all, thank you for your suggestions and for helping me learn
> > how to contribute by sending patches through mail.
>
> As a newbie to the group myself, I have to soundly second this comment.
> The folks on this mailing list are super!
>
> > I applied your suggestions and extracted that logic into two new
> > static functions in order to increase code readability.
> >
> > I know that sizeof ("inline string") would have better performance
> > if I had done it inline, but inside the extracted function sizeof no
> > longer sees the string literal: applied to the parameter it only yields
> > the size of a pointer. I think that kind of optimization is not worth
> > the loss of code readability here; we are not going to deal with a
> > large number of headers.
> >
> > I am still not very sure about the naming of "is_header" and
> > "is_header_value". Also, "is_header_value" is only valid for a header
> > with a single value. As far as I understand, to support headers with
> > multiple values we would have to accept multi-line values starting with
> > LWS, allow the header name to appear more than once, split the elements
> > on commas...
> >
> > I think if we have to deal with that in the future the code could
> > be refactored instead of doing it now.
> >
> > I have another question: I see that the project has some unit tests,
> > but the http module consists only of static functions. I have been
> > running unit tests outside the project against the two new functions
> > I added, and all the cases I tried pass.
> >
> > Should I adapt the code in order to be testable and include
> > the tests that confirm my patch works?
> >
> > Thank you very much!
> >
> >   grub-core/net/http.c | 41 +++++++++++++++++++++++++++++++++++------
> >   1 file changed, 35 insertions(+), 6 deletions(-)
> >
> > diff --git a/grub-core/net/http.c b/grub-core/net/http.c
> > index b616cf40b..aed40f536 100644
> > --- a/grub-core/net/http.c
> > +++ b/grub-core/net/http.c
> > @@ -62,6 +62,37 @@ have_ahead (struct grub_file *file)
> >     return ret;
> >   }
> >
> > +static int
> > +is_header (char *ptr, const char* name)
> You might want to return a pointer to the first character in the header
> value instead. That way, there would be no need to look for the ':'
> character in is_header_value().
> > +{
> > +  grub_size_t length = grub_strlen (name);
> > +  return grub_strncasecmp (name, ptr, length) == 0 && ptr[length] == ':';
> > +}
> > +
> > +static int
> > +is_header_value (char *ptr, const char* value)
> > +{
> > +  char *ptr_start = ptr;
> > +  char *ptr_end = ptr + strlen (ptr);
> > +  grub_size_t value_length = strlen (value);
> > +
> > +  while(ptr_start && *ptr_start != ':')
> > +    ptr_start++;
> > +
> > +  if (*ptr_start == ':')
> > +    ptr_start++;
> > +
> > +  while (grub_isspace (*ptr_start))
> > +    ptr_start++;
> > +  while (grub_isspace (ptr_end[-1]))
> > +    ptr_end--;
> > +
> > +  if (value_length != (grub_size_t)(ptr_end - ptr_start))
> > +    return 0;
> > +
> > +  return strncasecmp (value, ptr_start, value_length) == 0;
> > +}
> > +
> >   static grub_err_t
> >   parse_line (grub_file_t file, http_data_t data, char *ptr, grub_size_t len)
> >   {
> > @@ -130,18 +161,16 @@ parse_line (grub_file_t file, http_data_t data, char *ptr, grub_size_t len)
> >         data->first_line_recv = 1;
> >         return GRUB_ERR_NONE;
> >       }
> > -  if (grub_memcmp (ptr, "Content-Length: ", sizeof ("Content-Length: ") - 1)
> > -      == 0 && !data->size_recv)
> > +  if (is_header (ptr, "Content-Length") && !data->size_recv)
>
> You could use a MACRO() here to pass the string constant and its length
> to avoid the runtime length calculation in the is_header() function. I
> am not familiar with GRUB guidelines for using the C pre-compiler, but
> something like this might work:
>
> #define STRING_AND_LENGTH(str) (str),(sizeof(str)-1)
>
> and then
>
> if (is_header (ptr,STRING_AND_LENGTH("Content-Length")) && !data->size_recv)
>
> >       {
> > -      ptr += sizeof ("Content-Length: ") - 1;
> > +      ptr += sizeof ("Content-Length:") - 1;
> >         file->size = grub_strtoull (ptr, (const char **)&ptr, 10);
>
> Watch out here. This won't skip the white space between the header name
> and value. All of these are valid cases:
>
> Content-length:123
> Content-Length: 234
> Content-Length:     678
>
> As Daniel was nice enough to point out in my work on parsing integers,
> grub_strtoull() usage can be tricky. Will that function correctly skip
> over white space here?
>
> Also, what is the behavior if an invalid length like "Content-Length:
> WHATEVER" is present? Should the header be ignored, or should the HTTP
> request fail?
>
> >         data->size_recv = 1;
> >         return GRUB_ERR_NONE;
> >       }
> > -  if (grub_memcmp (ptr, "Transfer-Encoding: chunked",
> > -                sizeof ("Transfer-Encoding: chunked") - 1) == 0)
> > +  if (is_header (ptr, "Transfer-Encoding"))
> >       {
> > -      data->chunked = 1;
> > +      data->chunked = is_header_value (ptr, "chunked");
> >         return GRUB_ERR_NONE;
> >       }
> >
>
> Have a great day!
> - Steve
>


