grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 1/2] json: Add function to unescape JSON-encoded strings


From: Daniel Kiper
Subject: Re: [PATCH v5 1/2] json: Add function to unescape JSON-encoded strings
Date: Tue, 12 Jul 2022 15:39:13 +0200
User-agent: NeoMutt/20170113 (1.7.2)

On Mon, Jul 11, 2022 at 09:08:09AM -0400, Nicholas Vinson wrote:
> On 7/11/22 06:44, Patrick Steinhardt wrote:
> > JSON strings require certain characters to be encoded, either by using a
> > single reverse solidus character "\" for a set of popular characters, or
> > by using a Unicode representation of "\uXXXXX". The jsmn library doesn't
> > handle unescaping for us, so we must implement this functionality for
> > ourselves.
> >
> > Add a new function `grub_json_unescape ()` that takes a potentially
> > escaped JSON string as input and returns a new unescaped string.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >   grub-core/lib/json/json.c | 118 ++++++++++++++++++++++++++++++++++++++
> >   grub-core/lib/json/json.h |  12 ++++
> >   2 files changed, 130 insertions(+)
> >
> > diff --git a/grub-core/lib/json/json.c b/grub-core/lib/json/json.c
> > index 1c20c75ea..1eadd1ce9 100644
> > --- a/grub-core/lib/json/json.c
> > +++ b/grub-core/lib/json/json.c
> > @@ -262,3 +262,121 @@ grub_json_getint64 (grub_int64_t *out, const 
> > grub_json_t *parent, const char *ke
> >     return GRUB_ERR_NONE;
> >   }
> > +
> > +grub_err_t
> > +grub_json_unescape (char **out, grub_size_t *outlen, const char *in, 
> > grub_size_t inlen)
> > +{
> > +  grub_err_t ret = GRUB_ERR_NONE;
> > +  grub_size_t inpos, resultpos;
> > +  char *result;
> > +
> > +  if (out == NULL || outlen == NULL)
> > +    return grub_error (GRUB_ERR_BAD_ARGUMENT, N_("output parameters are 
> > not set"));
> > +
> > +  result = grub_calloc (1, inlen + 1);
> > +  if (result == NULL)
> > +    return GRUB_ERR_OUT_OF_MEMORY;
> > +
> > +  for (inpos = resultpos = 0; inpos < inlen; inpos++)
> > +    {
> > +      if (in[inpos] == '\\')
> > +   {
> > +     inpos++;
> > +     if (inpos >= inlen)
> > +       {
> > +         ret = grub_error (GRUB_ERR_BAD_ARGUMENT, N_("expected escaped 
> > character"));
> > +         goto err;
> > +       }
> > +
> > +     switch (in[inpos])
> > +       {
> > +         case '"':
> > +           result[resultpos++] = '"';
> > +           break;
> > +
> > +         case '/':
> > +           result[resultpos++] = '/';
> > +           break;
> > +
> > +         case '\\':
> > +           result[resultpos++] = '\\';
> > +           break;
> > +
> > +         case 'b':
> > +           result[resultpos++] = '\b';
> > +           break;
> > +
> > +         case 'f':
> > +           result[resultpos++] = '\f';
> > +           break;
> > +
> > +         case 'r':
> > +           result[resultpos++] = '\r';
> > +           break;
> > +
> > +         case 'n':
> > +           result[resultpos++] = '\n';
> > +           break;
> > +
> > +         case 't':
> > +           result[resultpos++] = '\t';
> > +           break;
> > +
> > +         case 'u':
> > +           {
> > +             char values[4] = {0};
> > +             unsigned i;
> > +
> > +             inpos++;
> > +             if (inpos + ARRAY_SIZE(values) > inlen)
> > +               {
> > +                 ret = grub_error (GRUB_ERR_BAD_ARGUMENT, N_("unicode 
> > sequence too short"));
> > +                 goto err;
> > +               }
>
> I recommend relaxing these errors a little bit. While technically
> correct, I'm thinking it might serve GRUB better to be a bit more
> forgiving.
>
> Perhaps something like:
>
> if the input is '\u????' where ? is not a hex digit, set the value to
> 'u', and process ???? separately.
>
> if ???? is less than ARRAY_SIZE chars, left-pad ???? with 0s* until it's
> the correct length then convert the value.
>
> (* I don't mean to literally pad. I mean to treat the short number as if
> it was left-padded with 0s)
>
> > +
> > +             for (i = 0; i < ARRAY_SIZE(values); i++)
> > +               {
> > +                 char c = in[inpos++];
> > +
> > +                 if (c >= '0' && c <= '9')
> > +                   values[i] = c - '0';
> > +                 else if (c >= 'A' && c <= 'F')
> > +                   values[i] = c - 'A' + 10;
> > +                 else if (c >= 'a' && c <= 'f')
> > +                   values[i] = c - 'a' + 10;
> > +                 else
> > +                   {
>
> With a similar argument here. Treat the short ???? as suggested above
> and resume normal processing at the non-hex digit position.
>
> > +                     ret = grub_error (GRUB_ERR_BAD_ARGUMENT,
> > +                                       N_("unicode sequence with invalid 
> > character '%c'"), c);
> > +                     goto err;
> > +                   }
> > +               }
> > +
> > +             if (values[0] != 0 || values[1] != 0)
> > +               result[resultpos++] = values[0] << 4 | values[1];
> > +             result[resultpos++] = values[2] << 4 | values[3];
> > +
> > +             /* Offset the increment that's coming in via the loop 
> > increment. */
> > +             inpos--;
> > +
> > +             break;
> > +           }
> > +
> > +         default:
> > +           ret = grub_error (GRUB_ERR_BAD_ARGUMENT, N_("unrecognized 
> > escaped character '%c'"), in[inpos]);
> > +           goto err;
>
> And the final prong of the suggestion would be here. Treat the improper
> escape as if it had not been escaped at all or as '\\?' where '?' is the
> escaped character.

Nicholas comments make sense for me.

Patrick?

Daniel



reply via email to

[Prev in Thread] Current Thread [Next in Thread]