Re: [GNUnet-developers] r30485
From: Christian Grothoff
Subject: Re: [GNUnet-developers] r30485
Date: Thu, 31 Oct 2013 12:37:24 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20131005 Icedove/17.0.9

Well, r30485 did not change that assumption: as you can see, the code
before _also_ simply assumed that 'output' was 'big enough'. The
few places in the code that call this function all allocate 'output'
as strlen(input)+1, so in cases where the UTF-8 toupper returns a longer
string, the code was already incorrect before r30485 in the same way.
With the new API it is perhaps just more obvious that the caller is/was
expected to allocate 'output' (and that this might be asking a bit much).

Maybe we should just change this API to *return* an allocated string
instead of passing in 'output'? I don't quite understand why this API
was written like this to begin with; returning the uppercase string
would seem more natural.

If you change this, please also change the tolower function in the same
way.

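For illustration, here is a rough sketch of what the "return an allocated
string" variant could look like. It is not the actual GNUnet code: the name
utf8_toupper_alloc is made up, plain malloc() is used where GNUnet would use
its own allocator, and standin_u8_toupper() is a hypothetical stand-in that
only mimics the contract of libunistring's u8_toupper() as it is called in
the diff (it returns a malloc'ed buffer that is not NUL-terminated and
reports the length through an out-parameter). The stand-in only handles ASCII
letters and "é", so that the sketch compiles without libunistring.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL stand-in for u8_toupper (..., UNINORM_NFD, ...).
 * Handles only ASCII letters and U+00E9; everything else is copied as-is.
 * Returns a malloc'ed buffer WITHOUT a trailing NUL and stores the number
 * of bytes written in *lengthp. */
static uint8_t *
standin_u8_toupper (const uint8_t *s, size_t n, size_t *lengthp)
{
  /* In this stand-in, 2 input bytes grow to at most 3 output bytes,
     so 2 * n + 1 bytes are always enough (and never zero). */
  uint8_t *out = malloc (2 * n + 1);
  size_t i = 0;
  size_t o = 0;

  if (NULL == out)
    return NULL;
  while (i < n)
  {
    if ( (i + 1 < n) && (0xC3 == s[i]) && (0xA9 == s[i + 1]) )
    {
      /* "é" (C3 A9) becomes 'E' followed by U+0301 (CC 81) in NFD:
         two input bytes turn into three output bytes. */
      out[o++] = 'E';
      out[o++] = 0xCC;
      out[o++] = 0x81;
      i += 2;
    }
    else if ( (s[i] >= 'a') && (s[i] <= 'z') )
    {
      out[o++] = (uint8_t) (s[i] - 'a' + 'A');
      i++;
    }
    else
    {
      out[o++] = s[i++];
    }
  }
  *lengthp = o;
  return out;
}

/* Hypothetical replacement API: returns a newly allocated, NUL-terminated
 * uppercase copy of 'input', or NULL on allocation failure.
 * The caller is responsible for calling free() on the result. */
char *
utf8_toupper_alloc (const char *input)
{
  uint8_t *tmp;
  size_t len;
  char *output;

  tmp = standin_u8_toupper ((const uint8_t *) input, strlen (input), &len);
  if (NULL == tmp)
    return NULL;
  /* Size the result from the length the conversion reported,
     not from strlen (input). */
  output = malloc (len + 1);
  if (NULL != output)
  {
    memcpy (output, tmp, len);
    output[len] = '\0';
  }
  free (tmp);
  return output;
}
```

With the input "caf\xC3\xA9" (5 bytes) the result is 6 bytes plus the
terminator, so a caller-side buffer of strlen(input)+1 would be one byte
too small. The tolower function could follow the same pattern.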
Happy hacking!
Christian
On 10/31/2013 12:09 AM, LRN wrote:
> On 30.10.2013 22:15, address@hidden wrote:
>> Author: grothoff
>> Date: 2013-10-30 19:15:48 +0100 (Wed, 30 Oct 2013)
>> New Revision: 30485
>
>> /**
>> - * Convert the utf-8 input string to uppercase
>> - * Output needs to be allocated appropriately
>> + * Convert the utf-8 input string to uppercase.
>> + * Output needs to be allocated appropriately.
>> *
>> * @param input input string
>> * @param output output buffer
>> */
>> void
>> -GNUNET_STRINGS_utf8_toupper(const char* input, char** output)
>> +GNUNET_STRINGS_utf8_toupper(const char *input,
>> + char *output)
>> {
>> uint8_t *tmp_in;
>> size_t len;
>
>> tmp_in = u8_toupper ((uint8_t*)input, strlen ((char *) input),
>> NULL, UNINORM_NFD, NULL, &len);
>> - memcpy(*output, tmp_in, len);
>> - (*output)[len] = '\0';
>> - free(tmp_in);
>> + memcpy (output, tmp_in, len);
>> + output[len] = '\0';
>> + free (tmp_in);
>> }
>
> u8_toupper allocates its output, then you copy it into the buffer that
> user provided, using the length that u8_toupper reported (not the actual
> length of the buffer).
>
> I'm not sure that this conversion always produces the output that has
> the same length as the input (which is, AFAIU, what you're relying on),
> not for all languages.
>
> The docs that i've found on UNINORM_NFD do not indicate (AFAICU) that
> this is some kind of special transform that guarantees the same (or
> less) number of bytes in the output.