Re: idna_to_unicode_8z8z() takes a stroll through the heap

help-libidn

From:	Simon Josefsson
Subject:	Re: idna_to_unicode_8z8z() takes a stroll through the heap
Date:	Wed, 05 Jun 2013 23:14:10 +0200
User-agent:	Gnus/5.130006 (Ma Gnus v0.6) Emacs/24.3 (gnu/linux)

Sam Varshavchik <address@hidden> writes:

>         char *p=strdup("example.com\xe3");
>         err=idna_to_unicode_8z8z(p, &utf8_ptr, 0);
...
> When g_utf8_next_char() gets 0xe3, this loop will merrily skip over
> the trailing \0 in the C string, and off it goes, into merry-land.

Right.
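
To make the overrun concrete, here is a minimal sketch of a walk that advances by a length taken from the lead byte alone, which is the behaviour described above. The names utf8_skip_len and walk_end are hypothetical and not part of libidn or glib; the length table is an assumption modelled on the glib-style skip table. The walk only reads bytes before the terminator, so it is safe to run, and it reports where the pointer would land.

```c
#include <stddef.h>
#include <string.h>

/* Sequence length implied by the lead byte alone (hypothetical helper,
   assumed to mirror a glib-style skip table).  No continuation bytes
   are inspected. */
size_t utf8_skip_len(unsigned char lead)
{
    if (lead < 0xC0) return 1;  /* ASCII, or a stray continuation byte */
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;  /* 0xE3 falls here */
    if (lead < 0xF8) return 4;
    if (lead < 0xFC) return 5;
    if (lead < 0xFE) return 6;
    return 1;                   /* 0xFE and 0xFF are never valid */
}

/* Offset at which a lead-byte-driven walk stops.  The loop only reads
   bytes at offsets below strlen(s); a return value greater than
   strlen(s) means the walk has stepped over the terminating NUL. */
size_t walk_end(const char *s)
{
    size_t len = strlen(s);
    size_t off = 0;

    while (off < len)
        off += utf8_skip_len((unsigned char)s[off]);
    return off;
}
```

For "example.com\xe3" the string is 12 bytes long with the NUL at offset 12. The walk reaches 0xe3 at offset 11 and advances by 3, landing at offset 14, which is past both the NUL and the 13-byte allocation returned by strdup().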

> Yes, idna_to_unicode_8z8z() is documented as taking valid UTF-8 for
> input. But, is it unreasonable for me to take an address from an
> E-mail header, and feed it to idna_to_unicode_8z8z(), without having
> to validate it for properly UTF-8ness?

It has to be validated for proper UTF-8-ness.  The UTF-8 functions in
libidn (copied from glib) assume valid UTF-8 strings.
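
As an illustration of what such validation could look like on the caller's side, here is a minimal sketch. utf8_is_valid is a hypothetical helper, not a libidn function; it rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates, and code points above U+10FFFF.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libidn: returns true if the
   NUL-terminated string s is well-formed UTF-8. */
bool utf8_is_valid(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned long cp, min;
        int n;  /* number of continuation bytes expected */

        if (*p < 0x80) { p++; continue; }  /* plain ASCII */
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; n = 1; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; n = 2; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; n = 3; min = 0x10000; }
        else return false;  /* stray continuation or invalid lead byte */

        p++;
        while (n-- > 0) {
            /* A NUL is not a continuation byte, so a truncated sequence
               fails here and we never read past the terminator. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (cp < min) return false;                      /* overlong */
        if (cp > 0x10FFFF) return false;                 /* out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* surrogate */
    }
    return true;
}
```

A caller would check utf8_is_valid(p) first and only pass p to idna_to_unicode_8z8z() when it returns true; the string "example.com\xe3" from the report is rejected because the NUL follows the 0xe3 lead byte.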

I agree it is way too easy to end up using libidn the way you did.  I'm
torn between improving the documentation to explain the issue and adding
input sanitization to all libidn functions that accept UTF-8 data.  I know
IDNA operations are a performance bottleneck in some environments, and
validating UTF-8 takes some CPU time, though probably not that much...

/Simon
