help-libidn

Re: the idn programme - two bugs: -n switch, case folding


From: Simon Josefsson
Subject: Re: the idn programme - two bugs: -n switch, case folding
Date: Tue, 26 Aug 2008 10:53:41 +0200
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/22.2 (gnu/linux)

John McGowan <address@hidden> writes:

> 1: The "-n" switch.
>
>    echo "This.com" | idn -n
>      invalid option -- n

Hi!  Thanks for the report.

This is a case where the documentation was wrong: the intended short
parameter is -k, and it should work fine.  However, fixing the
documentation means changing translated messages, so it appears simpler
to change the code so that -n really means --nfkc.  I've done so in:

http://git.savannah.gnu.org/gitweb/?p=libidn.git;a=commitdiff;h=c56cc0261209cbebeb4a0e07afd77d3abbc06ee9

>    echo "This.com" | idn --nfkc
>      This.com

This seems correct to me.
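
To illustrate why the output is unchanged: NFKC normalization replaces
compatibility characters with their plain equivalents but never changes
case.  Here is a minimal sketch using Python's standard unicodedata
module rather than the idn tool itself; the helper name nfkc is only
for illustration.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text (no case folding)."""
    return unicodedata.normalize("NFKC", text)


# Plain ASCII is already in NFKC form, so nothing changes.
print(nfkc("This.com"))        # This.com

# A fullwidth capital T (U+FF34) is mapped to an ordinary "T",
# but it stays upper-case.
print(nfkc("\uFF34his.com"))   # This.com
```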

> 2: Case folding
>
>    echo "This.com" | idn --nfkc
>      This.com
>    
>    echo "This.com" | idn -s
>      this.com
>    
>    but
>    
>    echo "This.com" | idn -a
>      This.com
>    
>    Shouldn't the last be lower-cased?

No.  The ToASCII operation doesn't use StringPrep/NamePrep if the input
is all ASCII; see the algorithm in RFC 3490 section 4.1:

   1. If the sequence contains any code points outside the ASCII range
      (0..7F) then proceed to step 2, otherwise skip to step 3.

   2. Perform the steps specified in [NAMEPREP] and fail if there is an
      error.  The AllowUnassigned flag is used in [NAMEPREP].

   3. If the UseSTD3ASCIIRules flag is set, then perform these checks:

     (a) Verify the absence of non-LDH ASCII code points; that is, the
         absence of 0..2C, 2E..2F, 3A..40, 5B..60, and 7B..7F.

     (b) Verify the absence of leading and trailing hyphen-minus; that
         is, the absence of U+002D at the beginning and end of the
         sequence.

   4. If the sequence contains any code points outside the ASCII range
      (0..7F) then proceed to step 5, otherwise skip to step 8.

   5. Verify that the sequence does NOT begin with the ACE prefix.

   6. Encode the sequence using the encoding algorithm in [PUNYCODE] and
      fail if there is an error.

   7. Prepend the ACE prefix.

   8. Verify that the number of code points is in the range 1 to 63
      inclusive.

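For your input, steps 1, 3, 4, and 8 will be executed.  Step 2
(NamePrep), which is where the case folding happens, is skipped, so the
upper-case T is preserved.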
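
The steps quoted above can be sketched in Python for a single label.
This is only an illustration of the control flow, not the libidn code:
the function name to_ascii_sketch and its use_std3_ascii_rules parameter
are made up for this example, NamePrep is borrowed from Python's
encodings.idna module, and the Punycode step uses Python's built-in
"punycode" codec.

```python
from encodings.idna import nameprep

ACE_PREFIX = "xn--"


def is_ascii(label: str) -> bool:
    """True if every code point is in the ASCII range 0..7F."""
    return all(ord(ch) <= 0x7F for ch in label)


def is_non_ldh(cp: int) -> bool:
    """True for ASCII code points that are not letters, digits or hyphen."""
    return (cp <= 0x2C or 0x2E <= cp <= 0x2F or 0x3A <= cp <= 0x40
            or 0x5B <= cp <= 0x60 or 0x7B <= cp <= 0x7F)


def to_ascii_sketch(label: str, use_std3_ascii_rules: bool = False) -> str:
    # Steps 1-2: NamePrep (which includes case folding) is applied
    # only when the label contains non-ASCII code points.
    if not is_ascii(label):
        label = nameprep(label)

    # Step 3: optional STD3 checks.
    if use_std3_ascii_rules:
        if any(is_non_ldh(ord(ch)) for ch in label):
            raise ValueError("non-LDH ASCII code point in label")
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("leading or trailing hyphen-minus")

    # Steps 4-7: only non-ASCII labels are Punycode-encoded.
    if not is_ascii(label):
        if label.lower().startswith(ACE_PREFIX):
            raise ValueError("label already begins with the ACE prefix")
        label = ACE_PREFIX + label.encode("punycode").decode("ascii")

    # Step 8: length check.
    if not 1 <= len(label) <= 63:
        raise ValueError("label must be 1 to 63 code points long")
    return label


print(to_ascii_sketch("This"))     # This  (all ASCII, step 2 skipped)
print(nameprep("This"))            # this  (NamePrep alone folds case)
print(to_ascii_sketch("B\u00fccher"))  # xn--bcher-kva
```

The first two calls show the difference reported above: the all-ASCII
label passes through ToASCII with its case intact, while running
NamePrep on its own lower-cases it.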

Thanks,
Simon



