Re: the idn programme - two bugs: -n switch, case folding
From: Simon Josefsson
Subject: Re: the idn programme - two bugs: -n switch, case folding
Date: Tue, 26 Aug 2008 10:53:41 +0200
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/22.2 (gnu/linux)
John McGowan <address@hidden> writes:
> 1: The "-n" switch.
>
> echo "This.com" | idn -n
> invalid option -- n
Hi! Thanks for the report.
This is a case where the documentation was wrong: the intended short
option is -k, and that should work fine. However, fixing the
documentation means changing translated messages, so it appears simpler
to change the code so that -n really means --nfkc. I've done so in:
http://git.savannah.gnu.org/gitweb/?p=libidn.git;a=commitdiff;h=c56cc0261209cbebeb4a0e07afd77d3abbc06ee9
> echo "This.com" | idn --nfkc
> This.com
This seems correct to me.
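As a rough illustration of why the output is unchanged: NFKC normalization
replaces compatibility characters but does not fold case. The sketch below
uses Python's unicodedata module as a stand-in for idn --nfkc; that
equivalence is an assumption (Python normalizes against a newer Unicode
version than the 3.2 tables used by libidn), and the nfkc helper name is
made up for this example.

```python
import unicodedata

def nfkc(text):
    """Apply Unicode NFKC normalization (hypothetical stand-in for idn --nfkc)."""
    return unicodedata.normalize("NFKC", text)

# Plain ASCII passes through untouched, capital letter included.
print(nfkc("This.com"))

# A fullwidth capital T (U+FF34) becomes an ordinary capital T,
# but it is still upper case: NFKC does no case folding.
print(nfkc("\uff34his.com"))
```

Both calls should print This.com.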
> 2: Case folding
>
> echo "This.com" | idn --nfkc
> This.com
>
> echo "This.com" | idn -s
> this.com
>
> but
>
> echo "This.com" | idn -a
> This.com
>
> Shouldn't the last be lower-cased?
No, the ToASCII operation doesn't use StringPrep/NamePrep if the input
is all ASCII; see the algorithm in RFC 3490 section 4.1:
    1. If the sequence contains any code points outside the ASCII range
       (0..7F) then proceed to step 2, otherwise skip to step 3.
    2. Perform the steps specified in [NAMEPREP] and fail if there is an
       error. The AllowUnassigned flag is used in [NAMEPREP].
    3. If the UseSTD3ASCIIRules flag is set, then perform these checks:
       (a) Verify the absence of non-LDH ASCII code points; that is, the
           absence of 0..2C, 2E..2F, 3A..40, 5B..60, and 7B..7F.
       (b) Verify the absence of leading and trailing hyphen-minus; that
           is, the absence of U+002D at the beginning and end of the
           sequence.
    4. If the sequence contains any code points outside the ASCII range
       (0..7F) then proceed to step 5, otherwise skip to step 8.
    5. Verify that the sequence does NOT begin with the ACE prefix.
    6. Encode the sequence using the encoding algorithm in [PUNYCODE] and
       fail if there is an error.
    7. Prepend the ACE prefix.
    8. Verify that the number of code points is in the range 1 to 63
       inclusive.
For your input, steps 1, 3, 4, and 8 will be executed. Step 2 (NamePrep,
which is where case folding happens) is skipped, so the case is preserved.
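The RFC steps quoted above can be sketched roughly as follows. This is a
minimal illustration, not the libidn code: it borrows nameprep and the
"punycode" codec from Python's standard library, does not model the
AllowUnassigned flag, and the names to_ascii and to_ascii_domain are
hypothetical helpers invented for this example.

```python
from encodings.idna import nameprep  # stdlib implementation of the NAMEPREP profile

ACE_PREFIX = "xn--"

def is_ascii(text):
    return all(ord(ch) < 0x80 for ch in text)

def to_ascii(label, use_std3_ascii_rules=False):
    """Sketch of RFC 3490 section 4.1 (ToASCII) for a single label."""
    # Steps 1-2: only labels with non-ASCII code points go through NAMEPREP.
    if not is_ascii(label):
        label = nameprep(label)
    # Step 3: optional STD3 checks (letters, digits, hyphen only).
    if use_std3_ascii_rules:
        for ch in label:
            if ord(ch) < 0x80 and not (ch.isalnum() or ch == "-"):
                raise ValueError("non-LDH ASCII code point: %r" % ch)
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("leading or trailing hyphen-minus")
    # Steps 4-7: only labels that are still non-ASCII get Punycode-encoded.
    if not is_ascii(label):
        if label.lower().startswith(ACE_PREFIX):
            raise ValueError("label already begins with the ACE prefix")
        label = ACE_PREFIX + label.encode("punycode").decode("ascii")
    # Step 8: length check.
    if not 1 <= len(label) <= 63:
        raise ValueError("label must be 1 to 63 code points long")
    return label

def to_ascii_domain(name):
    """Apply to_ascii to each dot-separated label (hypothetical helper)."""
    return ".".join(to_ascii(part) for part in name.split("."))

print(to_ascii_domain("This.com"))    # all ASCII: NAMEPREP skipped, case kept
print(to_ascii_domain("B\u00fccher.com"))  # non-ASCII: NAMEPREP lower-cases first
```

In this sketch the first call returns This.com unchanged, while the second
label is lower-cased by NAMEPREP before being encoded, giving
xn--bcher-kva.com.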
Thanks,
Simon