Re: [Bug-wget] Support non-ASCII URLs (Was: GNU wget 1.17.1 released)
From: Eli Zaretskii
Subject: Re: [Bug-wget] Support non-ASCII URLs (Was: GNU wget 1.17.1 released)
Date: Tue, 15 Dec 2015 19:15:04 +0200
> Date: Sun, 13 Dec 2015 20:04:31 +0100
> From: "Andries E. Brouwer" <address@hidden>
> Cc: "Andries E. Brouwer" <address@hidden>, address@hidden
>
> On Sun, Dec 13, 2015 at 08:01:27PM +0200, Eli Zaretskii wrote:
>
> > If no one is going to pick up the gauntlet, I will sit down and do it
> > myself, although I'm terribly busy with Emacs 25.1 release.
>
> Good!
OK, I'm ready to send the patch series. I tested it on GNU/Linux and
on MS-Windows, and it passed all my tests.

I will send the patch in 2 parts. This 1st part stops wget from
treating codepoints between 128 and 159 as control characters.
Treating them that way only makes sense with ISO-8859 encodings,
which are used by a tiny minority of systems nowadays. Both UTF-8
and the Windows codepages have printable characters and/or meaningful
codes in that range that must not be munged.
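
To illustrate the point about that range, here is a minimal sketch (not part of the patch) that counts the bytes of a string falling between 128 and 159. The helper name count_c1_range_bytes is hypothetical and does not exist in wget.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: count the bytes of S
   whose value lies in the 128-159 (0x80-0x9F) range.  */
static size_t
count_c1_range_bytes (const char *s)
{
  size_t n = 0;
  const unsigned char *p;

  for (p = (const unsigned char *) s; *p != '\0'; p++)
    if (*p >= 0x80 && *p <= 0x9f)
      n++;
  return n;
}
```

For example, U+00C0 (LATIN CAPITAL LETTER A WITH GRAVE) is encoded in UTF-8 as the bytes 0xC3 0x80, so count_c1_range_bytes ("\xC3\x80") yields 1; in Windows-1252 the single byte 0x80 is the Euro sign. Escaping such bytes as control characters would mangle perfectly printable text.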

If we want to preserve back-compatibility in this respect, then a
variant of Tim's or Andries's patch could be used here, but the test
in it should be inverted: we should treat these codepoints as control
characters only if the locale's codeset is ISO-8859-SOMETHING. All
the other codesets should pass these codes unaltered.
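
The inverted test could look roughly like the sketch below. This is not taken from Tim's or Andries's patch; the function names are hypothetical, and it assumes a POSIX system where nl_langinfo (CODESET) is available (on MS-Windows a replacement such as the one in Gnulib would be needed) and where the program has already called setlocale (LC_ALL, "").

```c
#include <langinfo.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: true if NAME looks like an ISO-8859 codeset
   name.  The spellings accepted here are an assumption; systems
   differ in how they name these codesets.  */
static bool
codeset_is_iso8859 (const char *name)
{
  return name != NULL
         && (strncasecmp (name, "ISO-8859", 8) == 0
             || strncasecmp (name, "ISO_8859", 8) == 0
             || strncasecmp (name, "ISO8859", 7) == 0);
}

/* Hypothetical helper: query the current locale's codeset.  */
static bool
current_locale_is_iso8859 (void)
{
  return codeset_is_iso8859 (nl_langinfo (CODESET));
}

/* Treat bytes 128-159 as control characters only for ISO-8859
   codesets; every other codeset passes them through unaltered.  */
static bool
is_c1_control (unsigned char c, const char *codeset)
{
  return c >= 128 && c <= 159 && codeset_is_iso8859 (codeset);
}
```

With this shape, is_c1_control (0x85, "ISO-8859-1") is true, while the same byte under "UTF-8" or "CP1252" is left alone.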
diff --git a/src/url.c b/src/url.c
index c62867f..d984bf7 100644
--- a/src/url.c
+++ b/src/url.c
@@ -1399,8 +1404,8 @@ UVWC, VC, VC, VC, VC, VC, VC, VC, /* NUL SOH STX ETX EOT ENQ ACK BEL */
  0, 0, 0, 0, 0, 0, 0, 0, /* p q r s t u v w */
  0, 0, 0, 0, W, 0, 0, C, /* x y z { | } ~ DEL */
- C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, /* 128-143 */
- C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, C, /* 144-159 */
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 128-143 */
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 144-159 */
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,