Re: [Bug-wget] wget alpha release 1.14.96-38327
From: Andries E. Brouwer
Subject: Re: [Bug-wget] wget alpha release 1.14.96-38327
Date: Tue, 7 Jan 2014 21:14:27 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Jan 07, 2014 at 09:54:46PM +0530, Darshit Shah wrote:
> Anything still blocking the release?
>
> 12 month release cycle sounds good to me. I'm trying to replicate the
> aforementioned issues, but no luck still.
wget still mangles non-ASCII filenames when saving them:
$ echo $LC_CTYPE
en_US.UTF8
$ wget -r -np http://jinix.sourceforge.net/go/sgf/01.诘棋总动员/育苗工程手筋300题/index.html
...
Total wall clock time: 42s
Downloaded: 301 files, 106K in 0.2s (427 KB/s)
$ ls jinix.sourceforge.net/go/sgf/01.诘棋总动员
ls: cannot access jinix.sourceforge.net/go/sgf/01.诘棋总动员: No such file or directory
$ ls jinix.sourceforge.net/go/sgf
01.??%98??%8B?%80??%8A??%91%98
The resulting filename is mangled and cannot be typed on this system:
it is UTF-8, but in the middle of the UTF-8 characters some bytes have
been escaped as %XX, as if they were high ISO-8859-1 bytes.
The result is valid in no character set.
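The listing above is consistent with a simple rule: every byte of the
UTF-8 name in the range 0x80-0x9F is written as %XX, and every other
byte is left alone. The sketch below is only a model of that observed
output, not wget's code; the function names escape_c1_bytes and
as_ls_shows are made up for the illustration, and the "?" rendering
assumes that ls prints one "?" per byte it cannot display.

```python
def escape_c1_bytes(name: str) -> bytes:
    """Model of the observed mangling (an assumption, not wget's source):
    percent-escape UTF-8 bytes in 0x80-0x9F, keep all other bytes."""
    out = bytearray()
    for b in name.encode("utf-8"):
        if 0x80 <= b <= 0x9F:
            out += b"%%%02X" % b   # e.g. 0x98 becomes b"%98"
        else:
            out.append(b)
    return bytes(out)


def as_ls_shows(raw: bytes) -> str:
    """Rough model of the ls display: ASCII as is, any other byte as '?'."""
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


mangled = escape_c1_bytes("01.诘棋总动员")
print(as_ls_shows(mangled))   # 01.??%98??%8B?%80??%8A??%91%98

# The mangled bytes are no longer valid UTF-8.
try:
    mangled.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False
print(still_valid)            # False
```

For example, 诘 is E8 AF 98 in UTF-8: the first two bytes are kept and
only the last one falls in 0x80-0x9F, which gives the "??%98" at the
start of the listing.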
The only remedy is to throw away this default wget output, find the
option that makes wget do the right thing, and start all over again:
$ rm -r jinix.sourceforge.net
$ wget --restrict-file-names=nocontrol ...
$ ls jinix.sourceforge.net/go/sgf/01.诘棋总动员
育苗工程手筋300题
Now it works.
Andries