[bug #60287] Windows recursive download escapes utf8 URLs twice
From: Cameron Tacklind
Subject: [bug #60287] Windows recursive download escapes utf8 URLs twice
Date: Fri, 26 Mar 2021 03:23:06 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36
Follow-up Comment #3, bug #60287 (project wget):
Thank you. I had not tried those options.
Curiously, the only option that I needed was *--local-encoding=utf8*. The
--remote-encoding option had no effect: the detected URI encoding stayed CP1252.
*Without --local-encoding=utf8*
Loaded example.com/wget-test.html (size 71).
URI encoding = 'CP1252'
example.com/wget-test.html: merge('http://example.com/wget-test.html',
'space-ok%20cyrillic-not%D0%B3.txt') ->
http://example.com/space-ok%20cyrillic-not%D0%B3.txt
converted 'http://example.com/space-ok%20cyrillic-not%D0%B3.txt' (CP1252) ->
'http://example.com/space-ok cyrillic-notг.txt' (UTF-8)
appending 'http://example.com/space-ok%20cyrillic-not%C3%90%C2%B3.txt' to
urlpos.
*With --local-encoding=utf8*
Loaded example.com/wget-test.html (size 71).
URI encoding = 'utf8'
example.com/wget-test.html: merge('http://example.com/wget-test.html',
'space-ok%20cyrillic-not%D0%B3.txt') ->
http://example.com/space-ok%20cyrillic-not%D0%B3.txt
converted 'http://example.com/space-ok%20cyrillic-not%D0%B3.txt' (utf8) ->
'http://example.com/space-ok cyrillic-notг.txt' (UTF-8)
appending 'http://example.com/space-ok%20cyrillic-not%D0%B3.txt' to urlpos.
Regardless, this still feels like a bug to me. But maybe the issue is just how
wget implements the recursive download and isn't really fixable?
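
For anyone trying to reproduce the byte sequences in the two logs above, here is a minimal Python sketch of what appears to be happening. It is not wget's code; the function name `reescape` is hypothetical, and the assumption is simply that the percent-escaped bytes are decoded in the detected "URI encoding", converted to UTF-8, and escaped again.

```python
from urllib.parse import quote, unquote_to_bytes


def reescape(escaped: str, assumed_encoding: str) -> str:
    """Hypothetical model of the conversion seen in the debug log.

    Unescape the percent-encoded bytes, interpret them in the assumed
    encoding, then re-encode as UTF-8 and percent-escape again.
    """
    raw = unquote_to_bytes(escaped)        # the original bytes, e.g. D0 B3
    text = raw.decode(assumed_encoding)    # misread if the encoding is wrong
    return quote(text.encode("utf-8"), safe="")


# Bytes D0 B3 read as CP1252 become two characters, each of which
# needs two bytes in UTF-8, so the escape doubles in length.
print(reescape("%D0%B3", "cp1252"))  # %C3%90%C2%B3

# Read as UTF-8, the same bytes are one Cyrillic letter and round-trip intact.
print(reescape("%D0%B3", "utf-8"))   # %D0%B3
```

The first result matches the URL appended to urlpos without --local-encoding=utf8, and the second matches the one appended with it, which is consistent with the already-UTF-8 bytes being treated as CP1252.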
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?60287>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/