wget-dev

wget2 | "utf8" charset breaks urls with invalid utf-8 (#523)


From: .
Subject: wget2 | "utf8" charset breaks urls with invalid utf-8 (#523)
Date: Wed, 22 Apr 2020 01:20:02 +0000


_ created an issue: https://gitlab.com/gnuwget/wget2/-/issues/523



I'm using wget2 1.99.2 on Gentoo.

```html
<!doctype html>
<meta charset="utf8">
<img src="/test%E4.jpg">
```

When I try to `wget2 --mirror` a page containing this, wget2 refuses to fetch
the image and prints the following output:

```
Failed to transcode 'utf8' string into 'utf-8' (84)
URL '/test%E4.jpg' not followed (conversion failed)
```
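To make the failure easier to see, here is a minimal Python sketch (not wget2 code) of what `%E4` turns into once it is percent-decoded. The helper name `is_valid_utf8` is made up for illustration. The single byte `0xE4` is not a valid UTF-8 sequence on its own, which is presumably why a strict transcoding step rejects it; the `(84)` in the message may be `EILSEQ` on Linux, but that is an assumption.

```python
from urllib.parse import unquote_to_bytes


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Percent-decoding the URL from the test page yields raw bytes.
raw = unquote_to_bytes("/test%E4.jpg")

print(raw)                 # b'/test\xe4.jpg'
print(is_valid_utf8(raw))  # False: 0xE4 starts a multi-byte sequence that never completes

# The same byte is a complete character in ISO-8859-1 (a-umlaut),
# so the URL was most likely written with a Latin-1 file name in mind.
latin1_path = raw.decode("latin-1")
```

The percent-encoded form itself is plain ASCII, so the URL can be requested as-is without interpreting the escaped byte at all.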

If `charset="utf8"` is replaced with `charset="utf-8"`, the image downloads
without problems.
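One plausible reading of this difference is that the two labels are compared as plain strings, so `utf8` is treated as a different encoding from `utf-8` and a conversion is attempted. I have not checked the wget2 source, so treat that as a guess. The sketch below shows the kind of label normalization that would make both spellings equivalent; `same_encoding` is a hypothetical helper, and it uses Python's codec alias table as a stand-in for the label list in the WHATWG Encoding Standard, which also lists `utf8` as a label for UTF-8.

```python
import codecs


def same_encoding(label_a: str, label_b: str) -> bool:
    """Hypothetical helper: compare charset labels by the codec they resolve to."""
    try:
        name_a = codecs.lookup(label_a.strip()).name
        name_b = codecs.lookup(label_b.strip()).name
    except LookupError:
        # Unknown label: do not claim the encodings match.
        return False
    return name_a == name_b


print(same_encoding("utf8", "utf-8"))       # True
print(same_encoding("utf8", "iso-8859-1"))  # False
```

With a check like this, a document declared as `utf8` would need no transcoding to `utf-8`, and the URL would be handled the same way under either label.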

The old wget downloads it successfully with both charset values, and Chromium
displays the image in both cases too.

Build configuration from `-V`: `+digest +https +ssl/gnutls +ipv6 +iri 
+large-file +nls -ntlm -opie +psl -hsts +iconv +idn2 +zlib +lzma +brotlidec 
+zstd +bzip2 +http2 -gpgme`
