[Bug-wget] Failing tests
From: Tim Ruehsen
Subject: [Bug-wget] Failing tests
Date: Thu, 02 Oct 2014 17:10:26 +0200
User-agent: KMail/4.14.1 (Linux/3.16-2-amd64; KDE/4.14.1; x86_64; ; )
With a non-"C" locale, Wget repeatably fails three tests:
FAIL: Test-iri.px
FAIL: Test-iri-percent.px
FAIL: Test-iri-forced-remote.px
Example (of course, you must have en_US.utf8 installed):
TESTS_ENVIRONMENT="LC_ALL=en_US.utf8" make check
The simplest is Test-iri-percent.px, so I added -d to the Wget command line
and ran two tests:
1. success with LC_ALL=C
$ cd tests
$ LC_ALL=C ./Test-iri-percent.px
#### snip ####
Loaded index.html (size 195).
URI encoding = 'ANSI_X3.4-1968'
index.html: merge('http://localhost:57052/',
'http://localhost:57052/hello_%E7\351.html') ->
http://localhost:57052/hello_%E7\351.html
Incomplete or invalid multibyte sequence encountered
appending 'http://localhost:57052/hello_%E7\351.html' to urlpos.
no-follow in index.html: 0
Deciding whether to enqueue "http://localhost:57052/hello_%E7�.html".
Decided to load it.
URI encoding = 'ISO-8859-15'
Enqueuing http://localhost:57052/hello_%E7\351.html at depth 1
Queue count 1, maxcount 1.
[IRI Enqueuing 'http://localhost:57052/hello_%E7\351.html' with 'ISO-8859-15'
Dequeuing http://localhost:57052/hello_%E7\351.html at depth 1
Queue count 0, maxcount 1.
--2014-10-02 16:39:13-- http://localhost:57052/hello_%E7%C3%A9.html
Reusing existing connection to localhost:57052.
Reusing fd 4.
---request begin---
GET /hello_%E7%C3%A9.html HTTP/1.1
#### snap ####
But did you see "Incomplete or invalid multibyte sequence encountered"? This
indicates a faulty charset conversion, even though the test succeeds.
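One plausible reading of that message, sketched in Python rather than taken from the Wget sources: the link contains the raw byte \351 (0xE9), which is not valid in the locale charset reported above (ANSI_X3.4-1968, i.e. ASCII) but is valid in ISO-8859-15. The helper name try_decode and the RAW constant are made up for this illustration.

```python
# Illustration only: this is not Wget code. RAW mirrors the link from the log,
# with %E7 as a literal percent escape and \xe9 as a raw byte.
RAW = b"hello_%E7\xe9.html"


def try_decode(data: bytes, charset: str):
    """Return the decoded text, or None if the bytes are invalid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# ASCII cannot represent byte 0xE9, so the conversion fails.
print(try_decode(RAW, "ascii"))
# ISO-8859-15 maps 0xE9 to U+00E9, so the conversion succeeds.
print(try_decode(RAW, "iso-8859-15"))
```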
2. failure with LC_ALL=en_US.UTF-8
$ cd tests
$ LC_ALL=en_US.UTF-8 ./Test-iri-percent.px
#### snip ####
Loaded index.html (size 195).
URI encoding = ‘UTF-8’
index.html: merge(‘http://localhost:54675/’,
‘http://localhost:54675/hello_%E7\351.html’) ->
http://localhost:54675/hello_%E7\351.html
appending ‘http://localhost:54675/hello_%E7%E9.html’ to urlpos.
no-follow in index.html: 0
Deciding whether to enqueue "http://localhost:54675/hello_%E7%E9.html".
Decided to load it.
URI encoding = ‘ISO-8859-15’
Enqueuing http://localhost:54675/hello_%E7%E9.html at depth 1
Queue count 1, maxcount 1.
[IRI Enqueuing ‘http://localhost:54675/hello_%E7%E9.html’ with ‘ISO-8859-15’
Dequeuing http://localhost:54675/hello_%E7%E9.html at depth 1
Queue count 0, maxcount 1.
--2014-10-02 16:37:16-- http://localhost:54675/hello_%E7%E9.html
Reusing existing connection to localhost:54675.
Reusing fd 4.
---request begin---
GET /hello_%E7%E9.html HTTP/1.1
...
---response begin---
HTTP/1.1 400 Bad Request
...
#### snap ####
The ISO-8859-15 URL should be percent-decoded, converted to UTF-8, and
percent-encoded again before it is put into the GET request line. It looks
like this has not been done correctly.
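As a rough sketch of the two conversions in play (Python, not the Wget implementation; both function names are hypothetical): the first follows the steps described above literally, the second reproduces the request line seen in the passing LC_ALL=C run, where the existing %E7 escape is left alone and only the raw byte \351 is converted.

```python
from urllib.parse import quote, unquote_to_bytes


def decode_convert_encode(raw: bytes, charset: str = "iso-8859-15") -> str:
    """Percent-decode, interpret the bytes in charset, re-encode as UTF-8."""
    text = unquote_to_bytes(raw).decode(charset)
    return quote(text, safe="/")  # quote() encodes as UTF-8 by default


def convert_raw_bytes_only(raw: bytes, charset: str = "iso-8859-15") -> str:
    """Keep existing %XX escapes; convert and escape raw non-ASCII bytes."""
    out = []
    for b in raw:
        if b < 0x80:
            out.append(chr(b))  # ASCII, including '%', passes through
        else:
            utf8 = bytes([b]).decode(charset).encode("utf-8")
            out.append("".join(f"%{c:02X}" for c in utf8))
    return "".join(out)


path = b"/hello_%E7\xe9.html"
print(decode_convert_encode(path))   # /hello_%C3%A7%C3%A9.html
print(convert_raw_bytes_only(path))  # /hello_%E7%C3%A9.html
```

The second result matches the GET line from the successful run; the failing run sent /hello_%E7%E9.html instead, meaning the raw byte was escaped without being converted to UTF-8 first.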
I won't be online much over the next three days, so maybe someone else could
have a look at the Wget sources?
Tim