From: | P. Jansen |
Subject: | Re: [Bug-wget] Wget follows 301/302's to excluded domains |
Date: | Sat, 07 Sep 2013 13:13:21 +0200 |
User-agent: | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 |
On 9/7/2013 12:51 PM, Giuseppe Scrivano wrote:
> "P. Jansen" <address@hidden> writes:
>> Is this a bug, a feature or am I doing something wrong?
>
> that is correct, the redirect is followed even if the location domain is not specified with -D. Both --domains and --exclude-domains are checked with links that are generated by wget when recursively fetching a web page. In other words, while the first URL is retrieved as effect to a HTTP redirect, wget shouldn't follow any other link to the new domain, that would be a bug.
Hi Giuseppe,

If I interpret you correctly, you are saying this is a feature. I understand the HTTP protocol requires clients to follow redirects, but this behaviour creates unwanted directory structures: wget follows the redirect to the non-allowed domain, downloads the first document from it, and creates a directory structure for that domain.
I would suggest creating an option preventing redirects to excluded domains.

Kind regards,
Pieter
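
The check requested above could look roughly like the following. This is only a sketch in Python, not wget's code: the function names `host_matches` and `redirect_allowed` are hypothetical, and the matching rule (exact host or a suffix on a label boundary) is an assumption rather than a description of how wget compares domains.

```python
from urllib.parse import urljoin, urlsplit


def host_matches(host, domains):
    """Return True if host equals, or is a subdomain of, any listed domain.

    Assumption: matching is done on label boundaries, so "notexample.com"
    does not match "example.com".
    """
    host = host.lower().rstrip(".")
    for domain in domains:
        domain = domain.lower().strip(".")
        if host == domain or host.endswith("." + domain):
            return True
    return False


def redirect_allowed(base_url, location, accept_domains=(), exclude_domains=()):
    """Decide whether a redirect target passes the accept/exclude lists.

    base_url is the URL that returned the 301/302; location is the value
    of its Location header, which may be relative.
    """
    target = urljoin(base_url, location)  # resolve relative redirects
    host = urlsplit(target).hostname or ""
    if exclude_domains and host_matches(host, exclude_domains):
        return False
    if accept_domains and not host_matches(host, accept_domains):
        return False
    return True
```

With such a check, a redirect from `http://example.com/a` to `http://other.net/` would be refused when only `example.com` is accepted, so no directory would be created for `other.net`.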