wget-dev

Re: wget | wget should save directory listings as index.html (#11)


From: @rockdaboot
Subject: Re: wget | wget should save directory listings as index.html (#11)
Date: Wed, 25 May 2022 12:11:19 +0000



Tim Rühsen commented:


Yeah, this is an old issue. It stems from the fact that a web site does not 
necessarily resemble a directory structure.

So there is an ambiguity here. For example, `http://example.com/directory` and 
`http://example.com/directory/` often return two different pages. The latter is 
stored as `example.com/directory/index.html` by wget. Under what file names 
should wget then store `http://example.com/directory` and 
`http://example.com/directory/index.html`?
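
To make the clash concrete, here is a minimal sketch of the URL-to-file-name mapping described above. It is a simplified model, not wget's actual code; the function name `local_path` and the `default_page` parameter are made up for illustration.

```python
from urllib.parse import urlsplit


def local_path(url: str, default_page: str = "index.html") -> str:
    """Simplified model: map a URL to a relative local file name.

    Assumption for illustration only: a path ending in '/' gets the
    default page name appended, any other path is used as-is.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += default_page
    return parts.netloc + path


no_slash = local_path("http://example.com/directory")
with_slash = local_path("http://example.com/directory/")
explicit = local_path("http://example.com/directory/index.html")

# 'directory' would have to be a regular file for the first URL
# and a directory for the other two.
print(no_slash)    # example.com/directory
print(with_slash)  # example.com/directory/index.html
print(explicit)    # example.com/directory/index.html
```

In this model, the second and third URLs map to the same file name, and the first one needs `directory` to be a file while the others need it to be a directory.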

What wget2 does is this: if `directory` already exists as a file and 
`directory/x` is about to be stored, `directory` is renamed to `directory.1` 
so that `directory/x` can be stored.

Why not rename `directory` to `directory/index.html`? Because 
`directory/index.html` could be downloaded as well, with different content. 
Depending on which one wget downloads first, the page would end up 
(randomly?) as either `directory/index.html` or `directory/index.html.1`.
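
The rename behaviour described above could be sketched roughly as follows. This is an illustration under assumptions, not wget2's implementation: the helper names `make_room_for` and `store` are hypothetical, and choosing the first free `.N` suffix is an assumption that goes beyond the `.1` case mentioned here.

```python
import os


def make_room_for(path: str) -> None:
    """Ensure every parent of `path` is a directory.

    If a regular file is in the way, rename it to '<name>.N' using the
    first free N (assumption: the text above only mentions '.1').
    """
    parent = os.path.dirname(path)
    if not parent or parent == path:
        return  # reached the top of the path
    make_room_for(parent)  # fix up ancestors first
    if os.path.isfile(parent):
        n = 1
        while os.path.exists(f"{parent}.{n}"):
            n += 1
        os.rename(parent, f"{parent}.{n}")
    os.makedirs(parent, exist_ok=True)


def store(path: str, data: bytes) -> None:
    """Write `data` to `path`, moving blocking files out of the way."""
    make_room_for(path)
    with open(path, "wb") as f:
        f.write(data)
```

With this sketch, storing `example.com/directory` first and `example.com/directory/index.html` afterwards leaves the first page in `example.com/directory.1` and the second in `example.com/directory/index.html`. The sketch does not cover the reverse download order, where `directory` already exists as a directory.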

Would wget2's approach work for you?

-- 
Reply to this email directly or view it on GitLab: 
https://gitlab.com/gnuwget/wget/-/issues/11#note_959625178
You're receiving this email because of your account on gitlab.com.



