wget-dev



From: Victor Mmr (@victormmr)
Subject: wget | Slashes Repeated Eternally while Generating Adresses during Recursive Site Download (#8)
Date: Mon, 27 Dec 2021 09:10:46 +0000


Victor Mmr created an issue: https://gitlab.com/gnuwget/wget/-/issues/8



Dear wget developer team!

An issue was detected while trying to fetch the http://book.itep.ru website 
with the wget utility.

When the site download was launched with the command

`    wget -r -l 0 -X /depository -p -k 'http://book.itep.ru/1/intro1.htm'`

download links such as

```
    http://book.itep.ru/.....
    http://book.itep.ru//.....
    http://book.itep.ru///.....
    http://book.itep.ru////.....
    http://book.itep.ru/////.....
```

were produced by wget.

Thus, the slashes after the site's domain name were duplicated. Once the fetch 
cycle with '/' was over, a cycle with '//' addresses started. When that was over, 
a new cycle with '///' links started, and so on. As a result, the download 
never finished. The URLs with '/', '//', ... '/////' evidently referred to the 
same addresses, but this endless generation of duplicated slashes kept the 
fetch from completing. I don't know why these repeated slashes appeared, but two 
other site download utilities, "Teleport VLX" and "HTTrack Website Copier", 
succeeded in fetching this site.

The issue was reproduced on both MS Windows and Debian Linux.
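
To illustrate why the '/', '//', '///' variants described above could be treated as one address, here is a minimal Python sketch of how a crawler might collapse repeated slashes in the URL path before deciding whether a link has already been visited. This is not how wget works internally. The helper names `collapse_slashes` and `dedupe` are hypothetical, and the sketch assumes the server serves the same page for all slash variants, which URLs in general do not guarantee.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url: str) -> str:
    """Collapse runs of '/' in the path component to a single '/'.

    Hypothetical helper: only the path is touched, so the '//' after
    the scheme and any slashes in the query string are left alone.
    """
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(parts._replace(path=path))


def dedupe(urls):
    """Return the normalized URLs in first-seen order, without repeats."""
    seen = set()
    unique = []
    for url in urls:
        key = collapse_slashes(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    variants = [
        "http://book.itep.ru/1/intro1.htm",
        "http://book.itep.ru//1/intro1.htm",
        "http://book.itep.ru///1/intro1.htm",
    ]
    for url in dedupe(variants):
        print(url)
```

With a check like this in the visited-link test, the three variants in the usage example would count as a single page, so a second download cycle over the '//' addresses would not start.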




