wget-dev

Re: wget2 | Add FTP & FTPS support (#3)


From: Victor Mmr
Subject: Re: wget2 | Add FTP & FTPS support (#3)
Date: Sun, 21 Jun 2020 08:37:07 +0000



Victor Mmr commented:


Dear Wget2 developer team!

Please return FTP protocol support to the Wget program. Although FTP is one of
the oldest Internet protocols ever created, it still remains one of the most
important network protocols today. FTP has some serious drawbacks owing to its
age (it is a pity they were never fixed in subsequent versions of the
protocol), but it is very convenient for transferring files and for organizing
simple file-sharing services. HTTP cannot replace it, since HTTP lacks the
filetree navigation commands FTP has, such as cd, ls, and pwd. HTTP was created
for quite a different purpose - to make websites work, not to expose large file
storages kept as directories on a server's hard drive while preserving the
interface of a filetree. Even today FTP is widely used by many old and new
sites, lots of valuable material is stored on FTP servers, and there is still
no worthy substitute that could replace it.
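For instance, the directory-oriented nature of FTP is exactly what made
recursive mirroring with classic wget so natural. A typical invocation looked
roughly like this (the host and path below are hypothetical placeholders, not
taken from any real site):

```shell
# Mirror a directory subtree over FTP with classic wget (wget1).
#   -r    recurse into subdirectories, following the server's filetree
#   -np   "no parent": do not ascend above the starting directory
# ftp.example.org and /pub/project/ are placeholders for illustration.
wget -r -np ftp://ftp.example.org/pub/project/
```

Because the server exposes a real directory tree, wget can walk it with plain
LIST/CWD commands - no HTML parsing or link extraction is needed at all.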

Another important reason to preserve FTP support in Wget2 is the following.
Wget is not intended just for fetching individual files: it is widely used to
clone websites, partly or entirely, by means of the -r (--recursive) option.
Old wget could cope with this task only in a limited way, because it did not
understand JavaScript and could not parse JavaScript code to extract addresses
from it. Thus old wget was suitable only for copying simple, mainly static
sites with no JavaScript and no interactive pages. As far as I have learned
from your project, the new Wget2 has a JavaScript parser implemented as a
plug-in, so Wget2 can clone modern websites, much as HTTrack does, when the -r
option is given. However, websites often link to files (documents in PDF and
PostScript, archives containing sources, and so on) that live on FTP sites. It
is common practice for a website to have an FTP companion whose IP address and
domain name are the same as the website's and which runs on the same server.
In such cases the introductory information usually sits on the web server,
while the documents and substantial material are placed on the companion FTP
server, so the HTTP site and the FTP server form an organic whole that cannot
be separated. While wget supported FTP, it was possible to download such
hybrid HTTP-FTP sites (or HTTP sites with links to FTP content) recursively by
turning on the -H spanning-hosts option. But now that FTP support has been
removed from the new-generation Wget2, this is no longer possible.
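With wget1, fetching such a hybrid site in one run looked roughly like this
(the domain names are hypothetical placeholders used only to illustrate the
spanning-hosts technique):

```shell
# Recursively clone a website together with its FTP companion (wget1).
#   -r / --recursive    follow links recursively
#   -H / --span-hosts   allow recursion to cross to other hosts...
#   -D / --domains      ...but only to the listed domains
# example.org and ftp.example.org are placeholders for illustration.
wget -r -H -D example.org,ftp.example.org http://example.org/
```

The -D whitelist keeps -H from wandering off to unrelated hosts, so the crawl
covers exactly the web server and its FTP companion. Without FTP support in
the client, the ftp:// links discovered during recursion simply cannot be
followed, whatever options are given.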

I think many users look at Wget2 as a future command-line replacement for
well-known offline browsers such as "HTTrack Website Copier" or, in the
Windows world, "Teleport Pro/VLX". Teleport Pro is old and outdated (its last
upgrade was long ago), and it is a commercial program that runs only under
Windows. HTTrack is a nice program, but it is unstable, has many bugs, and it
seems its creator, a French programmer, has abandoned its development. Wget1,
on the other hand, is a reliable, neat program with a rich but clear and
convenient syntax. Its advantage is that it does exactly what you ask it to
do, no more and no less. Because wget1 could not understand JavaScript and
could not fetch sites in multiple streams (using several connections), it
could not serve as a full-featured replacement for offline browsers and
modern GUI download managers. Once it gains all these features (a JavaScript
parser as a plug-in, multi-connection downloads, a multithreaded internal
architecture), and if they are implemented completely and to a high standard,
it can replace those offline browsers. It seems that, when introducing these
features, you kept in mind the possibility of using the program as an offline
browser and a full-featured download manager. But by killing FTP support in
the new-generation version of the program you are severely cutting its scope,
its possibilities, and its universality; without FTP it will lose a great
deal.

By the way, the curl utility and libcurl support FTP along with many other
network protocols (some of them are not really necessary, but they are present
in the bundle!). But as far as I know, curl does not allow fetching sites
recursively (it has nothing similar to wget's -r option). HTTrack supports
the FTP protocol too, but unfortunately it is buggy and has some foolish
limitations, like a 250 kbit/s traffic restriction. Wget2 aims to become a
replacement for many utilities of this sort, but dropping FTP support will
cripple it. HTTP and FTP content are so closely tied together that it is
impossible to make a good recursive file downloader by splitting support for
the two protocols between two command-line clients. If the support is not
integrated in one client but distributed between two utilities, it simply
will not work.

I think a multipurpose file downloader and site cloner has to support the
various versions of the HTTP protocol, new and obsolete, both unencrypted and
secured by TLS or SSL; and on the other hand it has to support old FTP, both
without any encryption and protected by TLS or SSL. A good program of this
kind should also support the modern protocols derived from or similar to FTP,
such as SFTP and FTPS.

Please consider my thoughts. I think removing the FTP protocol from wget
would be a great loss for the Wget program.

Respectfully yours,
Victor.

-- 
Reply to this email directly or view it on GitLab: 
https://gitlab.com/gnuwget/wget2/-/issues/3#note_365125535
You're receiving this email because of your account on gitlab.com.



