Re: wget2 | Add FTP & FTPS support (#3)
From: @rockdaboot
Subject: Re: wget2 | Add FTP & FTPS support (#3)
Date: Sun, 04 Jul 2021 18:07:47 +0000
Tim Rühsen commented on a discussion:
https://gitlab.com/gnuwget/wget2/-/issues/3#note_618221207
If I understand you correctly, you think that FTP in wget2 will be N times
faster than FTP in wget.
Given that the bottleneck is likely the network throughput, wget2 won't be
faster than wget. While wget2 may download N files in parallel, each of them
will be transferred at a speed of bandwidth/N. The wall-clock time for this
will be the same as transferring N files sequentially at full bandwidth.
The underlying assumptions here are that
1. the FTP server is not the bottleneck (if it is, parallel downloads can even
be slower than serial ones)
2. network bandwidth *or* disk write bandwidth is the bottleneck (not CPU)
3. the files are reasonably large, as is often seen in science (so RTTs from
FTP protocol communication are negligible)
4. the FTP server allows N parallel connections from the same IP
So there are only a few situations where Wget2 could improve the download time:
1. many small files to be downloaded
2. the list of FTP URLs contains more than one domain (the list could be split
by domain and several instances of wget could be started in parallel)
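To put rough numbers on the argument above, here is a minimal Python sketch of the two timing models. The function names, the `overhead` parameter (standing in for the per-file FTP round trips), and the sample figures are illustrative assumptions, not measurements of wget or wget2. It also assumes all four conditions listed earlier hold.

```python
def sequential_time(n_files, file_bytes, bandwidth, overhead):
    """One file at a time: each file pays its own protocol overhead,
    then is transferred at the full link bandwidth (bytes/second)."""
    return n_files * (overhead + file_bytes / bandwidth)


def parallel_time(n_files, file_bytes, bandwidth, overhead):
    """All N files at once: the per-file overheads overlap,
    but each transfer only gets bandwidth / N."""
    return overhead + file_bytes / (bandwidth / n_files)


def speedup(n_files, file_bytes, bandwidth, overhead):
    """How many times faster the parallel model is than the sequential one."""
    return (sequential_time(n_files, file_bytes, bandwidth, overhead)
            / parallel_time(n_files, file_bytes, bandwidth, overhead))


# Hypothetical figures: 10 files, 100 MB/s link, 0.2 s of round trips per file.
large_files = speedup(10, 1_000_000_000, 100_000_000, 0.2)  # 1 GB each
small_files = speedup(10, 10_000, 100_000_000, 0.2)         # 10 kB each
print(f"large files: {large_files:.2f}x, small files: {small_files:.2f}x")
```

In this model the speedup stays close to 1 when transfer time dominates and approaches N only when the per-file overhead dominates, which is the "many small files" case.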
Wget2 has several improvements that speed up file transfers. I'd say the
combination of HTTP/2 and compression is the biggest win over wget (often the
compression is gzip, but some servers support brotli or zstd, which are much
better than gzip in terms of compression ratio and decompression CPU usage).
Neither is available for FTP.
Slightly OT, but please consider any download via FTP (or any other non-secure
protocol) as tainted, even when downloading within your faculty. Since no one
ever checks the file integrity (this is tedious manual work), everybody should
use a secure channel for downloading. Or the other way round: if you upload
your data via FTP, how do you make sure the server received the correct data?
(Data-internal checksums do not help against malicious intent.)
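For anyone who does want to check integrity after an FTP transfer, a minimal Python sketch follows. It assumes the expected SHA-256 digest was obtained over a separate, trusted channel (for example an HTTPS page); a digest fetched over the same FTP connection would not help against tampering. The function names are made up for illustration.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with one obtained via a trusted channel."""
    return hmac.compare_digest(sha256_of_file(path),
                               expected_hex.strip().lower())
```

This only moves the trust problem to the channel that delivers the digest, which is why a secure transport for the download itself remains the simpler option.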
Back to the topic... I previously proposed an extra tool like `wget2-ftp` to
keep the maintenance burden lower and scalable. The downside seems to affect
users who download websites recursively and also want all the referenced FTP
sites downloaded in one go.
Another option is to write a plugin for the FTP protocol - it keeps the code
separate, and the maintenance for libwget/wget2 would not increase (much). And
the FTP code could have its own maintainer (scalable maintenance).
I'd happily help out any volunteer.