From: Evgeny Kapun
Subject: [Bug-wget] [bug #48232] Sometimes wget restarts download from the beginning, even if the server supports resumed downloads
Date: Wed, 15 Jun 2016 15:06:07 +0000 (UTC)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
URL:
<http://savannah.gnu.org/bugs/?48232>
Summary: Sometimes wget restarts download from the beginning,
even if the server supports resumed downloads
Project: GNU Wget
Submitted by: abacabadabacaba
Submitted on: Wed 15 Jun 2016 06:06:05 PM MSK
Category: Program Logic
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Release: 1.18
Operating System: GNU/Linux
Reproducibility: Every Time
Fixed Release: None
Planned Release: None
Regression: None
Work Required: None
Patch Included: None
_______________________________________________________
Details:
If the connection is interrupted during a download, wget normally tries to
continue the download from where it left off. This only works if the server
supports resumed downloads; otherwise, the download restarts from the
beginning. However, wget sometimes restarts the download even when the server
does support resumption. Testing shows that this happens if a network error
occurs before wget receives the HTTP response to a retried request, a
situation which is quite common on poor networks.
For testing, I created a web server which would behave as follows:
* On the first request, it will send a response with `Content-Length: 1000`
and 500 bytes of data, then wait.
* On all other requests, it will just wait without sending any data.
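The behaviour in the list above can be sketched in Python as follows. This is
not the attached test program, only a minimal illustration; the names
`make_listener`, `serve`, `FIRST_RESPONSE` and the `max_conns` parameter are
assumptions made for this sketch.

```python
import socket

# Headers promise 1000 bytes, but only 500 bytes of body are ever sent.
FIRST_RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Length: 1000\r\n"
    b"\r\n"
    + b"x" * 500
)


def make_listener(host="::1", port=8888):
    """Create a listening TCP socket (IPv6 if the host contains a colon)."""
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    srv = socket.socket(family, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen()
    return srv


def read_request_head(conn):
    """Read from the connection until the blank line that ends the headers."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    return data


def serve(srv, max_conns=None):
    """Accept connections; answer only the first one, then stall on all."""
    held = []  # keep every socket open so the client sees a stall, not a close
    count = 0
    while max_conns is None or count < max_conns:
        conn, _ = srv.accept()
        held.append(conn)
        if count == 0:
            # First request: send the headers and half of the body, then wait.
            read_request_head(conn)
            conn.sendall(FIRST_RESPONSE)
        # All later requests: send nothing at all.
        count += 1
    return held
```

Calling `serve(make_listener())` listens on [::1]:8888 and runs until it is
killed. Because the first-request state lives inside `serve`, the process has
to be restarted before every test.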
Using wget to download from such a server produces this:
$ wget --debug --timeout 1 --tries 4 'http://[::1]:8888/test'
Setting --timeout (timeout) to 1
Setting --tries (tries) to 4
DEBUG output created by Wget 1.18 on linux-gnu.
Reading HSTS entries from $HOME/.wget-hsts
URI encoding = 'ANSI_X3.4-1968'
converted 'http://[::1]:8888/test' (ANSI_X3.4-1968) ->
'http://[::1]:8888/test' (UTF-8)
Converted file name 'test' (UTF-8) -> 'test' (ANSI_X3.4-1968)
--2016-06-15 17:42:36-- http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.
---request begin---
GET /test HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.0 200 OK
Content-Length: 1000
---response end---
200 OK
Registered socket 4 for persistent reuse.
Length: 1000
Saving to: 'test'
test 50%[=========> ] 500 --.-KB/s in 1.0s
Disabling further reuse of socket 4.
Closed fd 4
2016-06-15 17:42:37 (500 B/s) - Read error at byte 500/1000 (Connection timed
out). Retrying.
--2016-06-15 17:42:38-- (try: 2) http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.
---request begin---
GET /test HTTP/1.1
Range: bytes=500-
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response... Read error (Connection timed out) in
headers.
Closed fd 4
Retrying.
--2016-06-15 17:42:41-- (try: 3) http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.
---request begin---
GET /test HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response... Read error (Connection timed out) in
headers.
Closed fd 4
Retrying.
--2016-06-15 17:42:45-- (try: 4) http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.
---request begin---
GET /test HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response... Read error (Connection timed out) in
headers.
Closed fd 4
Giving up.
As you can see, only the second request includes the `Range` header. Starting
from the third request, the `Range` header is omitted, so the download would
not be resumed at that point. In practice, this means that a big download can
suddenly restart from the beginning because the network was down for some
time, which is undesirable.
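To make the expected behaviour concrete, here is a small Python sketch of the
retry policy this report argues for. It is not wget's code; the function names
and the list-of-byte-counts input are assumptions made purely for
illustration. The idea is that the resume offset depends only on how many
bytes are already saved, not on whether the previous attempt got as far as
reading a response.

```python
def range_header(bytes_saved):
    """Return the Range value for a resumed request, or None to start at 0."""
    return f"bytes={bytes_saved}-" if bytes_saved > 0 else None


def plan_range_headers(bytes_per_attempt):
    """Return the Range header each attempt should send.

    bytes_per_attempt[i] is the number of body bytes received during
    attempt i (0 if it failed before or while reading the headers).
    """
    saved = 0
    headers = []
    for received in bytes_per_attempt:
        headers.append(range_header(saved))
        saved += received  # a failed attempt adds nothing but loses nothing
    return headers


# The scenario from the log: 500 bytes on try 1, then three timeouts.
print(plan_range_headers([500, 0, 0, 0]))
```

Under this policy, tries 2, 3 and 4 would all send `Range: bytes=500-`,
whereas the log above shows wget 1.18 dropping the header from try 3 onwards.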
I attached a test program to reproduce the issue. It listens on [::1]:8888 and
acts as a web server. You need to restart it before every test.
Related bugs:
* #31653: a fix for that bug is probably what introduced this bug. Read the
discussion there.
* #48123: a bug similar to this one, but there it is not clear that the server
supports resumed downloads.
_______________________________________________________
File Attachments:
-------------------------------------------------------
Date: Wed 15 Jun 2016 06:06:05 PM MSK Name: wget-test Size: 597B By:
abacabadabacaba
<http://savannah.gnu.org/bugs/download.php?file_id=37488>
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?48232>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/