bug-wget

only the final 206 partial content response included in WARC file


From: Stephen
Subject: only the final 206 partial content response included in WARC file
Date: Sun, 30 Apr 2023 08:32:09 -0700
User-agent: Cyrus-JMAP/3.9.0-alpha0-374-g72c94f7a42-fm-20230417.001-g72c94f7a

Hi,

I believe I have found a bug. While I was downloading a large file with Wget,
the connection failed several times. Wget retried with Range requests until it
had downloaded the entire file. In the resulting WARC file, all of the requests
are present, but only the final 206 partial response was saved.

I observed this behavior with Wget/1.21.3. Arguments were: "-O" "/dev/null" 
"--warc-file" "<redacted>" "--warc-cdx" "--warc-max-size=1G" "--input-file" 
"<redacted>"

This is pretty unfortunate, since it means that the beginning of the file, up
to the start of the final range, was discarded by Wget and is missing from the
WARC file. Please let me know if you'd like me to supply any additional
information.
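
In case it helps anyone check their own WARC files for the same problem, here is a minimal sketch in Python. It assumes you have already pulled the Content-Range header values out of the 206 response records by some other means; it does not parse WARC files itself. The function names are hypothetical helpers and not part of Wget or any WARC library. It also assumes every response for the file is a 206 with a "bytes START-END/TOTAL" header, so an initial 200 response (which carries no Content-Range) would have to be accounted for separately.

```python
import re

# Hypothetical helpers for illustration; not part of Wget or any WARC library.
_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")


def parse_content_range(value):
    """Parse 'bytes START-END/TOTAL' into (start, end, total); end is inclusive."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    start, end, total = (int(group) for group in match.groups())
    return start, end, total


def missing_ranges(content_ranges):
    """Return the inclusive (start, end) byte gaps not covered by the headers.

    Assumes all headers describe the same resource, so the total length is
    taken from the first one. An empty input returns an empty list because
    the total length is unknown.
    """
    parsed = [parse_content_range(value) for value in content_ranges]
    if not parsed:
        return []
    total = parsed[0][2]
    gaps = []
    next_needed = 0  # first byte offset not yet covered
    for start, end, _ in sorted(parsed):
        if start > next_needed:
            gaps.append((next_needed, start - 1))
        next_needed = max(next_needed, end + 1)
    if next_needed < total:
        gaps.append((next_needed, total - 1))
    return gaps
```

With made-up values, a WARC file holding only a final response of "bytes 700-999/1000" would report bytes 0 through 699 as missing, which matches the situation described above. The byte numbers are illustrative only and are not taken from my actual download.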


