bug-wget

[bug #61649] Wget not honouring Content-Encoding: gzip


From: anonymous
Subject: [bug #61649] Wget not honouring Content-Encoding: gzip
Date: Thu, 9 Dec 2021 08:43:44 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0

URL:
  <https://savannah.gnu.org/bugs/?61649>

                 Summary: Wget not honouring Content-Encoding: gzip
                 Project: GNU Wget
            Submitted by: None
            Submitted on: Thu 09 Dec 2021 01:43:42 PM UTC
                Category: Protocol Issue
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: Jim
        Originator Email: promptweb@gmail.com
             Open/Closed: Open
                 Release: 1.20
         Discussion Lock: Any
        Operating System: GNU/Linux
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: None

    _______________________________________________________

Details:

Some web servers deliver HTTP responses with Content-Encoding: gzip
whether or not the client requested it.

This is notably observable with Amazon CloudFront.

Amazon's behaviour is likely tolerated because all popular web browsers
honour the Content-Encoding field of the response header at all times,
regardless of what they requested.

To test the behaviour:
wget --server-response \
  http://d3n8a8pro7vhmx.cloudfront.net/assets/liquid/v3/main-5c3aced637d54a6ea9c73f20ce4bdc9bdc1991e1a33ab6d8d3f489a734ae06f5.js

In the response headers, note:
Content-Encoding: gzip

The saved file name will be
main-5c3aced637d54a6ea9c73f20ce4bdc9bdc1991e1a33ab6d8d3f489a734ae06f5.js
Note: no .gz suffix.

cat main-5c3aced637d54a6ea9c73f20ce4bdc9bdc1991e1a33ab6d8d3f489a734ae06f5.js
will return binary gzipped data

zcat main-5c3aced637d54a6ea9c73f20ce4bdc9bdc1991e1a33ab6d8d3f489a734ae06f5.js
will return the plaintext javascript.
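
The cat/zcat observation above can also be checked programmatically. The
following is a minimal Python sketch using only the standard library. The
function name looks_gzipped and the sample payload are hypothetical stand-ins
for the saved file and are not part of wget.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzipped(data: bytes) -> bool:
    """Return True if data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# Hypothetical stand-in for the JavaScript served by the CDN.
plain = b"console.log('hello');\n"
# What ends up on disk when the gzip body is saved without decoding.
saved = gzip.compress(plain)

print(looks_gzipped(saved))             # the saved file is still compressed
print(looks_gzipped(plain))             # the expected plaintext is not
print(gzip.decompress(saved) == plain)  # decompressing recovers the script
```

To inspect a real download, read the saved file in binary mode and pass its
contents to looks_gzipped.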

If you run:
wget --server-response --compression=auto \
  http://d3n8a8pro7vhmx.cloudfront.net/assets/liquid/v3/main-5c3aced637d54a6ea9c73f20ce4bdc9bdc1991e1a33ab6d8d3f489a734ae06f5.js
the output will be as expected: a plaintext js file.

The above behaviour is the same irrespective of the location of the
CloudFront server and of whether http or https is used.

Suggested solution: the wget --compression parameter should only set the
Accept-Encoding field of the request header. Irrespective of that field,
a Content-Encoding: gzip response header should always be honoured and the
data transparently decompressed.
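
The suggested behaviour can be illustrated with a short Python sketch. This
is not wget's code (wget is written in C). The function decode_body and the
sample headers are hypothetical and only show the decision being driven by
the response header alone, with no reference to what the request asked for.

```python
import gzip


def decode_body(response_headers: dict, body: bytes) -> bytes:
    """Decode body according to the response's Content-Encoding only.

    The request's Accept-Encoding is deliberately not consulted.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    encoding = headers.get("content-encoding", "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    # Other encodings are passed through untouched in this sketch.
    return body


script = b"var x = 1;\n"

# Server sent gzip even though the client never asked for it.
decoded = decode_body({"Content-Encoding": "gzip"}, gzip.compress(script))

# Server sent an unencoded body: returned unchanged.
untouched = decode_body({"Content-Type": "text/javascript"}, script)

print(decoded == script, untouched == script)
```

A fuller implementation would also need to handle other codings and
multiple codings listed in a single header; those cases are left out here.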




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?61649>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



