bug-wget

[bug #60494] Percent character in filename gets escaped twice


From: Petr Pisar
Subject: [bug #60494] Percent character in filename gets escaped twice
Date: Sun, 16 May 2021 11:59:57 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0

Follow-up Comment #6, bug #60494 (project wget):

You cannot pose a question like that because a random string is ambiguous by
its nature.

According to the specification
<https://datatracker.ietf.org/doc/html/rfc3986#section-2.4> there is no such
thing as an unescaped URI. A URI is always escaped by definition.

Look at the original report: You have a file name
"qtwebengine-everywhere-src-5.15.2-%231904652.patch.gz". It's a file name, not
a URI. If you construct a URL for the file name using the base URL
"https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/",
then you need to escape the file name string and then append it
after the path delimiter of the base URL. I.e. you convert the file name to
"qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz" and then append it
to the base, resulting in the URL
"https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz".
This URL is passed to the wget command. Thus wget should not escape it
again. It could validate it and report an error, but it should not escape it.
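
The construction described above can be sketched in Python. This is only an
illustration under stated assumptions, not wget's code: the helper name
build_url is hypothetical, and urllib.parse.quote from the Python standard
library stands in for the escaping step.

```python
from urllib.parse import quote, unquote

BASE_URL = "https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/"
FILE_NAME = "qtwebengine-everywhere-src-5.15.2-%231904652.patch.gz"


def build_url(base_url: str, file_name: str) -> str:
    """Hypothetical helper: escape a raw file name once, then append it to the base URL."""
    # safe="" makes quote() escape "/" as well, so the whole file name
    # becomes a single path segment; "%" is always escaped as "%25".
    return base_url + quote(file_name, safe="")


url = build_url(BASE_URL, FILE_NAME)
print(url)

# Decoding the last path segment exactly once gives back the file name.
last_segment = url.rsplit("/", 1)[1]
assert unquote(last_segment) == FILE_NAME

# Escaping the finished URL a second time names a different file.
assert unquote(quote(last_segment, safe="")) != FILE_NAME
```

The escaping happens exactly once, while the URL is being produced from its
parts. Any consumer that escapes the result again changes which file the URL
refers to.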

I will quote the specification here:

   Under normal circumstances, the only time when octets within a URI
   are percent-encoded is during the process of producing the URI from
   its component parts.  This is when an implementation determines which
   of the reserved characters are to be used as subcomponent delimiters
   and which can be safely used as data.  Once produced, a URI is always
   in its percent-encoded form.

Please, pay attention to the last sentence.

Of course wget could state that its argument is a byte stream without any
other constraints. But the wget(1) manual says something different; it states
that the argument is a URL:

SYNOPSIS
       wget [option]... [URL]...

Hence wget should not attempt any escaping.
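
To illustrate the "validate and report an error, but do not escape"
alternative, here is a minimal Python sketch. The function name
check_percent_encoding is hypothetical, it is not part of wget, and it only
checks that every "%" is followed by two hexadecimal digits; it is not a full
RFC 3986 validator.

```python
import re

# A "%" that is NOT followed by two hexadecimal digits is malformed.
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def check_percent_encoding(url: str) -> str:
    """Hypothetical validator: return the URL unchanged or raise ValueError."""
    match = _BAD_PERCENT.search(url)
    if match:
        raise ValueError(
            f"malformed percent-encoding at offset {match.start()}: {url!r}"
        )
    # The input is returned as-is; nothing is escaped or rewritten.
    return url
```

A URL such as the one built from the reported file name passes through
unchanged, while a string like "patches/100%.gz" is rejected instead of being
silently rewritten.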

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60494>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



