Re: wget | wget incorrectly checks filename length when mirroring files.
From: Darshit Shah (@darnir)
Subject: Re: wget | wget incorrectly checks filename length when mirroring files. (#6)
Date: Tue, 19 Oct 2021 16:37:14 +0000
Darshit Shah commented:
Thanks for the report!
From a cursory glance it seems like you're absolutely right. So, if you would
provide a patch, I'll merge it in asap. Else, I'll likely do it myself
sometime around the weekend.
<details><summary>...</summary>
On Tue, Oct 19, 2021, at 17:33, Paul Ferrell (@pflarr) wrote:
> Paul Ferrell <https://gitlab.com/pflarr> created an issue: #6
> <https://gitlab.com/gnuwget/wget/-/issues/6>
>
> From url.c
>
> `/* Calculate the length of the output string. e-b is the input
> string length. Each quoted char introduces two additional
> characters in the string, hence 2*quoted. */
> outlen = (e - b) + (2 * quoted);
> # ifdef WINDOWS
> max_length = MAX_PATH;
> # else
> max_length = get_max_length(dest->base, dest->tail, _PC_NAME_MAX);
> # endif
> max_length -= CHOMP_BUFFER;
> if (max_length > 0 && outlen > max_length)
> {
> logprintf (LOG_NOTQUIET, "The destination name is too long (%d),
> reducing to %d\n", outlen, max_length);
>
> outlen = max_length;
> }`
> When mirroring (and possibly in other situations) the output path is a
> relative path, not a single file name. `get_max_length` uses `pathconf`
> to get the max length of what can be placed at the given location.
> Unfortunately, there are two distinct limits that need to be checked,
> not one.
>
> 1. The length of the overall relative path, which can be checked with
> `_PC_PATH_MAX`.
> 2. The length of each component of the path, which can be checked with
> `_PC_NAME_MAX`.
> By checking the whole relative path against just `_PC_NAME_MAX` you're
> limiting the entire relative path to the length limit for a single
> component of that path on the given system. For example, on a typical
> x86_64 Ubuntu box with an XFS filesystem, _PC_NAME_MAX is about 256
> bytes, but _PC_PATH_MAX is 4096 bytes.
>
> I ran into this when recursively mirroring a site with some
> particularly long filenames in a deep tree. Mirroring the same tree
> with wget2 doesn't seem to have any issues.
>
</details>
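The two limits described in the quoted report can be checked separately. The following is only a minimal sketch of that idea, not the patch that went into wget: `path_fits_limits` and `path_fits_in_dir` are hypothetical helper names, the sketch is POSIX-only, and it assumes a `pathconf` result of -1 can be treated as "no limit".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of wget): return 1 if rel_path respects
   both limits, 0 otherwise.  A limit <= 0 is treated as "no limit",
   which is an assumption made for this sketch.  */
int
path_fits_limits (const char *rel_path, long name_max, long path_max)
{
  assert (rel_path != NULL);

  /* Limit 1: the whole relative path.  POSIX counts the terminating
     null byte in PATH_MAX, hence the + 1.  */
  size_t total = strlen (rel_path);
  if (path_max > 0 && total + 1 > (size_t) path_max)
    return 0;

  /* Limit 2: each '/'-separated component.  NAME_MAX does not count
     the terminating null byte.  */
  const char *p = rel_path;
  while (*p)
    {
      size_t len = strcspn (p, "/");   /* length of this component */
      if (name_max > 0 && len > (size_t) name_max)
        return 0;
      p += len;
      while (*p == '/')                /* skip separator(s) */
        p++;
    }
  return 1;
}

/* Hypothetical wrapper: query both limits for the directory the
   relative path will be created under, then apply the check above.  */
int
path_fits_in_dir (const char *dir, const char *rel_path)
{
  long name_max = pathconf (dir, _PC_NAME_MAX);
  long path_max = pathconf (dir, _PC_PATH_MAX);
  return path_fits_limits (rel_path, name_max, path_max);
}
```

With limits of 255 and 4096, a 300-byte relative path whose components are each 255 bytes or shorter passes this check, whereas comparing the whole path against `_PC_NAME_MAX` alone would truncate it.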
--
Reply to this email directly or view it on GitLab:
https://gitlab.com/gnuwget/wget/-/issues/6#note_708004256
You're receiving this email because of your account on gitlab.com.