wget-dev
Re: wget | wget incorrectly checks filename length when mirroring files.


From: Darshit Shah (@darnir)
Subject: Re: wget | wget incorrectly checks filename length when mirroring files. (#6)
Date: Tue, 19 Oct 2021 16:37:14 +0000



Darshit Shah commented:


Thanks for the report!

From a cursory glance it seems like you're absolutely right. So, if you would 
provide a patch, I'll merge it in asap. Else, I'll likely do it myself 
sometime around the weekend.

<details><summary>...</summary>

On Tue, Oct 19, 2021, at 17:33, Paul Ferrell (@pflarr) wrote:
> Paul Ferrell <https://gitlab.com/pflarr> created an issue: #6 
> <https://gitlab.com/gnuwget/wget/-/issues/6> 
>
> From url.c
>
> ```
> /* Calculate the length of the output string.  e-b is the input
>    string length.  Each quoted char introduces two additional
>    characters in the string, hence 2*quoted.  */
> outlen = (e - b) + (2 * quoted);
> # ifdef WINDOWS
>   max_length = MAX_PATH;
> # else
>   max_length = get_max_length(dest->base, dest->tail, _PC_NAME_MAX);
> # endif
>   max_length -= CHOMP_BUFFER;
>   if (max_length > 0 && outlen > max_length)
>     {
>       logprintf (LOG_NOTQUIET, "The destination name is too long (%d), reducing to %d\n",
>                  outlen, max_length);
>
>       outlen = max_length;
>     }
> ```
>
> When mirroring (and possibly in other situations) the output path is a 
> relative path, not a single file name. `get_max_length` uses `pathconf` 
> to get the max length of what can be placed at the given location. 
> Unfortunately, there are two distinct limits that need to be checked, 
> not one.
>
>  1. The length of the overall relative path, which can be checked with 
> `_PC_PATH_MAX`.
>  2. The length of each component of the path, which can be checked with 
> `_PC_NAME_MAX`.
> By checking the whole relative path against just `_PC_NAME_MAX` you're 
> limiting the entire relative path to the length limit for a single 
> component of that path on the given system. For example, on a typical 
> x86_64 Ubuntu box with an XFS filesystem, `_PC_NAME_MAX` is 255 
> bytes, but `_PC_PATH_MAX` is 4096 bytes.
>
> I ran into this when recursively mirroring a site with some 
> particularly long filenames in a deep tree. Mirroring the same tree 
> with wget2 doesn't seem to have any issues.
>
>

</details>
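
For reference, a minimal sketch of the two separate checks described in the report. This is not wget code and not a proposed patch: `path_fits_limits` and `path_fits_at` are hypothetical names, and only `pathconf`, `_PC_NAME_MAX` and `_PC_PATH_MAX` come from POSIX. It assumes '/' is the only path separator.

```c
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from url.c): returns true when every
   '/'-separated component of PATH fits in NAME_MAX_LEN bytes and the
   whole string fits in PATH_MAX_LEN bytes.  A limit <= 0 is treated as
   "unknown or unlimited" and that check is skipped.  */
static bool
path_fits_limits (const char *path, long name_max_len, long path_max_len)
{
  size_t total = strlen (path);

  /* POSIX PATH_MAX counts the terminating null byte, so the string
     itself must be strictly shorter than the limit.  */
  if (path_max_len > 0 && total >= (size_t) path_max_len)
    return false;

  const char *p = path;
  while (*p)
    {
      /* Length of the current component, up to the next '/' or the end.
         NAME_MAX does not count the terminating null byte.  */
      size_t comp = strcspn (p, "/");
      if (name_max_len > 0 && comp > (size_t) name_max_len)
        return false;
      p += comp;
      if (*p == '/')
        p++;
    }
  return true;
}

/* Hypothetical wrapper: query both limits for directory DIR.  pathconf
   returns -1 on error or when no limit applies; either way the
   corresponding check is skipped here.  */
static bool
path_fits_at (const char *dir, const char *relpath)
{
  long name_max = pathconf (dir, _PC_NAME_MAX);
  long path_max = pathconf (dir, _PC_PATH_MAX);
  return path_fits_limits (relpath, name_max, path_max);
}
```

With limits of 255 and 4096, a relative path made of two 200-byte components passes this check, while the current single comparison against `_PC_NAME_MAX` would shorten it.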

-- 
Reply to this email directly or view it on GitLab: 
https://gitlab.com/gnuwget/wget/-/issues/6#note_708004256
You're receiving this email because of your account on gitlab.com.



