wget-dev
Re: wget2 | html_url.c: fixed srcset data urls in html_get_url() (!483)


From: @rockdaboot
Subject: Re: wget2 | html_url.c: fixed srcset data urls in html_get_url() (!483)
Date: Sun, 30 May 2021 17:09:53 +0000



Tim Rühsen commented on a discussion on libwget/html_url.c: 
https://gitlab.com/gnuwget/wget2/-/merge_requests/483#note_588477562

>
>                                       for (;len && c_isspace(*val); val++, len--); // skip leading spaces
>                                       for (p = val;len && !c_isspace(*val) && *val != ','; val++, len--); // find end of URL
> +
>                                       if (p != val) {
> +
> +                                             if (len && !wget_strncasecmp_ascii(p, "data:", 5)) { // ignore comma in Data URLs (see https://en.wikipedia.org/wiki/Data_URI_scheme)
> +                                                     for (val++, len--;len && !c_isspace(*val) && *val != ','; val++, len--); // find end of URL

Have you ever seen a data: URL (or rather, URI) within a srcset attribute?
It looks like any srcset parser must be aware of the data: URI scheme, because
data: URIs use a comma as a delimiter and srcset uses a comma as a delimiter
too. That seems like a really bad choice. There could be other URI schemes
that use commas as well, and we would not be aware of them. It looks like a
design flaw.
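
To make the ambiguity concrete, here is a minimal sketch of a scheme-agnostic
way to split srcset candidates: treat the URL as the run of non-whitespace
characters and only count trailing commas as separators. As far as I
understand it, this is roughly what the WHATWG srcset parsing algorithm does,
but treat that as an assumption. next_srcset_url() is a hypothetical helper
for illustration, not part of libwget and not the code in this merge request,
and it ignores details such as parentheses inside descriptors.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not libwget API): find the next candidate URL in a
 * NUL-terminated srcset value. On success, *url and *url_len describe the URL
 * and the return value points to where scanning should continue.
 * Returns NULL when no further URL is present. */
const char *next_srcset_url(const char *s, const char **url, size_t *url_len)
{
	assert(s && url && url_len);

	/* skip whitespace and stray commas between candidates */
	while (*s && (isspace((unsigned char) *s) || *s == ','))
		s++;
	if (!*s)
		return NULL;

	/* the URL is the run of non-whitespace characters, commas included */
	const char *start = s;
	while (*s && !isspace((unsigned char) *s))
		s++;
	size_t len = (size_t) (s - start);

	/* trailing commas separate candidates, they are not part of the URL */
	int ended_by_comma = 0;
	while (len && start[len - 1] == ',') {
		len--;
		ended_by_comma = 1;
	}

	/* otherwise skip the descriptor (e.g. "2x" or "480w") up to the next comma */
	if (!ended_by_comma) {
		while (*s && *s != ',')
			s++;
	}

	*url = start;
	*url_len = len;
	return s;
}
```

Called repeatedly on "data:image/gif;base64,R0lGOD 1x, b.png 2x", this yields
the whole data: URI first and "b.png" second, without knowing anything about
the data: scheme. The trade-off is that "a.png,b.png" (no whitespace after the
comma) comes back as a single URL, whereas the loop quoted above would split
it at the comma.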
