wget-dev

Re: wget2 | html_url.c: fixed srcset data urls in html_get_url() (!483)


From: Tim Rühsen
Subject: Re: wget2 | html_url.c: fixed srcset data urls in html_get_url() (!483)
Date: Sun, 27 Dec 2020 14:20:21 +0000



Tim Rühsen started a new discussion on libwget/html_url.c: 
https://gitlab.com/gnuwget/wget2/-/merge_requests/483#note_474139735

>  
>       for (;len && c_isspace(*val); val++, len--); // skip leading spaces
>       for (p = val;len && !c_isspace(*val) && *val != ','; val++, len--); // find end of URL
> +
>       if (p != val) {
> +
> +             if (len && !wget_strncasecmp_ascii(p, "data:", 5)) { // ignore comma in Data URLs (see https://en.wikipedia.org/wiki/Data_URI_scheme)
> +                     for (val++, len--;len && !c_isspace(*val) && *val != ','; val++, len--); // find end of URL

What if `*val` is not `,`? The preceding loop also stops at whitespace, so it 
seems to me that there should be a check before changing `val` and `len`.
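
A minimal sketch of such a check, written as a standalone function so it can be compiled on its own. The names `srcset_url_len()` and `has_data_scheme()` are hypothetical and not part of libwget, and `isspace()`/`tolower()` stand in for `c_isspace()` and `wget_strncasecmp_ascii()`. The only point is that `val` and `len` are advanced solely when the first loop really stopped at a comma:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: ASCII case-insensitive test for a leading "data:". */
static int has_data_scheme(const char *p, size_t n)
{
	static const char scheme[] = "data:";

	if (n < 5)
		return 0;
	for (size_t i = 0; i < 5; i++)
		if (tolower((unsigned char) p[i]) != scheme[i])
			return 0;
	return 1;
}

/* Hypothetical helper: length of the URL token at the start of val[0..len). */
static size_t srcset_url_len(const char *val, size_t len)
{
	const char *p = val;

	/* find end of URL: whitespace or comma */
	for (; len && !isspace((unsigned char) *val) && *val != ','; val++, len--);

	/* step over the delimiter only if it is the comma inside a data: URL */
	if (len && *val == ',' && has_data_scheme(p, (size_t) (val - p))) {
		for (val++, len--; len && !isspace((unsigned char) *val) && *val != ','; val++, len--);
	}

	return (size_t) (val - p);
}
```

With this guard, an input such as `data:foo 1x` ends the URL at the space. Without it, the space would be skipped and the descriptor `1x` would be scanned as part of the URL.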

And that leads to the point where I would like to see unit tests for this code. 
Maybe start with a new file with the most obvious cases? Best in `unit-tests/`, 
and we should be able to use `wget_html_get_urls_inline()` with HTML input and 
check for the expected parsed URLs after the call.
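
A rough, table-driven sketch of what "the most obvious cases" could look like. To keep it compilable without libwget, `parse_srcset()` below is a hypothetical stand-in for the code under test, and it works on a bare `srcset` value rather than HTML. In an actual unit test the table would hold HTML snippets and the stand-in would be replaced by a call to `wget_html_get_urls_inline()`. The struct and function names are assumptions for illustration only.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_URLS 4

/* Stand-in for the code under test (NOT libwget): collects the URLs of a srcset value.
 * Note: the original patch compares the scheme case-insensitively; strncmp() keeps this short. */
static size_t parse_srcset(const char *val, const char *urls[], size_t lens[])
{
	size_t len = strlen(val), n = 0;

	while (len && n < MAX_URLS) {
		const char *p;

		for (; len && isspace((unsigned char) *val); val++, len--); /* skip leading spaces */
		for (p = val; len && !isspace((unsigned char) *val) && *val != ','; val++, len--); /* find end of URL */

		if (p != val) {
			if (len && *val == ',' && val - p >= 5 && !strncmp(p, "data:", 5)) {
				/* the comma belongs to the data: URL, keep scanning */
				for (val++, len--; len && !isspace((unsigned char) *val) && *val != ','; val++, len--);
			}
			urls[n] = p;
			lens[n++] = (size_t) (val - p);
		}

		for (; len && *val != ','; val++, len--); /* skip descriptor up to the next comma */
		if (len) { val++; len--; } /* skip the comma itself */
	}

	return n;
}

struct srcset_case {
	const char *input;
	size_t n_expected;
	const char *expected[MAX_URLS];
};

static const struct srcset_case cases[] = {
	{ "a.png 1x, b.png 2x", 2, { "a.png", "b.png" } },
	{ "a.png,b.png", 2, { "a.png", "b.png" } },
	{ "data:image/png;base64,AAAA 1x, b.png 2x", 2, { "data:image/png;base64,AAAA", "b.png" } },
	{ "data:foo 1x, b.png", 2, { "data:foo", "b.png" } }, /* data: URL without a comma */
};

/* Runs all cases and returns the number of failures. */
static int run_srcset_tests(void)
{
	int failures = 0;

	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		const char *urls[MAX_URLS];
		size_t lens[MAX_URLS];
		size_t n = parse_srcset(cases[i].input, urls, lens);
		int ok = (n == cases[i].n_expected);

		for (size_t k = 0; ok && k < n; k++)
			ok = lens[k] == strlen(cases[i].expected[k])
				&& !memcmp(urls[k], cases[i].expected[k], lens[k]);

		if (!ok) {
			fprintf(stderr, "FAILED: %s\n", cases[i].input);
			failures++;
		}
	}

	return failures;
}
```

The last case is the one that exercises the missing check: a `data:` URL followed directly by whitespace instead of a comma.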
