[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Bug-wget] Detected bugs by bugprobe
From: |
Tim Ruehsen |
Subject: |
Re: [Bug-wget] Detected bugs by bugprobe |
Date: |
Mon, 23 Sep 2013 09:30:47 +0200 |
User-agent: |
KMail/4.10.5 (Linux/3.10-3-amd64; KDE/4.10.5; x86_64; ; ) |
On Friday 20 September 2013 21:32:23 Darshit Shah wrote:
> Hi Yan,
>
> Thanks for the analysis.
>
> The followings are some bugs found by our static analysis tool bugprobe in
>
> > wget-1.14.
> > In file recur.c
> > At line 281, function url_parse may return NULL assigned to url_parsed.
> > Function retrieve_url use url_parsed without null check.
> >
> >
> > All program points that call function url_parse also exist this kind of
> > bugs.
> >
> > I think this is already handled, since I tried making the code fail
>
> through a few test cases, but couldn't. Even if url_parse returns a NULL
> value, wget exits gracefully with a meaningful error message.
>
> I think I remember seeing some code that handled this issue, but I'm not
> sure exactly where. Probably, the issue isn't handled correctly, and there
> may be an edge case. Will have to look into the code thoroughly to check it
> out.
Even if Wget doesn't crash right now, there is a pitfall for contributors.
And in that, Yan Zhang's analyzer is right.
Just have a look at recur.c lines 278-281 (current git master):
struct url *url_parsed = url_parse (url, &url_err, i, true);
status = retrieve_url (url_parsed, url, &file, &redirected, referer,
&dt, false, i, true);
url_parse() returns NULL on every problem it finds. This is by definition, not
by accident.
BUT retrieve_url() does not check url_parsed for NULL and is likely to crash
if that ever happens...
How should a static analyzer know, that Wget performs some checks somewhere
else to make sure this situation does not happen ? How should a contributor
oversee this situation ?
It is a matter of code quality to check the return value of url_parse() before
it gets worked on / passed to a another function, and be it just to make the
situation clearer. And btw, it helps avoiding analyzers to produce false
positives.
Another point is that the above code obviously relies on url_parse() having
been called earlier (uh, was that grammar correct ?)... so why not putting
that result into a hashtable to avoid calling url_parse() two times ?
Tim