qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Auditing QEMU to replace NULL with &error_abort


From: Markus Armbruster
Subject: Re: Auditing QEMU to replace NULL with &error_abort
Date: Wed, 23 Jun 2021 14:16:55 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

John Snow <jsnow@redhat.com> writes:

> One of our Bite-Sized tasks on the wiki was to audit QEMU and, where
> applicable, replace NULL with &error_abort.

Context: NULL argument means "do not pass back an error object".

This has two uses:

1. When the function permits detecting failure in another way (typically
   by returning distinct error value(s)), and the caller is not
   interested in additional detail an error object provides.

2. When the caller ignores errors.

In both cases, the caller can pass NULL to save itself the trouble of
receiving and freeing a (useless) error object.

Our most common *actual* use, however, is

3. When the call site assumes errors can't happen (and breaks when they
do).

This is wrong.  The caller should pass &error_abort instead.

> Everywhere else where it is intentional, we ought to add a comment or
> some other indication explaining why it's the right thing to do in
> that case.
>
> That task was ported to GitLab here:
> https://gitlab.com/qemu-project/qemu/-/issues/414
>
> mreitz and thuth have chimed in with excellent clarifications. Phil
> suggests that we should replace the intentional cases of NULL with 
> &error_ignore, to possibly log squelched errors in debugging
> mode. This sounds like a great idea to me:
>
> - It allows us to remove NULL entirely, which as mreitz states "is
>   fishy", but sometimes valid.

It's perfectly valid when all you'd do with an error object was to throw
it away.

Whether you do that by passing NULL or some other special value doesn't
matter all that much.  NULL is what GError uses.  It has turned out to
be prone to misuse, but I suspect any other value would be just as
prone.

> - It annotates callsites where we have decided the ignore is
>   intentional and correct.

A fresh special value like &error_ignore can serve as "we really mean to
ignore the error object here" signal, but only until people start
misusing it the same way NULL is being misused.

> - It gives us a review opportunity to require good comments at those
>   callsites.

Requiring a use of &error_ignore to come with a comment may be a
workable way to ensure &error_ignore remains a reliable "we mean it".
It may also tempt people to add more NULL arguments, because that's
easier.

Nothing stops us from add good comments to calls passing NULL.

> - It gives us a good way to measure progress of the audit by making
>   the removal of NULL a concrete goal.

Yes, this is far easier to measure than "has a good comment".

>                                        (Can we use coccinelle to find
>  all instances of the literal NULL being passed to a variable named
> errp?)

Maybe.

> From a brief chat on IRC, Markus is "reluctant to deviate from GError
> even more". It sounds like there isn't consensus here. We should 
> probably reach consensus on this point before trying to pass the task
> off to a neophyte, though -- so I'm raising this discussion on the
> list and CC'ing Markus to see if we can define the task better or not.

QEMU's Error preceded the adoption of GLib.  It is "loosely patterned
after Glib's GError" (quoting error.h's big comment).

We've discussed replacing Error by GError a couple of times.
Non-trivial due to differences between the two, in particular
&error_abort, &error_fatal, and error_propagate(()'s "multiple
propagations are fine, first one wins" vs. g_propagate_error()'s
"propagate at most once".

The latter difference seems rather pointless now, but removing it safely
would be a substantial amount of work[1].

&error_abort has been a clear win for us.  &error_fatal too, when used
judiciously.  Marc-André tried to get both into GLib, unsuccessfully[2].

Nevertheless, I'm reluctant to add more differeneces.  That's not a
"no".  It's a "you need to make a compelling case".

> --js
>
>
> (Personally, I've got no horse in the race beyond moving these tasks
> off the wiki and onto the tracker. Since I moved the issue, though, I
> might as well make sure the filing is accurate.)


[1] Converting old code to ERRP_GUARD() may reduce the use of
error_propagate() to a point where removing becomes easy.

[2] https://gitlab.gnome.org/GNOME/glib/-/issues/2288




reply via email to

[Prev in Thread] Current Thread [Next in Thread]