[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: A bug of Monitor Chardev ?
From: |
Longpeng (Mike, Cloud Infrastructure Service Product Dept.) |
Subject: |
Re: A bug of Monitor Chardev ? |
Date: |
Tue, 25 May 2021 14:53:28 +0800 |
User-agent: |
Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 |
Hi Marc,
在 2021/5/22 0:59, Marc-André Lureau 写道:
> Hi
>
> On Fri, May 21, 2021 at 8:56 PM Daniel P. Berrangé <berrange@redhat.com
> <mailto:berrange@redhat.com>> wrote:
>
> On Fri, May 21, 2021 at 05:33:46PM +0100, Daniel P. Berrangé wrote:
> > On Fri, May 21, 2021 at 10:43:36AM -0400, Peter Xu wrote:
> > >
> > > I think the original problem was that if qemu_chr_fe_set_handlers() is
> called
> > > in main thread, it can start to race somehow within execution of the
> function
> > > qemu_chr_fe_set_handlers() right after we switch context at:
> > >
> > > qemu_chr_be_update_read_handlers(s, context);
> > >
> > > Then the rest code in qemu_chr_fe_set_handlers() will continue to run
> in
> main
> > > thread for sure, but the should be running with the new iothread
> context, which
> > > introduce a race condition.
> > >
> > > Running qemu_chr_be_update_read_handlers() in BH resolves that because
> then all
> > > things run in the monitor iothread only and natually serialized.
> >
> > The first message in this thread, however, claims that it is *not*
> > in fact serialized, when using the BH.
> >
> > > So the new comment looks indeed not fully right, as the chr device
> should be
> > > indeed within main thread context before qemu_chr_fe_set_handlers(),
> it's just
> > > that the race may start right away if without BH when context switch
> happens
> > > for the chr.
> >
> > It sounds like both the comment and the code are potentially wrong.
>
>
> I feel like our root cause problem that the original code was trying to
> workaround, is that the chardev is "active" from the very moment it is
> created, regardless of whether the frontend is ready to use it.
>
> IIUC, in this case the socket chardev is already listen()ing and
> accept()ing incoming clients off the network, before the monitor
> has finished configuring its hooks into the chardev. This means
> that the initial listen()/accept() I/O watches are using the
> default GMainContext, and the monitor *has* to remove them and
> put in new watches on the thread private GMainContext.
>
> To eliminate any risk of races, we need to make it possible for the
> monitor to configure the GMainContext on the chardevs *before* any
> I/O watches are configured.
>
> This in turn suggests that we need to split the chardev initialization
> into two phases. First we have the basic chardev creation, with object
> creation, option parsing/sanity checking, socket creation, and then
> second we have the actual activation where the I/O watches are added.
>
> IOW, qemu_chr_new() is the former and gets run from generic code in
> the main() method, or in QMP chardev_add. A new 'qemu_chr_activate'
> method would be called by whatever frontend is using the chardev,
> after registering a custom GMainContext.
>
> This would involve updating every single existing user of chardevs
> to add a call to qemu_chr_activate, but that's worth it to eliminate
> the race by design, rather than workaround it.
>
>
>
> What about my earlier suggestion to add a new "qemu_chr_be_disable_handlers()"
> (until update_read_handlers is called again to enable them and the set a
> different context)?
>
In this case, the BH calls the update_read_handlers, so the new added
"qemu_chr_be_disable_handlers" will be called in the monitor iothread BH ? If
so, I'm not sure whether it is safe enough, because the Chardev may still be
accessed in parallel by main loop and iothread for a while.
How about call "qemu_chr_be_disable_handlers" before set the
monitor_qmp_setup_handlers_bh ?
I think Daniel's soluation is perfect, but it's beyond my ability, I'm not
expert in Chardev/QMP, it's difficult to guarantee no other bugs will be
introduced, so we prefer to take the simplest and safest way to fix the bug in
our production.
>
> --
> Marc-André Lureau
- A bug of Monitor Chardev ?, Longpeng (Mike, Cloud Infrastructure Service Product Dept.), 2021/05/17
- Re: A bug of Monitor Chardev ?, Marc-André Lureau, 2021/05/19
- Re: A bug of Monitor Chardev ?, Daniel P . Berrangé, 2021/05/19
- Re: A bug of Monitor Chardev ?, Markus Armbruster, 2021/05/21
- Re: A bug of Monitor Chardev ?, Peter Xu, 2021/05/21
- Re: A bug of Monitor Chardev ?, Daniel P . Berrangé, 2021/05/21
- Re: A bug of Monitor Chardev ?, Daniel P . Berrangé, 2021/05/21
- Re: A bug of Monitor Chardev ?, Marc-André Lureau, 2021/05/21
- Re: A bug of Monitor Chardev ?, Daniel P . Berrangé, 2021/05/21
- Re: A bug of Monitor Chardev ?, Peter Xu, 2021/05/21
- Re: A bug of Monitor Chardev ?,
Longpeng (Mike, Cloud Infrastructure Service Product Dept.) <=
- Re: A bug of Monitor Chardev ?, Peter Xu, 2021/05/21
- Re: A bug of Monitor Chardev ?, Daniel P . Berrangé, 2021/05/21