qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] monitor: increase amount of data for monitor to


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH] monitor: increase amount of data for monitor to read
Date: Fri, 05 Mar 2021 17:13:14 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> 05.03.2021 17:10, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>> 
>>> 24.11.2020 18:44, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hi all!
>>>>
>>>> This patch isn't outdated, it applies on master with a little conflict 
>>>> atomic_mb_read -> qatomic_mb_read
>>>>
>>>> We have new series "[PATCH v2 0/2] Increase amount of data for monitor to 
>>>> read", but I do think that we've started from wrong point. And actually we 
>>>> should start from this old patch.
>>>>
>>>> 10.07.2019 19:36, Dr. David Alan Gilbert wrote:
>>>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>>>> Did this fall through the cracks?
>>>>>>
>>>>>> Denis Plotnikov <dplotnikov@virtuozzo.com> writes:
>>>>>>
>>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, 
>>>>>>> which
>>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>
>>>>>> Yes, reading one byte at a time is awful.  But QMP is control plane; I
>>>>>> didn't expect it to impact system performance.  How are you using QMP?
>>>>>> Just curious, not actually opposed to improving QMP efficiency.
>>>>>>
>>>>>>> This patch changes the amount of data to read to 4096 bytes, which 
>>>>>>> matches
>>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>>>>
>>>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>>>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>>>>>>> ---
>>>>>>>    include/monitor/monitor.h | 2 +-
>>>>>>>    monitor.c                 | 2 +-
>>>>>>>    2 files changed, 2 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
>>>>>>> index c1b40a9cac..afa1ed34a4 100644
>>>>>>> --- a/include/monitor/monitor.h
>>>>>>> +++ b/include/monitor/monitor.h
>>>>>>> @@ -14,7 +14,7 @@ extern __thread Monitor *cur_mon;
>>>>>>>    #define MONITOR_USE_CONTROL   0x04
>>>>>>>    #define MONITOR_USE_PRETTY    0x08
>>>>>>> -#define QMP_REQ_QUEUE_LEN_MAX 8
>>>>>>> +#define QMP_REQ_QUEUE_LEN_MAX 4096
>>>>>>
>>>>>> This looks suspicious.  It's a request count, not a byte count.  Can you
>>>>>> explain what led you to change it this way?
>>>>
>>>> I can explain, because that was my idea:
>>>>
>>>> It's a hack to not overflow the queue. With just the second hunk we 
>>>> overflow it and assertion fails.
>> 
>> I can't see offhand how that happens.  Got a reproducer for me?
>
> Yes:
>
>
> [root@kvm master]# echo '{'execute': 'qmp_capabilities'}{'execute': 
> 'query-block'}{'execute': 'query-block'}{'execute': 'query-block'}{'execute': 
> 'query-block'}{'execute': 'query-block'}{'execute': 'query-block'}{'execute': 
> 'query-block'}{'execute': 'query-block'}{'execute': 'query-block'}{'execute': 
> 'query-block'}{'execute': 'query-block'}{'execute': 'query-block'}' | 
> ./build/qemu-system-x86_64 -nodefaults -nographic -S -qmp stdio
> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 5}, 
> "package": "v5.2.0-2359-g0b9242f339-dirty"}, "capabilities": ["oob"]}}
> qemu-system-x86_64: ../monitor/qmp.c:400: handle_qmp_command: Assertion 
> `mon->qmp_requests->length < QMP_REQ_QUEUE_LEN_MAX' failed.
> Aborted (core dumped)
>
>
> [root@kvm master]# git diff
>
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index e94f532cf5..41679241b3 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -562,7 +562,7 @@ int monitor_can_read(void *opaque)
>   {
>       Monitor *mon = opaque;
>   
> -    return !qatomic_mb_read(&mon->suspend_cnt);
> +    return !qatomic_mb_read(&mon->suspend_cnt) ? 4096 : 0;
>   }
>   
>   void monitor_list_append(Monitor *mon)

Hmmm...

We always parse the everything we read.

We suspend the monitor when the request queue is full, and resume it has
space.

When we read one character at a time, this suffices: when input is ready
while the monitor is not suspended, we read one character, feed it to
the parser, which produces at most one request.  This request surely
fits into the queue, because the queue is non-full (or else the monitor
would be suspended).

When we read a sane buffer, suspending the monitor doesn't suffice.  We
need to suspend parsing, too.  Feels feasible.

[...]
>>>>>>>    bool monitor_cur_is_qmp(void);
>>>>>>> diff --git a/monitor.c b/monitor.c
>>>>>>> index 4807bbe811..a08e020b61 100644
>>>>>>> --- a/monitor.c
>>>>>>> +++ b/monitor.c
>>>>>>> @@ -4097,7 +4097,7 @@ static int monitor_can_read(void *opaque)
>>>>>>>    {
>>>>>>>        Monitor *mon = opaque;
>>>>>>> -    return !atomic_mb_read(&mon->suspend_cnt);
>>>>>>> +    return !atomic_mb_read(&mon->suspend_cnt) ? 4096 : 0;
>>>>>>>    }
>>>>>>>    /*
>>>>>>
>>>>>> The ramifications are not obvious to me.  I think I need to (re-)examine
>>>>>> how QMP reads input, with special consideration to its OOB feature.
>>>>>
>>>>> Yeh that was the bit that worried me; I also wondered what happens with
>>>>> monitor_suspend and things like fd passing; enough to make it
>>>>> non-obvious to me.
>>>>>
>>>>
>>>> OK, I don't have answers..
>>>>
>>>> Markus, David, could you please help? Or, what to do? We of course don't 
>>>> have enough expertise to prove that patch will not break any feature in 
>>>> the whole monitor subsystem.
>>>>
>>>> I can say that it just works, and we live with it for years (and our 
>>>> customers too), and tests pass. Still, I don't think that we use OOB 
>>>> feature. I remember some problems with it, when RHEL version which we were 
>>>> based on included some OOB feature patches, but not some appropriate 
>>>> fixes.. But it was long ago.
>>>>
>>>> If nobody can say that patch is wrong, maybe, we can just apply it? We'll 
>>>> have the whole release cycle to catch any bugs and revert the commit at 
>>>> any time. In this way we only have a question about QMP_REQ_QUEUE_LEN_MAX, 
>>>> which is quite simple.
>>>>
>>>
>>>
>>> ping here, as a replacement for "[PATCH v3 0/5] Increase amount of data for 
>>> monitor to read"
>>>
>>> If no better idea, I'll make what I propose above (with two limits) at some 
>>> good moment :)
>> 
>> Digest what I wrote above, then tell me how you'd like to proceed.
>> 
>
> Ohh. Honestly, now I don't have time to dig into monitor code and prove that 
> everything is good. It's OK for me to postpone it for some future time.

I don't require *proof*.  I do need a bit more help than "this patch
works for us".

Weak correctness arguments can sometimes be offset by good tests.

One obvious testing gap is receiving many commands in a single batch.
The existing tests don't cover that, because they send one command after
the other.  Evicence: the crash discussed above.

How much effort you can contribute is of course entirely up to you.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]