[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v5 10/12] python/machine.py: split shutdown into hard and sof
From: |
Philippe Mathieu-Daudé |
Subject: |
Re: [PATCH v5 10/12] python/machine.py: split shutdown into hard and soft flavors |
Date: |
Tue, 14 Jul 2020 21:10:09 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0 |
On 7/14/20 8:13 PM, John Snow wrote:
>
>
> On 7/14/20 12:13 AM, Cleber Rosa wrote:
>> On Fri, Jul 10, 2020 at 01:06:47AM -0400, John Snow wrote:
>>> This is done primarily to avoid the 'bare except' pattern, which
>>> suppresses all exceptions during shutdown and can obscure errors.
>>>
>>> Replace this with a pattern that isolates the different kind of shutdown
>>> paradigms (_hard_shutdown and _soft_shutdown), and a new fallback shutdown
>>> handler (_do_shutdown) that gracefully attempts one before the other.
>>>
>>> This split now also ensures that no matter what happens,
>>> _post_shutdown() is always invoked.
>>>
>>> shutdown() changes in behavior such that if it attempts to do a graceful
>>> shutdown and is unable to, it will now always raise an exception to
>>> indicate this. This can be avoided by the test writer in three ways:
>>>
>>> 1. If the VM is expected to have already exited or is in the process of
>>> exiting, wait() can be used instead of shutdown() to clean up resources
>>> instead. This helps avoid race conditions in shutdown.
>>>
>>> 2. If a test writer is expecting graceful shutdown to fail, shutdown
>>> should be called in a try...except block.
>>>
>>> 3. If the test writer has no interest in performing a graceful shutdown
>>> at all, kill() can be used instead.
>>>
>>>
>>> Handling shutdown in this way makes it much more explicit which type of
>>> shutdown we want and allows the library to report problems with this
>>> process.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>> python/qemu/machine.py | 95 +++++++++++++++++++++++++++++++++++-------
>>> 1 file changed, 80 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
>>> index aaa173f046..b24ce8a268 100644
>>> --- a/python/qemu/machine.py
>>> +++ b/python/qemu/machine.py
>>> @@ -48,6 +48,12 @@ class QEMUMachineAddDeviceError(QEMUMachineError):
>>> """
>>>
>>>
>>> +class AbnormalShutdown(QEMUMachineError):
>>> + """
>>> + Exception raised when a graceful shutdown was requested, but not
>>> performed.
>>> + """
>>> +
>>> +
>>> class MonitorResponseError(qmp.QMPError):
>>> """
>>> Represents erroneous QMP monitor reply
>>> @@ -365,6 +371,7 @@ def _early_cleanup(self) -> None:
>>> """
>>> Perform any cleanup that needs to happen before the VM exits.
>>>
>>> + May be invoked by both soft and hard shutdown in failover
>>> scenarios.
>>> Called additionally by _post_shutdown for comprehensive cleanup.
>>> """
>>> # If we keep the console socket open, we may deadlock waiting
>>> @@ -374,32 +381,90 @@ def _early_cleanup(self) -> None:
>>> self._console_socket.close()
>>> self._console_socket = None
>>>
>>> + def _hard_shutdown(self) -> None:
>>> + """
>>> + Perform early cleanup, kill the VM, and wait for it to terminate.
>>> +
>>> + :raise subprocess.Timeout: When timeout is exceeds 60 seconds
>>> + waiting for the QEMU process to terminate.
>>> + """
>>> + self._early_cleanup()
>>
>> Like I commented on patch 5, I don't think the *current* type of
>> cleanup done is needed on a scenario like this...
>>
>>> + self._popen.kill()
>>
>> ... as I don't remember QEMU's SIGKILL handler to be susceptible to
>> the race condition that motivated the closing of the console file in
>> the first place. But, I also can not prove it's not susceptible at
>> this time.
>>
>
> It probably isn't. It was for consistency in when and where that hook is
> called, again. It does happen to be "pointless" here.
>
>> Note: I have some old patches that added tests for QEMUMachine itself.
>> I intend to respin them on top of your work, so we may have a clearer
>> understanding of the QEMU behaviors we need to handle. So, feel free
>> to take the prudent route here, and keep the early cleanup.
>>
>
> Oh, adding formal tests to this folder would be incredible; especially
> if we wanted to package it on PyPI. Basically a necessity.
>
>> Reviewed-by: Cleber Rosa <crosa@redhat.com>
>> Tested-by: Cleber Rosa <crosa@redhat.com>
>>
>
Queuing with this hunk from jsnow:
-- >8 --
@@ -455,6 +444,9 @@ def shutdown(self, has_quit: bool = False,
Terminate the VM (gracefully if possible) and perform cleanup.
Cleanup will always be performed.
+ If the VM has not yet been launched, or shutdown(), wait(), or
kill()
+ have already been called, this method does nothing.
+
:param has_quit: When true, do not attempt to issue 'quit' QMP
command.
:param hard: When true, do not attempt graceful shutdown, and
suppress the SIGKILL warning log message.
---
[PATCH v5 12/12] python/machine.py: change default wait timeout to 3 seconds, John Snow, 2020/07/10
[PATCH v5 11/12] python/machine.py: re-add sigkill warning suppression, John Snow, 2020/07/10
Re: [PATCH v5 00/12] python/machine.py: refactor shutdown, Philippe Mathieu-Daudé, 2020/07/14