From: Thomas Huth
Subject: Re: [Qemu-devel] [PATCH v3] s390x/tod: Properly stop the KVM TOD while the guest is not running
Date: Tue, 4 Dec 2018 09:54:36 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 2018-11-30 10:49, David Hildenbrand wrote:
> Just like on other architectures, we should stop the clock while the guest
> is not running. This is already properly done for TCG. Right now, doing an
> offline migration (stop, migrate, cont) can easily trigger stalls in the
> guest.
>
> Even doing a
>   (hmp) stop
>   ... wait 2 minutes ...
>   (hmp) cont
> will already trigger stalls.
>
> So whenever the guest stops, back up the KVM TOD. When continuing to run
> the guest, restore the KVM TOD.
>
> One special case is starting a simple VM: Reading the TOD from KVM to
> stop it right away until the guest is actually started means that the
> time of any simple VM will already differ from the host time. We can
> simply leave the TOD running and the guest won't be able to recognize
> it.
>
> For migration, we actually want to keep the TOD stopped until really
> starting the guest. To be able to catch most errors, we should however
> try to set the TOD in addition to simply storing it. So we can still
> catch basic migration problems.
>
> If anything goes wrong while backing up/restoring the TOD, we have to
> ignore it (but print a warning). This is then basically a fallback to
> old behavior (TOD remains running).
>
> I tested this very basically with an initrd:
> 1. Start a simple VM. Observed that the TOD is kept running. Old
>    behavior.
> 2. Ordinary live migration. Observed that the TOD is temporarily
>    stopped on the destination when setting the new value and
>    correctly started when finally starting the guest.
> 3. Offline migration (stop, migrate, cont). Observed that the
>    TOD will be stopped on the source with the "stop" command. On the
>    destination, the TOD is temporarily stopped when setting the new
>    value and correctly started when finally starting the guest via
>    "cont".
> 4. Simple stop/cont correctly stops/starts the TOD. (Multiple stops
>    or conts in a row have no effect, so this works as expected.)
>
> In the future, we might want to send the guest a special kind of time sync
> interrupt under some conditions, so it can synchronize its TOD to the
> host TOD. This is interesting for migration scenarios but also when we
> get time sync interrupts ourselves. This however will most probably have
> to be handled in KVM (e.g. when the TODs differ too much) and is not
> desired e.g. when debugging the guest (single stepping should not
> result in permanent time syncs). I consider something like that an add-on
> on top of this basic "don't break the guest" handling.
>
> Signed-off-by: David Hildenbrand <address@hidden>
> ---
>
> v2 -> v3:
> - use device_class_set_parent_realize() to implement a child realize
> function
>
> hw/s390x/tod-kvm.c | 102 ++++++++++++++++++++++++++++++++++++++++-
> include/hw/s390x/tod.h | 8 +++-
> 2 files changed, 107 insertions(+), 3 deletions(-)
LGTM now.
Reviewed-by: Thomas Huth <address@hidden>
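
For readers following the thread without the diff at hand, the stop/cont handling described in the quoted commit message can be sketched as a small standalone C simulation. This is not the patch itself and uses no QEMU or KVM interfaces: FakeKvmClock, TodState, fake_get_tod(), fake_set_tod(), guest_tod() and tod_vm_state_change() are all hypothetical names made up for illustration, and the clock is modelled as a plain host counter plus an offset. It only tries to show the three properties the message lists: back up on stop, restore on cont, and fall back to a running TOD (with a warning) when an access fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the in-kernel clock; not a QEMU or KVM API. */
typedef struct {
    uint64_t host_ticks;   /* free-running host time */
    uint64_t guest_offset; /* guest TOD = host_ticks + guest_offset (wraps) */
    bool fail_access;      /* simulate a failing get/set */
} FakeKvmClock;

/* Hypothetical per-device state tracking whether the TOD is frozen. */
typedef struct {
    FakeKvmClock *clk;
    bool stopped;          /* true while the TOD is backed up and frozen */
    uint64_t saved_tod;    /* value captured when the guest stopped */
} TodState;

static int fake_get_tod(const FakeKvmClock *c, uint64_t *out)
{
    if (c->fail_access) {
        return -1;
    }
    *out = c->host_ticks + c->guest_offset;
    return 0;
}

static int fake_set_tod(FakeKvmClock *c, uint64_t val)
{
    if (c->fail_access) {
        return -1;
    }
    /* Unsigned wraparound is well defined, so a "negative" offset is fine. */
    c->guest_offset = val - c->host_ticks;
    return 0;
}

/* What the guest would observe as its TOD in this model. */
uint64_t guest_tod(const TodState *s)
{
    assert(s != NULL && s->clk != NULL);
    return s->stopped ? s->saved_tod
                      : s->clk->host_ticks + s->clk->guest_offset;
}

/* Called on every run-state change; repeated stops or conts are no-ops. */
void tod_vm_state_change(TodState *s, bool running)
{
    assert(s != NULL && s->clk != NULL);
    if (running && s->stopped) {
        /* cont: restore the saved value, which restarts the clock. */
        if (fake_set_tod(s->clk, s->saved_tod) != 0) {
            fprintf(stderr, "warning: could not restore TOD\n");
        }
        /* On error, behave as if the TOD had been running all along. */
        s->stopped = false;
    } else if (!running && !s->stopped) {
        /* stop: back up the current value, which freezes the clock. */
        if (fake_get_tod(s->clk, &s->saved_tod) != 0) {
            fprintf(stderr, "warning: could not back up TOD, "
                            "leaving it running\n");
        } else {
            s->stopped = true;
        }
    }
}
```

In this model, calling tod_vm_state_change(&s, false), advancing host_ticks by the equivalent of two minutes, and then calling tod_vm_state_change(&s, true) leaves guest_tod() where it was at the stop, which is the case that triggers stalls when the TOD is left running.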