[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH V1 2/3] migration: fix suspended runstate
From: |
Steven Sistare |
Subject: |
Re: [PATCH V1 2/3] migration: fix suspended runstate |
Date: |
Wed, 21 Jun 2023 15:15:42 -0400 |
User-agent: |
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 |
On 6/20/2023 5:46 PM, Peter Xu wrote:
> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
>> Migration of a guest in the suspended state is broken. The incoming
>> migration code automatically tries to wake the guest, which IMO is
>> wrong -- the guest should end migration in the same state it started.
>> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
>> bypasses vm_start(). The guest appears to be in the running state, but
>> it is not.
>>
>> To fix, leave the guest in the suspended state, but call
>> qemu_system_start_on_wakeup_request() so the guest is properly resumed
>> later, when the client sends a system_wakeup command.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> migration/migration.c | 11 ++++-------
>> softmmu/runstate.c | 1 +
>> 2 files changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 17b4b47..851fe6d 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
>> vm_start();
>> } else {
>> runstate_set(global_state_get_runstate());
>> + if (runstate_check(RUN_STATE_SUSPENDED)) {
>> + /* Force vm_start to be called later. */
>> + qemu_system_start_on_wakeup_request();
>> + }
>
> Is this really needed, along with patch 1?
>
> I have a very limited knowledge on suspension, so I'm prone to making
> mistakes..
>
> But from what I read this, qemu_system_wakeup_request() (existing one, not
> after patch 1 applied) will setup wakeup_reason and kick the main thread
> using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in
> the main thread later on after qemu_wakeup_requested() returns true.
Correct, here:
if (qemu_wakeup_requested()) {
pause_all_vcpus();
qemu_system_wakeup();
notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
wakeup_reason = QEMU_WAKEUP_REASON_NONE;
resume_all_vcpus();
qapi_event_send_wakeup();
}
However, that is not sufficient, because vm_start() was never called on the
incoming
side. vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among other
things.
Without my fixes, it "works" because the outgoing migration automatically wakes
a suspended
guest, which sets the state to running, which is saved in global state:
void migration_completion(MigrationState *s)
qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
global_state_store()
Then the incoming migration calls vm_start here:
migration/migration.c
if (!global_state_received() ||
global_state_get_runstate() == RUN_STATE_RUNNING) {
if (autostart) {
vm_start();
vm_start must be called for correctness.
- Steve
>> }
>> /*
>> * This must happen after any state changes since as soon as an external
>> @@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
>> qemu_mutex_lock_iothread();
>> trace_postcopy_start_set_run();
>>
>> - qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>> global_state_store();
>> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>> if (ret < 0) {
>> @@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
>> if (s->state == MIGRATION_STATUS_ACTIVE) {
>> qemu_mutex_lock_iothread();
>> s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>> - qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>>
>> s->vm_old_state = runstate_get();
>> global_state_store();
>> @@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
>>
>> qemu_mutex_lock_iothread();
>>
>> - /*
>> - * If VM is currently in suspended state, then, to make a valid runstate
>> - * transition in vm_stop_force_state() we need to wakeup it up.
>> - */
>> - qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>
> Removal of these three places seems reasonable to me, or we won't persist
> the SUSPEND state.
>
> Above comment was the major reason I used to have thought it was needed
> (again, based on zero knowledge around this..), but perhaps it was just
> wrong? I would assume vm_stop_force_state() will still just work with
> suepended, am I right?
>
>> s->vm_old_state = runstate_get();
>>
>> global_state_store();
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index e127b21..771896c 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -159,6 +159,7 @@ static const RunStateTransition
>> runstate_transitions_def[] = {
>> { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>> { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>> { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>> + { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>> { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>> { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>>
>> --
>> 1.8.3.1
>>
>
- [PATCH V1 0/3] fix migration of suspended runstate, Steve Sistare, 2023/06/15
- [PATCH V1 1/3] vl: start on wakeup request, Steve Sistare, 2023/06/15
- [PATCH V1 2/3] migration: fix suspended runstate, Steve Sistare, 2023/06/15
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Peter Xu, 2023/06/20
- Re: [PATCH V1 2/3] migration: fix suspended runstate,
Steven Sistare <=
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Peter Xu, 2023/06/21
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Steven Sistare, 2023/06/23
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Steven Sistare, 2023/06/23
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Peter Xu, 2023/06/26
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Peter Xu, 2023/06/26
- Re: [PATCH V1 2/3] migration: fix suspended runstate, Steven Sistare, 2023/06/30
- [PATCH V1 3/3] tests/qtest: live migration suspended state, Steve Sistare, 2023/06/15