[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [question] Shall we flush ITS tables into guest RAM when shutdown th
From: |
Eric Auger |
Subject: |
Re: [question] Shall we flush ITS tables into guest RAM when shutdown the VM? |
Date: |
Tue, 6 Jul 2021 16:27:57 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 |
Hi Dave,
On 7/6/21 4:19 PM, Dr. David Alan Gilbert wrote:
> * Eric Auger (eric.auger@redhat.com) wrote:
>> Hi,
>>
>> On 7/6/21 10:18 AM, Kunkun Jiang wrote:
>>> Hi Eric,
>>>
>>> On 2021/6/30 17:16, Eric Auger wrote:
>>>> On 6/30/21 3:38 AM, Kunkun Jiang wrote:
>>>>> On 2021/6/30 4:14, Eric Auger wrote:
>>>>>> Hi Kunkun,
>>>>>>
>>>>>> On 6/29/21 11:33 AM, Kunkun Jiang wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Accroding to the patch cddafd8f353d2d251b1a5c6c948a577a85838582,
>>>>>>> our original intention is to flush the ITS tables into guest RAM at
>>>>>>> the point
>>>>>>> RUN_STATE_FINISH_MIGRATE, but sometimes the VM gets stopped before
>>>>>>> migration launch so let's simply flush the tables each time the VM
>>>>>>> gets
>>>>>>> stopped.
>>>>>>>
>>>>>>> But I encountered an error when I shut down the virtual machine.
>>>>>>>
>>>>>>>> qemu-system-aarch64: KVM_SET_DEVICE_ATTR failed: Group 4 attr
>>>>>>>> 0x0000000000000001: Permission denied
>>>>>>> Shall we need to flush ITS tables into guest RAM when 'shutdown' the
>>>>>>> VM?
>>>>>>> Or do you think this error is normal?
>>>>>> yes we determined in the past this was the right moment do save the
>>>>>> tables
>>>>>>
>>>>>> "with a live migration the guest is still running after
>>>>>> the RAM has been first saved, and so the tables may still change
>>>>>> during the iterative RAM save. You would actually need to do this
>>>>>> at just the point we stop the VM before the final RAM save; that
>>>>>> *might*
>>>>>> be possible by using a notifier hook a vm run state change to
>>>>>> RUN_STATE_FINISH_MIGRATE
>>>>>> - if you can do the changes just as the migration flips into that mode
>>>>>> it *should* work. " said David.
>>>>>>
>>>>>> But sometimes as the commit msg says, the VM is stopped before the
>>>>>> migration launch - I do not remember the exact scenario tbh -.
>>>>> Well, I initially wanted to know more about this scenario to determine
>>>>> whether
>>>>> a normal shutdown would fall into it.😂
>>>> I think it was for save/restore use case. In that case you need to flush
>>>> the KVM cache in memory on VM shutdown.
>>> Sorry for late reply.
>>>
>>> Can we distinguish from the 'RunState'?
>>> When we stop the VM, the RunState will be set. There are many types of
>>> RunState, such as RUN_STATE_FINISH_MIGRATE/RUN_STATE_SAVE_VM/
>>> RUN_STATE_SHUTDOWN/RUN_STATE_GUEST_PANICKED, etc.
>>>
>>> Maybe RUN_STATE_SHUTDOWN doesn't belong to save/restore use case,
>>> right?
>> Adding Dave, Juan and Peter in the loop for migration expertise.
>>
>> At the moment we save the ARM ITS MSI controller tables whenever the VM
>> gets stopped. Saving the tables from KVM caches into the guest RAM is
>> needed for migration and save/restore use cases.
>> However with GICv4 this fails at KVM level because some MSIs are
>> forwarded and saving their state is not supported with GICv4.
>>
>> While GICv4 migration is not supported we would like the VM to work
>> properly, ie. being stoppable without taking care of table saving.
>>
>> So could we be more precise and identifiy the save/restore and migration
>> use cases instead of saving the tables on each VM shutdown.
> During the precopy migration (not sure about others), we do:
>
> static void migration_completion(MigrationState *s)
> {
> ....
> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> ...
> ret = qemu_savevm_state_complete_precopy(s->to_dst_file,
> false,
> inactivate);
>
> so I think we do have that state there to hook off.
That's consistent with what you suggested in the past ans what is logged
in the commit message of
cddafd8f353d2d251b1a5c6c948a577a85838582 ("hw/intc/arm_gicv3_its: Implement
state save/restore").
However does the save/restore enters that state. If I remember correctly that's
why I decided to do the save on each VM stop instead.
>
>> The tables are saved into guest RAM so when need the CPUs and devices to
>> be stopped but we need the guest RAM to be saved after the ITS save
>> operation.
> Yeh so what should happen is that you:
> a) Iterate RAM a lot
> b) You stop everything
> -> Flushes remaining changes into RAM
> c) Transmit device state and last bits of RAM changes.
>
> so that flush should happen at (b).
That's correct.
Thanks
Eric
>
> Dave
>
>> Kunkun, by the way you currently just get an error from qemu, ie. qemu
>> does not exit? Couldn't we just ignore -EACCESS error?
>>
>> Thanks
>>
>> Eric
>>
>>
>>
>>
>>>>> In my opinion, when the virtual machine is normally shutdown, flushing
>>>>> the
>>>>> ITS tables is not necessary. If we can't tell the difference between
>>>>> 'normal shutdown'
>>>>> and the above scenario, then this 'error' is inevitable.
>>>>>> So each time the VM is stopped we flush the caches into guest RAM.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> This error occurs in the following scenario:
>>>>>>> Kunpeng 920 、enable GICv4、passthrough a accelerator Hisilicon SEC
>>>>>>> to
>>>>>>> the VM.
>>>>>>>
>>>>>>> The flow is as follows:
>>>>>>>
>>>>>>> QEMU:
>>>>>>> vm_shutdown
>>>>>>> do_vm_stop(RUN_STATE_SHUTDOWN)
>>>>>>> vm_state_notify
>>>>>>> ...
>>>>>>> vm_change_state_handler (hw/intc/arm_gicv3_its_kvm.c)
>>>>>>> kvm_device_access
>>>>>>>
>>>>>>> Kernel:
>>>>>>> vgic_its_save_tables_v0
>>>>>>> vgic_its_save_device_tables
>>>>>>> vgic_its_save_itt
>>>>>>>
>>>>>>> There is such a code in vgic_its_save_itt():
>>>>>>>> /*
>>>>>>>> * If an LPI carries the HW bit, this means that this
>>>>>>>> * interrupt is controlled by GICv4, and we do not
>>>>>>>> * have direct access to that state without GICv4.1.
>>>>>>>> * Let's simply fail the save operation...
>>>>>>>> */
>>>>>>>> if (ite->irq->hw && !kvm_vgic_global_state.has_gicv4_1)
>>>>>>>> return -EACCES;
>>>> Maybe we miss a piece of code for 4.0 that unsets the forwarding. The
>>>> only way to handle this is to make sure ite->irq->hw is not set on
>>>> shutdown, no?
>>> It's not going to return -EACCES here, if we unset the forwarding
>>> first. But
>>> this may cause problems in save/restore scenarios. The GICv4 architecture
>>> doesn't give any guarantee that the pending state is written into the
>>> pending table.
>>>
>>> Thanks,
>>> Kunkun Jiang
>>>> Thanks
>>>>
>>>> Eric
>>>>>> As far as I understand you need a v4.1 to migrate,
>>>>>> following Shenming's series
>>>>>> [PATCH v5 0/6] KVM: arm64: Add VLPI migration support on GICv4.1
>>>>>> Maybe sync with him?
>>>>> Yes, GICv4 does not support live migrate.
>>>>>
>>>>> Thanks,
>>>>> Kunkun Jiang
>>>>>> Thanks
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>> Looking forward to your reply.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kunkun Jiang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> .
>>>> .
>>>