From: Peter Xu
Subject: Re: [PATCH v4 2/2] migration: prevent migration when a poisoned page is unknown from the VM
Date: Mon, 16 Oct 2023 12:48:09 -0400

On Fri, Oct 13, 2023 at 03:08:39PM +0000, William Roche wrote:
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index 5e95c496bb..e8db6380c1 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -1158,7 +1158,6 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>          ram_addr = qemu_ram_addr_from_host(addr);
>          if (ram_addr != RAM_ADDR_INVALID &&
>              kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
> -            kvm_hwpoison_page_add(ram_addr);
>              /*
>               * If this is a BUS_MCEERR_AR, we know we have been called
>               * synchronously from the vCPU thread, so we can easily
> @@ -1169,7 +1168,12 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>               * called synchronously from the vCPU thread, or a bit
>               * later from the main thread, so doing the injection of
>               * the error would be more complicated.
> +             * In this case, BUS_MCEERR_AO errors are unknown from the
> +             * guest, and we will prevent migration as long as this
> +             * poisoned page hasn't generated a BUS_MCEERR_AR error
> +             * that the guest takes into account.
>               */
> +            kvm_hwpoison_page_add(ram_addr, (code == BUS_MCEERR_AR));

I'm curious why ARM doesn't forward this event to the guest even when it's
AO.  x86 does, which makes more sense to me.  I'm not familiar with ARM; do
you know the reason?

I think this patch needs review from ARM and/or KVM side.  Do you want to
have the 1st patch merged, or rather wait for the whole set?

Another thing to mention: feel free to look at the ioctl recently added to
userfaultfd, which can inject poisoned PTEs:

https://lore.kernel.org/r/20230707215540.2324998-1-axelrasmussen@google.com

I'm wondering if that'll be helpful to QEMU too: we could migrate
hwpoison_page_list and enforce the poisoning on the destination.  Then even
for AO, an access by the guest will generate another MCE on the destination.
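
To make the idea concrete, here is a rough, self-contained sketch of the
bookkeeping involved.  It is only an illustration, not QEMU code: the
PoisonList type, the poison_list_* helpers and the PoisonFn hook are all
hypothetical names.  The hook stands in for whatever mechanism would mark a
page poisoned on the destination (for example the userfaultfd ioctl from the
series linked above); that part is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record of one poisoned page; not QEMU's actual structure. */
typedef struct PoisonedPage {
    uint64_t ram_addr;
    bool guest_aware;            /* true once the guest was told (e.g. AR) */
    struct PoisonedPage *next;
} PoisonedPage;

typedef struct {
    PoisonedPage *head;
} PoisonList;

/* Add a page, or upgrade an existing entry once the guest learns of it. */
static int poison_list_add(PoisonList *l, uint64_t ram_addr, bool guest_aware)
{
    for (PoisonedPage *p = l->head; p; p = p->next) {
        if (p->ram_addr == ram_addr) {
            p->guest_aware = p->guest_aware || guest_aware;
            return 0;
        }
    }
    PoisonedPage *p = malloc(sizeof(*p));
    if (!p) {
        return -1;
    }
    p->ram_addr = ram_addr;
    p->guest_aware = guest_aware;
    p->next = l->head;
    l->head = p;
    return 0;
}

/* Source side: block migration while any poisoned page is unknown to guest. */
static bool poison_list_blocks_migration(const PoisonList *l)
{
    for (const PoisonedPage *p = l->head; p; p = p->next) {
        if (!p->guest_aware) {
            return true;
        }
    }
    return false;
}

/* Hook that would re-poison one page on the destination (assumption). */
typedef int (*PoisonFn)(uint64_t ram_addr, void *opaque);

/*
 * Destination side: replay every migrated entry through the hook.
 * Returns the number of pages handled, or -1 on the first hook failure.
 */
static int poison_list_replay(const PoisonList *l, PoisonFn fn, void *opaque)
{
    int n = 0;
    for (const PoisonedPage *p = l->head; p; p = p->next) {
        if (fn(p->ram_addr, opaque) != 0) {
            return -1;
        }
        n++;
    }
    return n;
}

/* Stand-in hook for experimentation: only counts the pages it is given. */
static int count_pages(uint64_t ram_addr, void *opaque)
{
    (void)ram_addr;
    ++*(int *)opaque;
    return 0;
}

static void poison_list_free(PoisonList *l)
{
    while (l->head) {
        PoisonedPage *p = l->head;
        l->head = p->next;
        free(p);
    }
}
```

With something like the replay step in place, the source-side check would in
principle no longer need to block migration for AO-only pages, since the
destination would fault on them as well.  Whether that is actually workable
is the open question.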

>              if (code == BUS_MCEERR_AR) {
>                  kvm_cpu_synchronize_state(c);
>                  if (!acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)) {

-- 
Peter Xu