Re: [PATCH v4 0/3] Fix MCE handling on AMD hosts


From: John Allen
Subject: Re: [PATCH v4 0/3] Fix MCE handling on AMD hosts
Date: Tue, 20 Feb 2024 11:27:03 -0600

On Wed, Feb 07, 2024 at 11:21:05AM +0000, Joao Martins wrote:
> On 12/09/2023 22:18, John Allen wrote:
> > In the event that a guest process attempts to access memory that has
> > been poisoned in response to a deferred uncorrected MCE, an AMD system
> > will currently generate a SIGBUS error which will result in the entire
> > guest being shut down. Ideally, we only want to kill the guest process
> > that accessed poisoned memory in this case.
> > 
> > This support has been included in qemu for Intel hosts for a long time,
> > but there are a couple of changes needed for AMD hosts. First, we will
> > need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
> > the MCE injection code to avoid Intel specific behavior when we are
> > running on an AMD host.
> > 
> 
> Is there any update with respect to this series?
> 
> John's series should fix MCE injection on AMD; as today it is just crashing the
> guest (sadly) when an MCE happens in the hypervisor.
> 
> William, Paolo, I think the sort-of-dependency(?) of this where we block
> migration if there was a poisoned page on is already in Peter's migration
> tree[1] (CC'ed). So perhaps this series just needs John to resend it given that
> it's been a couple months since v4?

It looks like this series still applies cleanly to latest qemu, but I
can resend if needed.

Thanks,
John

> 
> [1]
> https://lore.kernel.org/qemu-devel/20240130190640.139364-2-william.roche@oracle.com/
> 
> > v2:
> >   - Add "succor" feature word.
> >   - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
> > 
> > v3:
> >   - Reorder series. Only enable SUCCOR after bugs have been fixed.
> >   - Introduce new patch ignoring AO errors.
> > 
> > v4:
> >   - Remove redundant check for AO errors.
> > 
> > John Allen (2):
> >   i386: Fix MCE support for AMD hosts
> >   i386: Add support for SUCCOR feature
> > 
> > William Roche (1):
> >   i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
> > 
> >  target/i386/cpu.c     | 18 +++++++++++++++++-
> >  target/i386/cpu.h     |  4 ++++
> >  target/i386/helper.c  |  4 ++++
> >  target/i386/kvm/kvm.c | 28 ++++++++++++++++++++--------
> >  4 files changed, 45 insertions(+), 9 deletions(-)
> > 
> 
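
For readers unfamiliar with the SUCCOR bit mentioned in the cover letter: as far
as I know, AMD reports it in CPUID Fn8000_0007 EBX, bit 1. The snippet below is
only a minimal sketch of decoding that bit from an EBX value already read from
CPUID; the names SUCCOR_BIT and ebx_has_succor are made up for illustration and
are not taken from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: SUCCOR is bit 1 of EBX for CPUID leaf 0x80000007.
 * SUCCOR_BIT is an illustrative name, not an identifier from the patches.
 */
#define SUCCOR_BIT (1U << 1)

/* Return true if the given CPUID 0x80000007 EBX value advertises SUCCOR. */
static bool ebx_has_succor(uint32_t ebx)
{
    return (ebx & SUCCOR_BIT) != 0;
}
```

On an x86 host the EBX value would come from executing CPUID with EAX set to
0x80000007; the helper itself is just a bit test and does not depend on the
host architecture.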


