From: Roy Hopkins
Subject: Re: [PATCH 7/9] i386/sev: Refactor setting of reset vector and initial CPU state
Date: Tue, 12 Mar 2024 15:45:20 +0000
User-agent: Evolution 3.50.2

On Fri, 2024-03-01 at 17:01 +0000, Daniel P. Berrangé wrote:
> On Tue, Feb 27, 2024 at 02:50:13PM +0000, Roy Hopkins wrote:
> > When an SEV guest is started, the reset vector and state are
> > extracted from metadata that is contained in the firmware volume.
> >
> > In preparation for using IGVM to setup the initial CPU state,
> > the code has been refactored to populate vmcb_save_area for each
> > CPU which is then applied during guest startup and CPU reset.
> >
> > Signed-off-by: Roy Hopkins <roy.hopkins@suse.com>
> > ---
> > target/i386/sev.c | 288 +++++++++++++++++++++++++++++++++++++++++-----
> > target/i386/sev.h | 110 ++++++++++++++++++
> > 2 files changed, 369 insertions(+), 29 deletions(-)
> >
> > diff --git a/target/i386/sev.c b/target/i386/sev.c
> > index 173de91afe..d6902432fd 100644
> > --- a/target/i386/sev.c
> > +++ b/target/i386/sev.c
> > @@ -74,9 +74,7 @@ struct SevGuestState {
> > SevState state;
> > gchar *measurement;
> >
> > - uint32_t reset_cs;
> > - uint32_t reset_ip;
> > - bool reset_data_valid;
> > + QTAILQ_HEAD(, SevLaunchVmsa) launch_vmsa;
> > };
> >
> > #define DEFAULT_GUEST_POLICY 0x1 /* disable debug */
> > @@ -99,6 +97,12 @@ typedef struct QEMU_PACKED SevHashTableDescriptor {
> > /* hard code sha256 digest size */
> > #define HASH_SIZE 32
> >
> > +/* Convert between SEV-ES VMSA and SegmentCache flags/attributes */
> > +#define FLAGS_VMSA_TO_SEGCACHE(flags) \
> > + ((((flags) & 0xff00) << 12) | (((flags) & 0xff) << 8))
> > +#define FLAGS_SEGCACHE_TO_VMSA(flags) \
> > + ((((flags) & 0xff00) >> 8) | (((flags) & 0xf00000) >> 12))
> > +
> > typedef struct QEMU_PACKED SevHashTableEntry {
> > QemuUUID guid;
> > uint16_t len;
> > @@ -125,6 +129,15 @@ typedef struct QEMU_PACKED PaddedSevHashTable {
> > QEMU_BUILD_BUG_ON(sizeof(PaddedSevHashTable) % 16 != 0);
> >
> > static SevGuestState *sev_guest;
> > +
> > +typedef struct SevLaunchVmsa {
> > + QTAILQ_ENTRY(SevLaunchVmsa) next;
> > +
> > + uint16_t cpu_index;
> > + uint64_t gpa;
> > + struct sev_es_save_area vmsa;
> > +} SevLaunchVmsa;
> > +
> > static Error *sev_mig_blocker;
> >
> > static const char *const sev_fw_errlist[] = {
> > @@ -291,6 +304,149 @@ sev_guest_finalize(Object *obj)
> > {
> > }
> >
> > +static void sev_apply_cpu_context(CPUState *cpu)
> > +{
> > + SevGuestState *sev_guest = SEV_GUEST(MACHINE(qdev_get_machine())->cgs);
> > + X86CPU *x86;
> > + CPUX86State *env;
> > + struct SevLaunchVmsa *launch_vmsa;
> > +
> > + /* See if an initial VMSA has been provided for this CPU */
> > + QTAILQ_FOREACH(launch_vmsa, &sev_guest->launch_vmsa, next)
> > + {
> > + if (cpu->cpu_index == launch_vmsa->cpu_index) {
> > + x86 = X86_CPU(cpu);
> > + env = &x86->env;
> > +
> > + /*
> > + * Ideally we would provide the VMSA directly to kvm which would
> > + * ensure that the resulting initial VMSA measurement which is
> > + * calculated during KVM_SEV_LAUNCH_UPDATE_VMSA is calculated from
> > + * exactly what we provide here. Currently this is not possible so
> > + * we need to copy the parts of the VMSA structure that we currently
> > + * support into the CPU state.
> > + */
>
> This sounds like it is saying that the code is not honouring
> everything in the VMSA defined by the IGVM file ?
>
> If so, that is pretty awkward. The VMSA is effectively an external
> ABI between QEMU and the guest owner (or whatever is validating
> guest attestation reports for them), and thus predictability and
> stability of this over time is critical.
>
> We don't want the attestation process to be dependent/variable on
> the particular QEMU/KVM version, because any upgrade to QEMU/KVM
> could then alter the effective VMSA that the guest owner sees.
>
> We've already suffered pain in this respect not long ago when the
> kernel arbitrarily changed a default setting which altered the
> VMSA it exposed, breaking existing apps that validate attestation.
>
> What will it take to provide the full VMSA to KVM, so that we can
> guarantee to the guest owner that the VMSA for the guest is going
> to perfectly match what their IGVM defined ?
>
Yes, the fact that we have to copy the individual fields from the VMSA to
"CPUX86State" is less than ideal. The problem is made worse by the fact that
the kernel does not allow direct control over some of the fields from
userspace. "sev_features" is a good example here: "SVM_SEV_FEAT_DEBUG_SWAP" is
unconditionally added by the kernel.

The kernel VMSA is at least predictable. So, although we cannot yet allow full
flexibility in providing a complete VMSA from QEMU and guarantee it will be
honoured, we could check whether any settings conflict with those imposed by
the kernel and exit with an error if they do. I chose not to implement this
for the first series but could easily add a patch to support it. The problem
is that this ties the version of QEMU to the VMSA handling functionality in
the kernel: any change to the VMSA handling in the kernel could invalidate the
checks in QEMU. The one upside is that such a mismatch is easily detectable,
because the attestation measurement will not match the expected measurement of
the IGVM file. It will, however, be difficult for the user to determine what
the discrepancy is.

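To make the conflict check described above more concrete, here is a rough
sketch of what it could look like for "sev_features". This is not code from
the series. The struct, the function name and the bit value used for
DEBUG_SWAP are all assumptions made for illustration, and the real set of
kernel-imposed bits would have to be taken from the kernel in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value only: assumed position of the DEBUG_SWAP feature bit. */
#define EXAMPLE_SEV_FEAT_DEBUG_SWAP (1ULL << 5)

/* Hypothetical description of what the kernel imposes on sev_features. */
typedef struct ExampleVmsaPolicy {
    uint64_t forced_on;  /* bits the kernel always sets */
    uint64_t forced_off; /* bits the kernel never allows */
} ExampleVmsaPolicy;

/*
 * Compare the sev_features value requested by the IGVM file against the
 * kernel-imposed policy. Returns 0 if they are compatible, or -1 with
 * *conflict holding the offending bits so the caller can report them.
 */
static int example_check_sev_features(uint64_t requested,
                                      const ExampleVmsaPolicy *policy,
                                      uint64_t *conflict)
{
    assert(policy != NULL);
    assert(conflict != NULL);

    /* Bits the kernel will add even though the IGVM file left them clear. */
    uint64_t missing = policy->forced_on & ~requested;
    /* Bits the IGVM file asks for but the kernel will not set. */
    uint64_t forbidden = policy->forced_off & requested;

    *conflict = missing | forbidden;
    return *conflict ? -1 : 0;
}
```

Reporting the conflicting bits, rather than just failing, would go some way
towards telling the user what the discrepancy is before the measurement is
ever taken.
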
The ideal solution is to add or modify a KVM ioctl to allow the VMSA to be set
directly, overriding the state in "CPUX86State". The current
KVM_SEV_LAUNCH_UPDATE_VMSA ioctl triggers the synchronisation of the VMSA but
does not allow it to be specified directly. This could be modified for what we
need. The SEV-SNP kernel patches add KVM_SEV_SNP_LAUNCH_UPDATE, which allows a
page type of VMSA to be updated, although the current patch series does not
support using this to set the initial state of the VMSA:

https://lore.kernel.org/lkml/20231230172351.574091-19-michael.roth@amd.com/

I have experimented with this myself and have successfully modified the SEV-SNP
kernel patches to support directly setting the VMSA from QEMU.

On the other hand, I have also verified that I can indeed measure an IGVM file
loaded using the VMSA synchronisation method currently employed and get a
matching measurement from the SEV attestation report.

What would you suggest is the best way forward for this?

> > + cpu_load_efer(env, launch_vmsa->vmsa.efer);
> > + cpu_x86_update_cr4(env, launch_vmsa->vmsa.cr4);
> > + cpu_x86_update_cr0(env, launch_vmsa->vmsa.cr0);
> > + cpu_x86_update_cr3(env, launch_vmsa->vmsa.cr3);
> > +
> > + cpu_x86_load_seg_cache(
> > + env, R_CS, launch_vmsa->vmsa.cs.selector,
> > + launch_vmsa->vmsa.cs.base, launch_vmsa->vmsa.cs.limit,
> > + FLAGS_VMSA_TO_SEGCACHE(launch_vmsa->vmsa.cs.attrib));
> > + cpu_x86_load_seg_cache(
> > + env, R_DS, launch_vmsa->vmsa.ds.selector,
> > + launch_vmsa->vmsa.ds.base, launch_vmsa->vmsa.ds.limit,
> > + FLAGS_VMSA_TO_SEGCACHE(launch_vmsa->vmsa.ds.attrib));
> > + cpu_x86_load_seg_cache(
> > + env, R_ES, launch_vmsa->vmsa.es.selector,
> > + launch_vmsa->vmsa.es.base, launch_vmsa->vmsa.es.limit,
> > + FLAGS_VMSA_TO_SEGCACHE(launch_vmsa->vmsa.es.attrib));
> > + cpu_x86_load_seg_cache(
> > + env, R_FS, launch_vmsa->vmsa.fs.selector,
> > + launch_vmsa->vmsa.fs.base, launch_vmsa->vmsa.fs.limit,
> > + FLAGS_VMSA_TO_SEGCACHE(launch_vmsa->vmsa.fs.attrib));
> > + cpu_x86_load_seg_cache(
> > + env, R_GS, launch_vmsa->vmsa.gs.selector,
> > + launch_vmsa->vmsa.gs.base, launch_vmsa->vmsa.gs.limit,
> > + FLAGS_VMSA_TO_SEGCACHE(launch_vmsa->vmsa.gs.attrib));
> > + cpu_x86_load_seg_cache(
> > + env, R_SS, launch_vmsa->vmsa.ss.selector,
> > + launch_vmsa->vmsa.ss.base, launch_vmsa->vmsa.ss.limit,
> > + FLAGS_VMSA_TO_SEGCACHE(launch_vmsa->vmsa.ss.attrib));
> > +
> > + env->gdt.base = launch_vmsa->vmsa.gdtr.base;
> > + env->gdt.limit = launch_vmsa->vmsa.gdtr.limit;
> > + env->idt.base = launch_vmsa->vmsa.idtr.base;
> > + env->idt.limit = launch_vmsa->vmsa.idtr.limit;
> > +
> > + env->regs[R_EAX] = launch_vmsa->vmsa.rax;
> > + env->regs[R_ECX] = launch_vmsa->vmsa.rcx;
> > + env->regs[R_EDX] = launch_vmsa->vmsa.rdx;
> > + env->regs[R_EBX] = launch_vmsa->vmsa.rbx;
> > + env->regs[R_ESP] = launch_vmsa->vmsa.rsp;
> > + env->regs[R_EBP] = launch_vmsa->vmsa.rbp;
> > + env->regs[R_ESI] = launch_vmsa->vmsa.rsi;
> > + env->regs[R_EDI] = launch_vmsa->vmsa.rdi;
> > +#ifdef TARGET_X86_64
> > + env->regs[R_R8] = launch_vmsa->vmsa.r8;
> > + env->regs[R_R9] = launch_vmsa->vmsa.r9;
> > + env->regs[R_R10] = launch_vmsa->vmsa.r10;
> > + env->regs[R_R11] = launch_vmsa->vmsa.r11;
> > + env->regs[R_R12] = launch_vmsa->vmsa.r12;
> > + env->regs[R_R13] = launch_vmsa->vmsa.r13;
> > + env->regs[R_R14] = launch_vmsa->vmsa.r14;
> > + env->regs[R_R15] = launch_vmsa->vmsa.r15;
> > +#endif
> > + env->eip = launch_vmsa->vmsa.rip;
> > + break;
> > + }
> > + }
> > +}
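
For reference, the attribute conversion done by FLAGS_VMSA_TO_SEGCACHE and
FLAGS_SEGCACHE_TO_VMSA in the code above can be tried out in isolation. The
sketch below assumes the usual layout: the 12-bit VMSA attrib holds descriptor
bits 40..47 in its low byte and bits 52..55 in bits 8..11, while the
SegmentCache flags keep those bits at their positions in the second descriptor
word (8..15 and 20..23). The function names are made up for this example, and
the high-nibble mask is 0x0f00 rather than the 0xff00 used by the macro; the
two agree whenever bits 12..15 of attrib are zero.

```c
#include <assert.h>
#include <stdint.h>

/* VMSA attrib (12 bits) -> SegmentCache flags (descriptor word layout). */
static inline uint32_t example_vmsa_to_segcache(uint16_t attrib)
{
    return ((uint32_t)(attrib & 0x0f00) << 12) | /* AVL/L/DB/G -> bits 20..23 */
           ((uint32_t)(attrib & 0x00ff) << 8);   /* type/S/DPL/P -> bits 8..15 */
}

/* SegmentCache flags -> VMSA attrib (12 bits). */
static inline uint16_t example_segcache_to_vmsa(uint32_t flags)
{
    return (uint16_t)(((flags & 0x0000ff00u) >> 8) |
                      ((flags & 0x00f00000u) >> 12));
}

/* Round-trip check using a 64-bit code segment: P, S, type 0xb, L and G set. */
static void example_selfcheck(void)
{
    uint16_t attrib = 0x0a9b;
    uint32_t flags = example_vmsa_to_segcache(attrib);

    assert(flags == 0x00a09b00u);
    assert(example_segcache_to_vmsa(flags) == attrib);
}
```
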
>
>
> With regards,
> Daniel
Regards,
Roy