qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v3 4/4] cpus: extract out accel-specific code to each accel


From: Roman Bolshakov
Subject: Re: [RFC v3 4/4] cpus: extract out accel-specific code to each accel
Date: Tue, 26 May 2020 22:16:00 +0300

On Mon, May 25, 2020 at 04:54:40PM +0200, Claudio Fontana wrote:
> each accelerator registers a new "CpusAccelInterface"
> on initialization, providing functions for starting a vcpu,
> kicking a vcpu, and sychronizing state.
> 
> This way the code in cpus.cc is now all general softmmu code,
> nothing (or almost nothing) accelerator-specific anymore.
> 
> Signed-off-by: Claudio Fontana <address@hidden>
> ---
>  MAINTAINERS                          |   1 +
>  accel/kvm/Makefile.objs              |   2 +
>  accel/kvm/kvm-all.c                  |  15 +-
>  accel/kvm/kvm-cpus-interface.c       |  94 ++++
>  accel/kvm/kvm-cpus-interface.h       |   8 +
>  accel/qtest.c                        |  82 ++++
>  accel/stubs/kvm-stub.c               |   3 +-
>  accel/tcg/Makefile.objs              |   1 +
>  accel/tcg/tcg-all.c                  |  12 +-
>  accel/tcg/tcg-cpus-interface.c       | 523 ++++++++++++++++++++
>  accel/tcg/tcg-cpus-interface.h       |   8 +
>  hw/core/cpu.c                        |   1 +
>  include/sysemu/cpus.h                |  32 ++
>  include/sysemu/hvf.h                 |   1 -
>  include/sysemu/hw_accel.h            |  57 +--
>  include/sysemu/kvm.h                 |   2 +-
>  softmmu/cpus.c                       | 911 
> +++--------------------------------
>  stubs/Makefile.objs                  |   1 +
>  stubs/cpu-synchronize-state.c        |  15 +
>  target/i386/Makefile.objs            |   7 +-
>  target/i386/hax-all.c                |   6 +-
>  target/i386/hax-cpus-interface.c     |  85 ++++
>  target/i386/hax-cpus-interface.h     |   8 +
>  target/i386/hax-i386.h               |   2 +
>  target/i386/hax-posix.c              |  12 +
>  target/i386/hax-windows.c            |  20 +
>  target/i386/hvf/Makefile.objs        |   2 +-
>  target/i386/hvf/hvf-cpus-interface.c |  92 ++++
>  target/i386/hvf/hvf-cpus-interface.h |   8 +
>  target/i386/hvf/hvf.c                |   5 +-
>  target/i386/whpx-all.c               |   3 +
>  target/i386/whpx-cpus-interface.c    |  96 ++++
>  target/i386/whpx-cpus-interface.h    |   8 +
>  33 files changed, 1220 insertions(+), 903 deletions(-)
>  create mode 100644 accel/kvm/kvm-cpus-interface.c
>  create mode 100644 accel/kvm/kvm-cpus-interface.h
>  create mode 100644 accel/tcg/tcg-cpus-interface.c
>  create mode 100644 accel/tcg/tcg-cpus-interface.h
>  create mode 100644 stubs/cpu-synchronize-state.c
>  create mode 100644 target/i386/hax-cpus-interface.c
>  create mode 100644 target/i386/hax-cpus-interface.h
>  create mode 100644 target/i386/hvf/hvf-cpus-interface.c
>  create mode 100644 target/i386/hvf/hvf-cpus-interface.h
>  create mode 100644 target/i386/whpx-cpus-interface.c
>  create mode 100644 target/i386/whpx-cpus-interface.h

Hi Claudio,

Overall it looks good.

I wonder if the new structure should get singular form, i.e.
softmmu/cpu.c instead of softmmu/cpus.c

Perhaps cpus.c had plural form because it was related to implementation
of multiple CPUs/accels. After the split, each accel got it's own
implementation of accel interface.

"<accel>-cpus-interface.c" contains implementation rather than interface
it's a bit confusing. Perhaps it should be called: "<accel>-cpu.c" or
even "<accel>-accel.c".

By the way, If we use registration for each accel, does it mean that
include/sysemu/<accel>.h and accel stubs are no longer needed in shared
location?

There's an AccelClass in accel/accel.c, I wonder if it should be re-used
for accel CPU registration? I don't know but may be generic the leftover
of cpus.c also belongs to accel/ rather than softmmu/?

>  
> diff --git a/include/sysemu/cpus.h b/include/sysemu/cpus.h
> index 149de000a0..cae22afe4d 100644
> --- a/include/sysemu/cpus.h
> +++ b/include/sysemu/cpus.h
> @@ -4,7 +4,39 @@
>  #include "qemu/timer.h"
>  
>  /* cpus.c */
> +
> +/* CPU execution threads */
> +
> +typedef struct CpusAccelInterface {
> +    void (*create_vcpu_thread)(CPUState *cpu);
> +    void (*kick_vcpu_thread)(CPUState *cpu);
> +
> +    void (*cpu_synchronize_post_reset)(CPUState *cpu);
> +    void (*cpu_synchronize_post_init)(CPUState *cpu);
> +    void (*cpu_synchronize_state)(CPUState *cpu);
> +    void (*cpu_synchronize_pre_loadvm)(CPUState *cpu);
> +} CpusAccelInterface;


I think plural name may be replaced to singular. Interface suffix
doesn't seem to be used in QEMU.

cpu_ and _vcpu are sort meaning the same and may be replaced to generic
cpu_ prefix. There's a CPUState, CPU<Arch>State, and IMO shorter
CPUAccel seems to match the naming. I also don't know if cpu_ prefix
should be kept.

So here's how I see the interface:

typedef struct CPUAccel {
    void (*create_thread)(CPUState *cpu);
    void (*kick_thread)(CPUState *cpu);

    void (*synchronize_post_reset)(CPUState *cpu);
    void (*synchronize_post_init)(CPUState *cpu);
    void (*synchronize_state)(CPUState *cpu);
    void (*synchronize_pre_loadvm)(CPUState *cpu);
} CPUAccel;


> +
> +/* register accel-specific interface */
> +void cpus_register_accel_interface(CpusAccelInterface *i);
> +
> +/* interface available for cpus accelerator threads */
> +
> +/* For temporary buffers for forming a name */
> +#define VCPU_THREAD_NAME_SIZE 16
> +
> +void cpus_kick_thread(CPUState *cpu);

If it's addressing a single CPU should it be singular too, i.e.
cpu_kick_thread?

> +bool cpu_thread_is_idle(CPUState *cpu);
>  bool all_cpu_threads_idle(void);
> +bool cpu_can_run(CPUState *cpu);
> +void qemu_wait_io_event_common(CPUState *cpu);
> +void qemu_wait_io_event(CPUState *cpu);
> +void cpu_thread_signal_created(CPUState *cpu);
> +void cpu_thread_signal_destroyed(CPUState *cpu);
> +void cpu_handle_guest_debug(CPUState *cpu);
> +
> +/* end interface for cpus accelerator threads */
> +
>  bool qemu_in_vcpu_thread(void);
>  void qemu_init_cpu_loop(void);
>  void resume_all_vcpus(void);
> diff --git a/target/i386/hvf/Makefile.objs b/target/i386/hvf/Makefile.objs
> index 927b86bc67..bdbc2c0227 100644
> --- a/target/i386/hvf/Makefile.objs
> +++ b/target/i386/hvf/Makefile.objs
> @@ -1,2 +1,2 @@
> -obj-y += hvf.o
> +obj-y += hvf.o hvf-cpus-interface.o
>  obj-y += x86.o x86_cpuid.o x86_decode.o x86_descr.o x86_emu.o x86_flags.o 
> x86_mmu.o x86hvf.o x86_task.o
> diff --git a/target/i386/hvf/hvf-cpus-interface.c 
> b/target/i386/hvf/hvf-cpus-interface.c
> new file mode 100644
> index 0000000000..54bfe307c1
> --- /dev/null
> +++ b/target/i386/hvf/hvf-cpus-interface.c
> @@ -0,0 +1,92 @@
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/main-loop.h"
> +#include "sysemu/hvf.h"
> +#include "sysemu/runstate.h"
> +#include "sysemu/cpus.h"
> +#include "qemu/guest-random.h"
> +
> +#include "hvf-cpus-interface.h"
> +
> +/*
> + * The HVF-specific vCPU thread function. This one should only run when the 
> host
> + * CPU supports the VMX "unrestricted guest" feature.
> + */
> +static void *hvf_cpu_thread_fn(void *arg)
> +{
> +    CPUState *cpu = arg;
> +
> +    int r;
> +
> +    assert(hvf_enabled());
> +
> +    rcu_register_thread();
> +
> +    qemu_mutex_lock_iothread();
> +    qemu_thread_get_self(cpu->thread);
> +
> +    cpu->thread_id = qemu_get_thread_id();
> +    cpu->can_do_io = 1;
> +    current_cpu = cpu;
> +
> +    hvf_init_vcpu(cpu);
> +
> +    /* signal CPU creation */
> +    cpu_thread_signal_created(cpu);
> +    qemu_guest_random_seed_thread_part2(cpu->random_seed);
> +
> +    do {
> +        if (cpu_can_run(cpu)) {
> +            r = hvf_vcpu_exec(cpu);
> +            if (r == EXCP_DEBUG) {
> +                cpu_handle_guest_debug(cpu);
> +            }
> +        }
> +        qemu_wait_io_event(cpu);
> +    } while (!cpu->unplug || cpu_can_run(cpu));
> +
> +    hvf_vcpu_destroy(cpu);
> +    cpu_thread_signal_destroyed(cpu);
> +    qemu_mutex_unlock_iothread();
> +    rcu_unregister_thread();
> +    return NULL;
> +}
> +
> +static void hvf_kick_vcpu_thread(CPUState *cpu)
> +{
> +    cpus_kick_thread(cpu);


I think we should leave it empty since it's not really implemented for
HVF. Here's a WIP implementation I'm yet to send:

https://github.com/roolebo/qemu/commit/4437a4b8af31b0adaa58375f09e0b8547507f64f#diff-5e6ad1baa140ca4a1743b5a2f1a664b5R970

> +}
> +
> +static void hvf_cpu_synchronize_noop(CPUState *cpu)
> +{
> +}
> +
> +static void hvf_start_vcpu_thread(CPUState *cpu)
> +{
> +    char thread_name[VCPU_THREAD_NAME_SIZE];
> +
> +    /*
> +     * HVF currently does not support TCG, and only runs in
> +     * unrestricted-guest mode.
> +     */
> +    assert(hvf_enabled());
> +
> +    cpu->thread = g_malloc0(sizeof(QemuThread));
> +    cpu->halt_cond = g_malloc0(sizeof(QemuCond));
> +    qemu_cond_init(cpu->halt_cond);
> +
> +    snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF",
> +             cpu->cpu_index);
> +    qemu_thread_create(cpu->thread, thread_name, hvf_cpu_thread_fn,
> +                       cpu, QEMU_THREAD_JOINABLE);
> +}
> +
> +CpusAccelInterface hvf_cpus_interface = {
> +    .create_vcpu_thread = hvf_start_vcpu_thread,

I think it was meant as hvf_create_vcpu_thread.

> +    .kick_vcpu_thread = hvf_kick_vcpu_thread,
> +
> +    .cpu_synchronize_post_reset = hvf_cpu_synchronize_noop,
> +    .cpu_synchronize_post_init = hvf_cpu_synchronize_noop,
> +    .cpu_synchronize_state = hvf_cpu_synchronize_noop,
> +    .cpu_synchronize_pre_loadvm = hvf_cpu_synchronize_noop,

I have tested the RFC with hvf accel and it seems to work although I
expected it fail. The noop functions are the reason of the behaviour. In
some sense it's not equivalent to what it was before but to be fair it's
not possible to use existing hvf_cpu_synchronize_* functions in
target/i386/hvf/hvf.c without breaking hvf.

Here's how it fails. hvf_cpu_synchronize_state() sets dirty flag to mark
that emulator registers in CPUX86State are going to be out of sync with
accel registers. If we bind .cpu_synchronize_state to
hvf_synchronize_state it'd be called from vapic_write() on I/O writes
(from kvmvapic.bin option ROM) to KVM vAPIC page (I/O port 0x7e).

But HVF uses two emulator states CPUX86State and HVFX86EmulatorState
and they may become out-of-sync. That's exactly what happens when an I/O
write to KVM vAPIC is made. The I/O write triggers cpu state
synchronization, so VMCS registers are fetched into CPUX86State and the
vcpu_dirty flag is set. Emulation code then advances RIP in the
macvm_set_rip() by modifying VMCS directly. On the next round of vcpu
run loop it checks if the cpu is dirty (it is) and pushes CPUX86State
into the VMCS, effectively overwriting what was set by macvm_set_rip().

If the RFC isn't expected to be merged for a couple of weeks I can
prepare a series that drops HVFX86EmulatorState. I need the change for
single-stepping support in HVF anyway.

Thanks,
Roman



reply via email to

[Prev in Thread] Current Thread [Next in Thread]