qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH v2 3/3] target/ppc: support single stepping


From: David Gibson
Subject: Re: [Qemu-devel] [RFC PATCH v2 3/3] target/ppc: support single stepping with KVM HV
Date: Sun, 2 Dec 2018 20:13:17 +1100
User-agent: Mutt/1.10.1 (2018-07-13)

On Fri, Nov 30, 2018 at 06:46:21PM -0200, Fabiano Rosas wrote:
> David Gibson <address@hidden> writes:
> 
> >> --- a/target/ppc/kvm.c
> >> +++ b/target/ppc/kvm.c
> >> @@ -94,6 +94,7 @@ static int cap_ppc_safe_indirect_branch;
> >>  static int cap_ppc_nested_kvm_hv;
> >>  
> >>  static uint32_t debug_inst_opcode;
> >> +static target_ulong trace_handler_addr;
> >>  
> >>  /* XXX We have a race condition where we actually have a level triggered
> >>   *     interrupt, but the infrastructure can't expose that yet, so the 
> >> guest
> >> @@ -509,6 +510,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >>      kvm_get_one_reg(cs, KVM_REG_PPC_DEBUG_INST, &debug_inst_opcode);
> >>      kvmppc_hw_debug_points_init(cenv);
> >>  
> >> +    trace_handler_addr = (cenv->excp_vectors[POWERPC_EXCP_TRACE] |
> >> +                          AIL_C000_0000_0000_4000_OFFSET);
> >
> > Effectively caching this as a global variable is pretty horrible.  A
> > function to make the above calculation when you need it would be
> > better.
> >
> > Also, won't the calculation above be wrong if the guest is using the
> > low AIL value?
> 
> Yes. As you suggested, I'll change this to calculate the offset based on AIL.
> 
> >>  void kvm_arch_set_singlestep(CPUState *cs, int enabled)
> >>  {
> >> +    PowerPCCPU *cpu = POWERPC_CPU(cs);
> >> +    CPUPPCState *env = &cpu->env;
> >> +
> >> +    if (kvmppc_is_pr(kvm_state)) {
> >
> > Explicitly testing for KVM PR is generally wrong (except as a
> > fallback).  Instead you should add a capability flag to KVM which you
> > use to test the feature's presence in qemu.  That capability can then
> > be set in HV, but not PR.
> 
> My apologies, I did read the notice on the code about it. I was trying
> to keep things simple for this RFC and forgot to mention that in the
> cover letter.

Just put a comment on it explaining that you expect to replace this
before the final version.

> >> +    if (arch_info->address == trace_handler_addr) {
> >> +        cpu_synchronize_state(cs);
> >> +        kvm_remove_breakpoint(cs, trace_handler_addr, 4, 
> >> GDB_BREAKPOINT_SW);
> >> +
> >> +        cpu_memory_rw_debug(cs, env->spr[SPR_SRR0] - 4, (uint8_t *)&insn,
> >> +                            sizeof(insn), 0);
> >> +
> >> +        /* If the last instruction was a mfmsr, make sure that the
> >> +         * MSR_SE bit is not set to avoid the guest kernel knowing
> >> +         * that it is being single-stepped */
> >> +        if (extract32(insn, 26, 6) == 31 && extract32(insn, 1, 10) == 83) 
> >> {
> >> +            reg = extract32(insn, 21, 5);
> >> +            env->gpr[reg] &= ~(1ULL << MSR_SE);
> >> +        }
> >
> > Hm.  What happens if both qemu and the guest itself attempt to single
> > step at the same time?  How do you distinguish between the cases?
> 
> There is currently no distinction being made.

This seems incorrect to me.  Basically you're restoring !MSR_SE
unconditionaly when you finish the hypervisor side step, which might
not be correct if the guest is also attempting to single step itself.
AFAICT it should be possible to track what the guest thinks is the
value of MSR_SE and restore that.  If both hypervisor and guest
attempt to single step, I'd expect to have the hypervisor trap first,
then return to the guest's single step vector.

> Since MSR_SE is only set
> for the duration of the step (see my answer below) this mostly just
> works and it is possible to single step inside/outside the guest while
> the control changes from one debugger to the other as expected. However
> there is a race as follows:
> 
> If the vcpu being stepped is preempted right after vm enter, with the
> breakpoint at 0xd00 already in place *and* the other debugger starts
> doing a step, it will cause a vm exit when it shouldn't:
> 
>     outside guest                  inside guest
> set breakpoint at 0xd00     |
> msr_se = 1                  |
> vm enter                    |
>                             |  msr_se = 1
>                             |  exec next instruction
>                             |  trace interrupt
>                             |  jump to 0xd00
>                             |  --- wrong from now on ---
>                             |  emulation assist interrupt
>                             |  vm exit
> 
> One way to avoid this race is by using GDB's `scheduler-locking step`
> option which effectively negates the issue by running only the vcpu
> being stepped.
> 
> Regardless of that workaround, I am exploring the idea of sending the
> cpu back into the guest if we happen to break at the trace interrupt
> handler while there is no cpu being stepped by QEMU
> (cs->singlestep_enabled). That way, a step inside the guest would never
> happen before the outside step has finished.
> 
> >
> >> +
> >> +        env->nip = env->spr[SPR_SRR0];
> >> +        env->msr = msr &= ~(1ULL << MSR_SE);
> >
> > You clear SE after tracing one instruction; is that intentional?
> 
> Yes, both the kernel and KVM PR do the same:
> 
> from arch/powerpc/kernel/traps.c
> void single_step_exception(struct pt_regs *regs)
> {
>         enum ctx_state prev_state = exception_enter();
> 
> ->      clear_single_step(regs);
>         ...
> }
> 
> from arch/powerpc/kvm/book3s_pr.c
> static int kvmppc_vcpu_run_pr(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
> {
>         ...
>         ret = __kvmppc_vcpu_run(kvm_run, vcpu);
> 
> ->      kvmppc_clear_debug(vcpu);
>         ...
> }
> 
> >
> >> +        cpu_synchronize_state(cs);
> >
> > You're effectively aborting the entry to the single step exception
> > vector; don't you need to set MSR to SRR1, not to the current MSR
> > value which will have been modified for the singlestep exception entry?
> 
> I'm using the old MSR value (before synchronizing) so the effect is the same:
> 
> (gdb) p/x $msr
> $2 = 0x8000000002803033
> (gdb) si
> 1892            LOAD_REG_ADDR(r11, replay_interrupt_return)
> (gdb) p/x $msr
> $3 = 0x8000000002803033
> 
> Using SRR1 would also work but it would require clearing the bits set by
> the interrupt (33:36, 44:47):
> 
> (gdb) p/x env->spr[0x01b]
> $1 = 0x8000000042803433
> 
> I could do the latter, if you prefer.

I think that's better - I don't think it's safe to assume that you're
*not* synchronized with KVM.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]