[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: hexagon: modeling a shared lock state
From: |
Brian Cain |
Subject: |
RE: hexagon: modeling a shared lock state |
Date: |
Thu, 25 Jan 2024 16:28:58 +0000 |
> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, January 24, 2024 6:22 PM
> To: Brian Cain <bcain@quicinc.com>; Philippe Mathieu-Daudé
> <philmd@linaro.org>
> Cc: qemu-devel@nongnu.org; Sid Manning <sidneym@quicinc.com>; Marco
> Liebel <mliebel@qti.qualcomm.com>; Matheus Bernardino
> <mathbern@qti.qualcomm.com>
> Subject: Re: hexagon: modeling a shared lock state
>
> WARNING: This email originated from outside of Qualcomm. Please be wary of
> any links or attachments, and do not enable macros.
>
> On 1/25/24 01:07, Brian Cain wrote:
> > Philippe,
> >
> > For hexagon sysemu, while internally reviewing patches to be upstreamed we
> noticed that
> > our design for a lock instruction is not quite suitable. The k0lock
> > instruction
> will halt
> > if some other hexagon hardware CPU has already claimed the kernel lock,
> only to continue
> > executing once some CPU executes the unlock instruction. We modeled this
> using a lock
> > state enumeration member { OWNER, WAITING, UNLOCKED } in **each**
> vCPU and atomically
> > transitioning the lock required us to have vCPU[n] write the updated lock
> state to vCPU[m]
> > when unlocking.
> >
> > In context:
> >
> https://github.com/quic/qemu/blob/hexagon_sysemu_20_dec_2023/target/he
> xagon/op_helper.c#L1790-L1821
> <https://github.com/quic/qemu/blob/hexagon_sysemu_20_dec_2023/target/h
> exagon/op_helper.c#L1790-L1821>
> >
> > So instead, maybe it makes more sense to have a system device hold a single
> representation
> > of the lock’s state and each vCPU can do some kind of access via load/store
> and/or
> > interrupts to/from the device? I was thinking of raising an interrupt on
> > the
> lock device
> > at the vCPU’s k0lock instruction to indicate demand for the lock, and then
> > the
> device
> > could raise an interrupt to one of the vCPUs when it’s granted the lock.
> > One
> drawback for
> > this is that this might add significant latency to the uncontested lock
> > case. So
> I also
> > considered doing a load of some part of the lock device’s memory that could
> indicate
> > whether the lock was available, and if available it would claim it with a
> > store
> (both
> > ld/st while holding BQL). Only if unavailable it would halt and raise the
> interrupt. Is
> > it possible to create an address space that’s independent of the true system
> memory map
> > for this use case or would we need to carve out some memory from the
> existing memory map
> > for this synthetic device? Or – do you have a suggestion for a better
> approach overall?
>
> I think you are over-thinking this. A system device would just get in the
> way. I
Ok, great - our existing design is ~roughly like this. But we can explore this
-- thanks for writing this example!
> think
> you want something like
>
> struct CPUHexagonState {
> ...
> bool hwlock_pending;
> }
>
> hexagon_cpu_has_work() {
> if (cpu->hwlock_pending) {
> return false;
> }
> }
>
> static void do_hwlock(CPUHexagonState *env, bool *lock)
> {
> bql_lock();
>
> if (*lock) {
> env->hwlock_pending = true;
> cs->halted = true;
> cs->exception_index = EXCP_HALTED;
> bql_unlock();
> cpu_loop_exit(cs);
In place of the above - we have cpu_interrupt(cs, CPU_INTERRUPT_HALT) -- but is
that equivalent? Is there one idiom that's preferred over another? Somehow it
seems simpler if we don't need to longjmp and IIRC some of these patterns do.
> } else {
> *lock = true;
> bql_unlock();
> }
> }
>
> static void do_hwunlock(CPUHexagonState *env, bool *lock)
> {
> CPUState *cs;
> BQL_LOCK_GUARD();
>
> *lock = false;
>
> CPU_FOREACH(cs) {
> if (cs->hwlock_pending) {
> cs->hwlock_pending = false;
> qemu_cpu_kick(cs);
> }
> }
> }
>
> static bool k0lock, tlblock;
>
> void HELPER(k0lock)(CPUHexagonState *env)
> void HELPER(tlblock)(CPUHexagonState *env)
> void HELPER(k0unlock)(CPUHexagonState *env)
> void HELPER(tlbunlock)(CPUHexagonState *env)
>
> Open questions are:
>
> (1) Do interrupts cancel lock wait? Either way, placement in
> hexagon_cpu_has_work is key.
Yeah - they do, we will double check the interaction w has_work.
> (2) I assume that the pc will not be advanced, so that after the kick we will
> re-
> execute
> the hwlock instruction. There will be a thundering herd racing to grab the
> lock
> again,
> but it saves queuing logic, which might be complicated if interrupts are
> involved.
Yes that's right, too. Thanks!