qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 09/12] hw/cxl/events: Add qmp interfaces to add/release dy


From: Jonathan Cameron
Subject: Re: [PATCH v6 09/12] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
Date: Fri, 5 Apr 2024 18:44:52 +0100

On Fri, 5 Apr 2024 12:07:45 -0400
Gregory Price <gregory.price@memverge.com> wrote:

> On Fri, Apr 05, 2024 at 01:27:19PM +0100, Jonathan Cameron wrote:
> > On Wed, 3 Apr 2024 14:16:25 -0400
> > Gregory Price <gregory.price@memverge.com> wrote:
> > 
> > A few follow up comments.
> >   
> > >   
> > > > +        error_setg(errp, "no valid extents to send to process");
> > > > +        return;
> > > > +    }
> > > > +    
> > > 
> > > I'm looking at adding the MHD extensions around this point, e.g.:
> > > 
> > > /* If MHD cannot allocate requested extents, the cmd fails */
> > > if (type == DC_EVENT_ADD_CAPACITY && dcd->mhd_dcd_extents_allocate &&
> > >     num_extents != dcd->mhd_dcd_extents_allocate(...))
> > >   return;
> > > 
> > > where mhd_dcd_extents_allocate checks the MHD block bitmap and tags
> > > for correctness (shared // no double-allocations, etc). On success,
> > > it garuantees proper ownership.
> > > 
> > > the release path would then be done in the release response path from
> > > the host, as opposed to the release event injection.  
> > 
> > I think it would be polite to check if the QMP command on release
> > for whether it is asking something plausible - makes for an easier
> > to user QMP interface.  I guess it's not strictly required though.
> > What races are there on release?  
> 
> The only real critical section, barring force-release beign supported,
> is when you clear the bits in the device allowing new requests to swipe
> those blocks. The appropriate place appears to be after the host kernel
> has responded to the release extent request.

Agreed you can't release till then, but you can check if it's going to 
work.  I think that's worth doing for ease of use reasons.

> 
> Also need to handle the case of multiple add-requests contending for the
> same region, but that's just an "oops failed to get all the bits, roll
> back" scenario - easy to handle.
> 
> Could go coarse-grained to just lock access to the bitmap entirely while
> operating on it, or be fancy and use atomics to go lockless. The latter
> code already exists in the Niagara model for reference.

I'm fine either way, though I'd just use a lock in initial version()

> 
> > We aren't support force release
> > for now, and for anything else, it's host specific (unlike add where
> > the extra rules kick in).   AS such I 'think' a check at command
> > time will be valid as long as the host hasn't done an async
> > release of capacity between that and the event record.  That
> > is a race we always have and the host should at most log it and
> > not release capacity twice.
> >  
> 
> Borrowing from the Ira's flow chart, here are the pieces I believe are
> needed to implement MHD support for DCD.
> 
> Orchestrator      FM         Device       Host Kernel    Host User
> 
>     |             |           |            |              |
>     |-- Add ----->|-- Add --->A--- Add --->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<--Accept--B<--Accept --|              |
>     |             |   Extent  |   Extent   |              |
>     |             |           |            |              |
>     |             |     ... snip ...       |              |
>     |             |           |            |              |
>     |-- Remove -->|--Release->C---Release->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<-Release--D<--Release--|              |
>     |             |  Extent   |   Extent   |              |
>     |             |           |            |              |
> 
> 1. (A) Upon Device Receiving Add Capacity Request
>    a. the device sanity checks the request against local mappings
>    b. the mhd hook is called to sanity check against global mappings
>    c. the mhd bitmap is updated, marking the capacity owned by that head
> 
>    function: qmp_cxl_process_dynamic_capacity
> 
> 2. (B) Upon Device Receiving Add Dynamic Capacity Response
>    a. accepted extents are compared to the original request
>    b. not accepted extents are cleared from the bitmap (local and MHD)
>    (Note: My understanding is that for now each request = 1 extent)

Yeah but that is a restriction I think we need to solve soon.

> 
>    function: cmd_dcd_add_dyn_cap_rsp
> 
> 3. (C) Upon Device receiving Release Dynamic Capacity Request
>    a. check for a pending release request. If exists, error.

Not sure that's necessary - can queue as long as the head
can track if the bits are in a pending release state.

>    b. check that the bits in the MHD bitmap are actually set
Good.
> 
>    function: qmp_cxl_process_dynamic_capacity
> 
> 4. (D) Upon Device receiving Release Dynamic Capacity Response
>    a. clear the bits in the mhd bitmap
>    b. remove the pending request from the pending list
> 
>    function: cmd_dcd_release_dyn_cap
> 
> Something to note: The MHD bitmap is essentially the same as the
> existing DCD extent bitmap - except that it is located in a shared
> region of memory (mmap file, shm, whatever - pick one).

I think you will ideally also have a per head one to track head access
to the things offered by the mhd.

> 
> Maybe it's worth abstracting out the bitmap twiddling to make that
> backable by a file mmap'd SHARED and use atomics to twiddle the bits?
> 
> That would be about 90% of the way to MH-DCD.

> 
> Maybe flock() could be used for coarse locking on the a shared bitmap
> in the short term?  This mitigates your concern of using shm.h as
> the coordination piece, though i'm not sure how portable flock() is.
Sounds nice, but you are wondering into that pesky userspace stuff
where I'd have to google a lot to understand :)

Jonathan

> 
> ~Gregory




reply via email to

[Prev in Thread] Current Thread [Next in Thread]