From: Alexander Graf
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Wed, 28 Sep 2011 10:58:26 +0200

On 28.09.2011, at 04:40, Alex Williamson wrote:
> On Tue, 2011-09-27 at 16:28 -0500, Scott Wood wrote:
>> On 09/26/2011 07:45 PM, Alex Williamson wrote:
>>> On Mon, 2011-09-26 at 18:59 -0500, Scott Wood wrote:
>>>> On 09/26/2011 01:34 PM, Alex Williamson wrote:
>>>>> /* Reset the device */
>>>>> #define VFIO_DEVICE_RESET _IO(, ,)
>>>>
>>>> What generic way do we have to do this? We should probably have a way
>>>> to determine whether it's possible, without actually asking to do it.
>>>
>>> It's not generic, it could be a VFIO_DEVICE_PCI_RESET or we could add a
>>> bit to the device flags to indicate if it's available or we could add a
>>> "probe" arg to the ioctl to either check for existence or do it.
>>
>> Even with PCI, isn't this only possible if function-level reset is
>> supported?
>
> There are a couple other things we can do if FLR isn't present (D3hot
> transition, secondary bus reset, device specific resets are possible).
>
>> I think we need a flag.
>
> Ok, PCI has a pci_probe_reset_function() and pci_reset_function(). I'd
> probably mimic those in the vfio device ops. Common vfio code can probe
> the reset and set the flag appropriately and we can have a common
> VFIO_DEVICE_RESET ioctl that calls into the device ops reset function.
>
>> For devices that can't be reset by the kernel, we'll want the ability to
>> stop/start DMA access through the IOMMU (or other bus-specific means),
>> separate from whether the fd is open. If a device is assigned to a
>> partition and that partition gets reset, we'll want to disable DMA
>> before we re-use the memory, and enable it after the partition has reset
>> or quiesced the device (which requires the fd to be open).
>
> Maybe this can be accomplished via an iommu_detach_device() to
> temporarily disassociate it from the domain. We could also unmap all
> the DMA. Anyway, a few possibilities.
>
>>>>> /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
>>>>> #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS _IOW(, , int)
>>>>> #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
>>>>>
>>>>> Hope that covers it.
>>>>
>>>> It could be done this way, but I predict that the code (both kernel and
>>>> user side) will be larger. Maybe not much more complex, but more
>>>> boilerplate.
>>>>
>>>> How will you manage extensions to the interface?
>>>
>>> I would assume we'd do something similar to the kvm capabilities checks.
>>
>> This information is already built into the data-structure approach.
>
> If we define it to be part of the flags, then it's built-in to the ioctl
> approach too...
>
>>>> The table should not be particularly large, and you'll need to keep the
>>>> information around in some form regardless. Maybe in the PCI case you
>>>> could produce it dynamically (though I probably wouldn't), but it really
>>>> wouldn't make sense in the device tree case.
>>>
>>> It would be entirely dynamic for PCI, there's no advantage to caching
>>> it. Even for device tree, if you can't fetch it dynamically, you'd have
>>> to duplicate it between an internal data structure and a buffer reading
>>> the table.
>>
>> I don't think we'd need to keep the device tree path/index info around
>> for anything but the table -- but really, this is a minor consideration.
>>
>>>> You also lose the ability to easily have a human look at the hexdump for
>>>> debugging; you'll need a special "lsvfio" tool. You might want one
>>>> anyway to pretty-print the info, but with ioctls it's mandatory.
>>>
>>> I don't think this alone justifies duplicating the data and making it
>>> difficult to parse on both ends. Chances are we won't need such a tool
>>> for the ioctl interface because it's easier to get it right the first
>>> time ;)
>>
>> It's not just useful for getting the code right, but for e.g. sanity
>> checking that the devices were bound properly. I think such a tool
>> would be generally useful, no matter what the kernel interface ends up
>> being. I don't just use lspci to debug the PCI subsystem. :-)
>
> This is also a minor consideration. Looking at hexdumps isn't much to
> rely on for debugging and if we take the step of writing a tool, it's
> not much harder to write for either interface. The table is more akin
> to dumping the data, but I feel the ioctl is easier for how a driver
> would probably make use of the data (linear vs random access).
>
>>> Note that I'm not stuck on this interface, I was just thinking about how
>>> to generate the table last week, it seemed like a pain so I thought I'd
>>> spend a few minutes outlining an ioctl interface... turns out it's not
>>> so bad. Thanks,
>>
>> Yeah, it can work either way, as long as the information's there and
>> there's a way to add new bits of information, or new bus types, down the
>> road. Mainly a matter of aesthetics between the two.
>
> It'd be nice if David would chime back in since he objected to the
> table. Does an ioctl interface look better? Alex Graf, any opinions?

I'm honestly pretty indifferent on ioctl vs. linear read. I got the impression
that people dislike ioctls for whatever reason, so we went ahead and did the
design based on read(). With KVM, ioctls are a constant pain to extend, but so
are the constant sized fields here.

Whatever you do, please introduce a "flags" field to every struct you use and
add some padding at the end, so it can possibly be extended.

Alex
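
For illustration, the "flags plus padding" advice could look roughly like the sketch below. Every name in it (struct ex_device_info, EX_DEV_FLAG_RESET, the field list, the 64-byte size) is hypothetical and not taken from any posted patch. It only shows the general pattern: a fixed-size struct whose flags word says which optional features are present, with zeroed reserved space that a later version can turn into real fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, invented for this sketch: device can be reset. */
#define EX_DEV_FLAG_RESET   (1u << 0)

/* Every flag bit this version of the interface understands. */
#define EX_DEV_KNOWN_FLAGS  (EX_DEV_FLAG_RESET)

/* Hypothetical device-info struct: 3 real fields plus padding, 64 bytes. */
struct ex_device_info {
    uint32_t flags;         /* EX_DEV_FLAG_* bits */
    uint32_t num_regions;   /* number of mappable regions */
    uint32_t num_irqs;      /* number of interrupts */
    uint32_t reserved[13];  /* must be zero; room for future fields */
};

/* The size is part of the ABI, so catch accidental changes at build time. */
static_assert(sizeof(struct ex_device_info) == 64,
              "struct ex_device_info must stay 64 bytes");

/*
 * Returns 1 if the struct carries only flag bits this version knows and
 * all reserved words are zero, 0 otherwise. Rejecting anything else is
 * what lets a later version give those bits and words a meaning.
 */
static int ex_device_info_is_valid(const struct ex_device_info *info)
{
    size_t i;

    if (info->flags & ~EX_DEV_KNOWN_FLAGS)
        return 0;
    for (i = 0; i < sizeof(info->reserved) / sizeof(info->reserved[0]); i++) {
        if (info->reserved[i] != 0)
            return 0;
    }
    return 1;
}

/* Returns 1 if the reset capability flag is set, 0 otherwise. */
static int ex_device_can_reset(const struct ex_device_info *info)
{
    return (info->flags & EX_DEV_FLAG_RESET) != 0;
}
```

A caller would zero the whole struct before filling it in, and would check ex_device_can_reset() before attempting a reset rather than issuing the reset and looking at the error. The same flags word could carry the reset-capability bit discussed earlier in the thread.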