[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC v3 04/19] Add device IO ops vector
From: |
John Johnson |
Subject: |
Re: [RFC v3 04/19] Add device IO ops vector |
Date: |
Tue, 7 Dec 2021 07:48:33 +0000 |
> On Nov 19, 2021, at 2:42 PM, Alex Williamson <alex.williamson@redhat.com>
> wrote:
>
> On Mon, 8 Nov 2021 16:46:32 -0800
> John Johnson <john.g.johnson@oracle.com> wrote:
>
>>
>> + int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t
>> size,
>> + void *data, bool post);
>
> The @post arg is not really developed in this patch, it would make
> review easier to add the arg when it becomes implemented and used.
>
ok
>>
>> @@ -2382,6 +2398,21 @@ int vfio_get_region_info(VFIODevice *vbasedev, int
>> index,
>> struct vfio_region_info **info)
>> {
>> size_t argsz = sizeof(struct vfio_region_info);
>> + int fd = -1;
>> + int ret;
>> +
>> + /* create region cache */
>> + if (vbasedev->regions == NULL) {
>> + vbasedev->regions = g_new0(struct vfio_region_info *,
>> + vbasedev->num_regions);
>> + }
>
> Seems like this should have been part of vfio_get_device() since that's
> where we set num_regions.
>
>> + /* check cache */
>> + if (vbasedev->regions[index] != NULL) {
>> + *info = g_malloc0(vbasedev->regions[index]->argsz);
>> + memcpy(*info, vbasedev->regions[index],
>> + vbasedev->regions[index]->argsz);
>> + return 0;
>> + }
>
> Why are we calling vfio_get_region_info() enough that we need a cache?
> This looks like an optimization for something that doesn't exist yet.
> And if we have a cache, why aren't callers using it directly rather
> than creating a copy for each caller?
>
The are a couple types of vfio_get_region_info() callers.
One type wants a specific region, which it then uses to fill in
it’s own data structures (e.g., vfio_region_setup()) and the other
type uses vfio_get_dev_region_info() to loop through all regions
looking for a specific one (e.g., for the migration region or for
a specifc device to setup its quirks) On top of that, each
vfio_get_region_info() will need to call the server twice for regions
with capabilities: once to get the proper ‘argsz’ and once to actually
get the info.
I thought just having a cache that gets filled once, then
satisfies all later callers was a cleaner idea. It also solved the
issue I had that the FD returned isn't used until the container is
setup, which would’ve needed another vfio_get_region_info() call.
I can certainly move the cache fill to vfio_get_device()
and remove the copies if you like that idea better.
>>
>> *info = g_malloc0(argsz);
>>
>> @@ -2389,19 +2420,28 @@ int vfio_get_region_info(VFIODevice *vbasedev, int
>> index,
>> retry:
>> (*info)->argsz = argsz;
>>
>> - if (ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, *info)) {
>> + ret = VDEV_GET_REGION_INFO(vbasedev, *info, &fd);
>> + if (ret != 0) {
>> g_free(*info);
>> *info = NULL;
>> - return -errno;
>> + return ret;
>> }
>>
>> if ((*info)->argsz > argsz) {
>> argsz = (*info)->argsz;
>> *info = g_realloc(*info, argsz);
>> + if (fd != -1) {
>> + close(fd);
>> + fd = -1;
>> + }
>
> Use of fd here is hard to review, it's clearly for some future use case.
>
The FD code can be moved to a patch where it’s used.
>>
>> +
>> +static int vfio_io_region_write(VFIODevice *vbasedev, uint8_t index, off_t
>> off,
>> + uint32_t size, void *data, bool post)
>> +{
>> + struct vfio_region_info *info = vbasedev->regions[index];
>> + int ret;
>> +
>> + ret = pwrite(vbasedev->fd, data, size, info->offset + off);
>> + if (ret < 0) {
>> + ret = -errno;
>> + }
>> +
>> + return ret;
>> +}
>
> Simplify all these with ternaries?
>
> return ret < 0 ? -errno : ret;
>
ok
>>
>> static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
>> {
>> + VFIODevice *vbasedev = &vdev->vbasedev;
>> uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK);
>> - off_t offset = vdev->config_offset + PCI_ROM_ADDRESS;
>> DeviceState *dev = DEVICE(vdev);
>> char *name;
>> - int fd = vdev->vbasedev.fd;
>>
>> if (vdev->pdev.romfile || !vdev->pdev.rom_bar) {
>> /* Since pci handles romfile, just print a message and return */
>> @@ -926,13 +928,23 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
>> * Use the same size ROM BAR as the physical device. The contents
>> * will get filled in later when the guest tries to read it.
>> */
>> - if (pread(fd, &orig, 4, offset) != 4 ||
>> - pwrite(fd, &size, 4, offset) != 4 ||
>> - pread(fd, &size, 4, offset) != 4 ||
>> - pwrite(fd, &orig, 4, offset) != 4) {
>> - error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name);
>> +#define rom_read(var) VDEV_REGION_READ(vbasedev, \
>> + VFIO_PCI_CONFIG_REGION_INDEX, \
>> + PCI_ROM_ADDRESS, 4, (var))
>> +#define rom_write(var) VDEV_REGION_WRITE(vbasedev, \
>> + VFIO_PCI_CONFIG_REGION_INDEX, \
>> + PCI_ROM_ADDRESS, 4, (var), false)
>> +
>> + if (rom_read(&orig) != 4 ||
>> + rom_write(&size) != 4 ||
>> + rom_read(&size) != 4 ||
>> + rom_write(&orig) != 4) {
>> +
>> + error_report("%s(%s) ROM access failed", __func__, vbasedev->name);
>> return;
>> }
>> +#undef rom_read
>> +#undef rom_write
>
> Hmm, I think I'd rather see a generic function broken out for this,
> maybe:
>
> int vfio_pci_size_bar32(VFIODevice *vbasedev, int offset,
> uint32_t mask, uint32_t *size)
> {
> uint32_t orig, val;
> int index = VFIO_PCI_CONFIG_REGION_INDEX;
>
> if (VDEV_REGION_READ(vbasedev, index, offset, 4, &orig) != 4 ||
> VDEV_REGION_WRITE(vbasedev, index, offset, 4, &mask, false) != 4 ||
> VDEV_REGION_READ(vbasedev, index, offset, 4, &val) != 4 ||
> VDEV_REGION_WRITE(vbasedev, index, offset, 4, &orig, false) != 4) {
>
> error_report(...
>
> return -1;
> }
>
> *size = ~(le32_to_cpu(val) & PCI_ROM_ADDRESS_MASK) + 1;
>
> return 0;
> }
>
ok
> Really I think it's just trying to hard code
> VFIO_PCI_CONFIG_REGION_INDEX and PCI_ROM_ADDRESS into the macro calls
> that demand some simplification, if we use variables it looks fine
> inline.
>
> We could also re-wrap VDEV_REGION_READ/WRITE in vfio/pci.h to simplify
> config space access, presumably we'd never have posted writes to config
> space. Thanks,
>
There are multiple places where config space is read or written
in pci.c, so adding an inline or #define for it is a good idea.
JJ
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [RFC v3 04/19] Add device IO ops vector,
John Johnson <=