qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/1] vfio/nvlink: Remove exec permission to avoid SELinux AVC


From: Alexey Kardashevskiy
Subject: Re: [PATCH 1/1] vfio/nvlink: Remove exec permission to avoid SELinux AVCs
Date: Fri, 1 May 2020 16:44:24 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0


On 01/05/2020 15:54, Leonardo Bras wrote:
> If SELinux is setup without 'execmem' permission for qemu, all mmap
> with (PROT_WRITE | PROT_EXEC) will fail and print a warning in
> SELinux log.
> 
> If "nvlink2-mr" memory allocation fails (fist diff), it will cause
> guest NUMA nodes to not be correctly configured (V100 memory will
> not be visible for guest, nor its NUMA nodes).
> 
> Not having 'execmem' permission is intesting for virtual machines to
> avoid buffer-overflow based attacks, and it's adopted in distros
> like RHEL.
> 
> So, removing the PROT_EXEC flag seems the right thing to do.
> 
> Browsing some other code that mmaps memory for usage with
> memory_region_init_ram_device_ptr, I could notice it's usual to
> not have PROT_EXEC (only PROT_READ | PROT_WRITE), so it should be
> no problem around this.

> 
> Signed-off-by: Leonardo Bras <address@hidden>
> ---
>  hw/vfio/pci-quirks.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index 2d348f8237..124d4f57e1 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -1620,7 +1620,7 @@ int vfio_pci_nvidia_v100_ram_init(VFIOPCIDevice *vdev, 
> Error **errp)
>      }
>      cap = (void *) hdr;
>  
> -    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
> +    p = mmap(NULL, nv2reg->size, PROT_READ | PROT_WRITE,



Unlike other users of memory_region_init_ram_device_ptr(), this memory
is more like a normal RAM, it is cache coherent and you can execute code
from it. I am unaware if this happens in practice but it easily could.

Having said that, I can see QEMU is not setting PROT_EXEC even for RAM
but KVM sets it anyway in a PTE in the partition scope tree (== guest
physical-to-host physical translation table) at [1] which is fine, I
just checked, with the change, the exec bit is still in the partition tree.


[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kvm/book3s_64_mmu_radix.c?h=v5.7-rc3#n857




>               MAP_SHARED, vdev->vbasedev.fd, nv2reg->offset);
>      if (p == MAP_FAILED) {
>          ret = -errno;
> @@ -1680,7 +1680,7 @@ int vfio_pci_nvlink2_init(VFIOPCIDevice *vdev, Error 
> **errp)
>  
>      /* Some NVLink bridges may not have assigned ATSD */
>      if (atsdreg->size) {
> -        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE | PROT_EXEC,
> +        p = mmap(NULL, atsdreg->size, PROT_READ | PROT_WRITE,


This chunk is ok as ATSD is a set of registers.



Reviewed-by: Alexey Kardashevskiy <address@hidden>



>                   MAP_SHARED, vdev->vbasedev.fd, atsdreg->offset);
>          if (p == MAP_FAILED) {
>              ret = -errno;
> 

-- 
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]