From: Sean Christopherson
Subject: Re: [PATCH v3 kvm/queue 05/16] KVM: Maintain ofs_tree for fast memslot lookup by file offset
Date: Tue, 28 Dec 2021 21:48:08 +0000

On Fri, Dec 24, 2021, Chao Peng wrote:
> On Thu, Dec 23, 2021 at 06:02:33PM +0000, Sean Christopherson wrote:
> > On Thu, Dec 23, 2021, Chao Peng wrote:
> > > Similar to hva_tree for hva range, maintain interval tree ofs_tree for
> > > offset range of a fd-based memslot so the lookup by offset range can be
> > > faster when memslot count is high.
> >
> > This won't work. The hva_tree relies on there being exactly one virtual
> > address space, whereas with private memory, userspace can map multiple
> > files into the guest at different gfns, but with overlapping offsets.
>
> OK, that's the point.
>
> >
> > I also dislike hijacking __kvm_handle_hva_range() in patch 07.
> >
> > KVM also needs to disallow mapping the same file+offset into multiple
> > gfns, which I don't see anywhere in this series.
>
> This can be checked against file+offset overlapping with existing slots
> when registering a new one.
>
> >
> > In other words, there needs to be a 1:1 gfn:file+offset mapping. Since
> > userspace likely wants to allocate a single file for guest private memory
> > and map it into multiple discontiguous slots, e.g. to skip the PCI hole,
> > the best idea off the top of my head would be to register the notifier on
> > a per-slot basis, not a per-VM basis. It would require a 'struct kvm *' in
> > 'struct kvm_memory_slot', but that's not a huge deal.
> >
> > That way, KVM's notifier callback already knows the memslot and can
> > compute overlap between the memslot and the range by reversing the math
> > done by kvm_memfd_get_pfn(). Then, armed with the gfn and slot,
> > invalidation is just a matter of constructing a struct kvm_gfn_range and
> > invoking kvm_unmap_gfn_range().
>
> The KVM side is easy, but the kernel bits would be difficult: the kernel
> has to maintain an fd+offset to memslot mapping because one fd can have
> multiple memslots, and it needs to decide which memslot to notify.

No, the kernel side maintains an opaque pointer like it does today; KVM handles
reverse engineering the memslot to get the offset and whatever else it needs.
notify_fallocate() and other callbacks are unchanged, though they probably can
drop the inode.

E.g. likely with bad math and handwaving on the overlap detection:

int kvm_private_fd_fallocate_range(void *owner, pgoff_t start, pgoff_t end)
{
	struct kvm_memory_slot *slot = owner;
	struct kvm_gfn_range gfn_range = {
		.slot      = slot,
		.start     = (start - slot->private_offset) >> PAGE_SHIFT,
		.end       = (end - slot->private_offset) >> PAGE_SHIFT,
		.may_block = true,
	};

	if (!has_overlap(slot, start, end))
		return 0;

	gfn_range.end = min(gfn_range.end, slot->base_gfn + slot->npages);

	kvm_unmap_gfn_range(slot->kvm, &gfn_range);
	return 0;
}
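
The overlap detection that the sketch above handwaves can be modeled in plain userspace C. This is only an illustration under stated assumptions, not KVM code: struct slot_model and struct gfn_range_model are hypothetical stand-ins for struct kvm_memory_slot and struct kvm_gfn_range, the slot's file offset is assumed to be kept in pages (private_pgoff) rather than bytes, and [start, end) is assumed to be a half-open range of file page offsets. The helper clamps the invalidated range to the slot first, then rebases it onto base_gfn.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct kvm_memory_slot. */
struct slot_model {
	uint64_t base_gfn;	/* first guest frame number backed by the slot */
	uint64_t npages;	/* slot size in pages */
	uint64_t private_pgoff;	/* assumed: slot's offset into the file, in pages */
};

/* Hypothetical stand-in for the start/end fields of struct kvm_gfn_range. */
struct gfn_range_model {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
};

static inline uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }
static inline uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/*
 * Translate the file page range [start, end) into the gfn range it covers
 * within @slot. Returns false and leaves @out untouched when the file range
 * does not intersect the slot.
 */
bool slot_pgoff_to_gfn_range(const struct slot_model *slot,
			     uint64_t start, uint64_t end,
			     struct gfn_range_model *out)
{
	uint64_t slot_start = slot->private_pgoff;
	uint64_t slot_end = slot_start + slot->npages;

	/* Intersect the invalidated file range with the slot's file range. */
	uint64_t lo = max_u64(start, slot_start);
	uint64_t hi = min_u64(end, slot_end);

	if (lo >= hi)
		return false;

	/* Rebase from file page offsets to guest frame numbers. */
	out->start = slot->base_gfn + (lo - slot_start);
	out->end = slot->base_gfn + (hi - slot_start);
	return true;
}
```

In this model, a callback shaped like kvm_private_fd_fallocate_range() would return early when the helper reports no overlap and otherwise hand the resulting gfn range to the unmap path. Clamping before subtracting avoids unsigned underflow when the invalidated range begins before the slot's offset in the file.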