Re: [Qemu-devel] approaches to 3D virtualisation
From: Paul Brook
Subject: Re: [Qemu-devel] approaches to 3D virtualisation
Date: Mon, 14 Dec 2009 12:03:38 +0000
User-agent: KMail/1.12.2 (Linux/2.6.31-1-amd64; KDE/4.3.4; x86_64; ; )
On Saturday 12 December 2009, Dave Airlie wrote:
> So I've been musing on the addition of some sort of 3D passthrough for
> qemu (as I'm sure have lots of ppl)
IIUC a typical graphics system consists of several operations:
1) Allocate space for data objects[2] on the server[1].
2) Upload data from the client to the server.
3) Issue data processing commands that manipulate (combine/modify) data
objects. For example, a 3D rasterizer takes an image and sets of coordinates,
and writes pixels to an image.
4) Display the data object to the user.
5) Read data back to the client. In modern systems this should almost never
happen.
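The five operations above can be sketched as a minimal server-side object
store. This is purely illustrative (the class and method names are invented,
not any real API), but it makes the shape of the interface concrete:

```python
# Hypothetical sketch of the five pipeline operations. "Server" here is
# used in the X sense: the user's terminal, not the guest.

class GraphicsServer:
    def __init__(self):
        self.objects = {}       # handle -> object storage
        self.next_handle = 0

    # (1) Allocate space for a data object (synchronous: may fail).
    def alloc(self, size):
        handle = self.next_handle
        self.next_handle += 1
        self.objects[handle] = bytearray(size)
        return handle

    # (2) Upload data from client to server.
    def upload(self, handle, data):
        self.objects[handle][:len(data)] = data

    # (3) Process: combine/modify objects, e.g. rasterize src into dst.
    def process(self, dst, src, op):
        self.objects[dst] = op(self.objects[src])

    # (4) Display a data object to the user.
    def display(self, handle):
        pass  # present self.objects[handle] on screen

    # (5) Read data back to the client (rare; forces synchronisation).
    def readback(self, handle):
        return bytes(self.objects[handle])
```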
I'd expect this to be the same for both 2D and 3D subsystems. The only real
wart is that some 2D systems do not provide sufficient offload, and some
processing is still done by the guest CPU. This means (5) is common, and
you're effectively limited to a local implementation.
With remote rendering the main difference is that you have a relatively high
latency connection between client and server. If you have more than a few
round-trips per frame you probably aren't going to get acceptable performance.
IIUC this is why remote X connections perform so poorly: the protocol is
effectively synchronous, so the client must wait for a response from the
server before sending the next command.
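The arithmetic here is worth spelling out: synchronous round trips put a hard
ceiling on frame rate that no amount of GPU power can lift. A back-of-the-
envelope sketch (numbers chosen for illustration only):

```python
# Protocol latency alone bounds the frame rate: each synchronous
# round trip costs a full RTT before the next command can be sent.

def max_fps(rtt_ms, round_trips_per_frame):
    """Upper bound on frames/sec imposed purely by round-trip latency."""
    frame_ms = rtt_ms * round_trips_per_frame
    return 1000.0 / frame_ms

# Even on a LAN with a 5 ms round-trip time, ten synchronous requests
# per frame already cap you at 20 fps, however fast the GPU is.
print(max_fps(5, 10))   # 20.0
```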
In practical terms this means that the state of the graphics pipeline should
not be guest visible. Considering the above pipeline, the only place where
guest state is visible is (5). I'd expect that this almost never happens in
normal circumstances. The fact the SLI/Crossfire setups can operate in AFR
mode supports this theory.
One prerequisite for isolating the graphics pipeline is that commands may not
fail. I guess this may require step (1) be a synchronous operation. However
steps (2), (3) and (4) should be fire-and-forget operations.
If step (2) is defined as completing any time between the issue of the upload
and the actual use then this allows both local zero-copy and remote explicit-
upload implementations.
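One way to picture the completion model described above: the client queues
uploads and rendering commands fire-and-forget, and only the synchronous
operations (allocate and readback) drain the queue. A rough sketch, with all
names invented for illustration:

```python
# Sketch: (2), (3) and (4) are queued fire-and-forget; only (1) alloc
# and (5) readback force a flush over the possibly high-latency
# transport. A remote backend would flush explicitly; a local backend
# could satisfy uploads zero-copy.

class RemoteGPU:
    """Stand-in for the server side: a plain dict of object handles."""
    def __init__(self):
        self.objects = {}

class BatchingClient:
    def __init__(self, server):
        self.server = server
        self.pending = []          # queued fire-and-forget commands
        self.next_handle = 0

    def alloc(self, size):         # (1) synchronous: may fail
        self._flush()
        h = self.next_handle
        self.next_handle += 1
        self.server.objects[h] = bytearray(size)
        return h

    def upload(self, handle, data):
        # (2) completes any time between issue and first use
        self.pending.append(lambda: self._do_upload(handle, data))

    def readback(self, handle):    # (5) synchronous: drain the queue
        self._flush()
        return bytes(self.server.objects[handle])

    def _do_upload(self, handle, data):
        self.server.objects[handle][:len(data)] = data

    def _flush(self):
        for cmd in self.pending:
            cmd()
        self.pending.clear()
```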
A protocol that meets these requirements should be largely transport agnostic.
While a full paravirtual interface may be desirable to squeeze the last bits
of performance out, it should be possible to get acceptable performance over
e.g. TCP, in the same way that the main benefit of virtio block/net drivers is
simplicity and consistency rather than actual performance[3].
My understanding is that Chromium effectively implements the system described
above, and I guess the VirtualBox implementation is just a custom transport
backend and some modesetting tweaks. I have no specific knowledge of the
VMware implementation.
Once you have remote rendering the next problem is hotplug.
IMO transparently migrating state is not a realistic option. It would
effectively require mirroring all of the server data on the guest. For source
data (i.e.
textures) this is fairly trivial. However for intermediate images (think
redirected rendering of a 3D application window in a composited environment)
this is not feasible. You could try to record the commands used to generate
all intermediate data, but this too is infeasible: command sequences may be
large, and the original source data may no longer be available.
Instead I suggest adding some sort of "damage" notification whereby the server
can inform the client that data objects have been lost. When hotplug
(switching to a different terminal) occurs we immediately complete all pending
commands and report that all objects have been lost. The guest should then re-
upload and regenerate as necessary and proceed to render the next frame. I'd
expect that clients already have logic to do this as part of the VRAM handling
for local video cards.
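The damage path might look something like the following on the guest side
(a sketch under the assumptions above; every name is hypothetical): the guest
keeps a way to regenerate each object from source data, and on an
all-objects-lost notification simply re-uploads before rendering the next
frame.

```python
# Hypothetical guest-side handling of the proposed "damage"
# notification: forget lost handles, regenerate lazily at frame start.

class GuestDriver:
    def __init__(self):
        self.sources = {}     # handle -> callback that re-uploads it
        self.valid = set()    # handles believed live on the server

    def create(self, handle, regenerate):
        self.sources[handle] = regenerate
        self.valid.add(handle)

    def on_damage(self, lost_handles):
        """Server reports these objects were lost (e.g. terminal switch)."""
        self.valid -= set(lost_handles)

    def begin_frame(self):
        """Re-upload/regenerate anything the server no longer holds."""
        for h, regenerate in self.sources.items():
            if h not in self.valid:
                regenerate()       # e.g. re-upload texture, re-render image
                self.valid.add(h)
```

This mirrors the lost-VRAM handling drivers already do for local cards, which
is why the guest-side cost should be modest.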
As long as the guest never tries to read back the image, a null implementation
is also trivial.
Obviously all this is predicated on having a virtual display driver in the
guest.
For simple framebuffer devices, and actual VGA hardware, our initial premise
that GPU state is not guest visible fails. In practice this means that there's
little scope for doing remote server-side acceleration, and you're reduced to
implementing everything in the client and trying to optimize (2).
Paul
[1] I'm using X client/server terminology. The client is the guest OS and the
server is the user's terminal.
[2] Data objects include textures/bitmaps, vertex buffers, fragment programs,
and probably command buffers.
[3] Obviously if you emulate lame hardware like ne2k or IDE then performance
will suck. However, emulation of a high-end NIC or SCSI HBA should get within
spitting distance of virtio.