qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC] Various questions about TCG implementation, DRM patches dealin


From: Peter Maydell
Subject: Re: [RFC] Various questions about TCG implementation, DRM patches dealing with pointers over guest-host barrier.
Date: Mon, 18 May 2020 11:21:51 +0100

On Mon, 18 May 2020 at 00:23, Catherine A. Frederick
<address@hidden> wrote:
> Hi, I've been patching TCG for my own purposes recently and I was
> wondering a few things. That being:
>
> - Is the TCG backend expected to handle bad cases for instructions? I
> was wondering as I found a situation where a very large shift constant
> reaches the backend and causes an illegal instruction to be generated.
> Is the frontend expected to clean this up, or is the backend supposed to
> be able to deal with these? I currently patched the bug via clipping the
> shift constant between 0 and 64.

The semantics of TCG IR ops are described in tcg/README. If the
behaviour described there isn't what the guest requires then
the frontend needs to emit code to handle special cases the way
the guest architecture says they're handled. If a backend doesn't
behave as the IR op spec requires then the backend is buggy. For shifts
in particular (shl/shr/sar/rotl/rotr) the IR op has "unspecified
behaviour" for out of range shift values: this is required to
not crash, but the result value is not specified and could be
anything. (The IR spec uses "undefined behaviour" to mean "could
do anything, including crashing. The reason shifts are made to
be only unspecified-behaviour is that it allows a frontend to
emit code that unconditionally does a shift and then uses movcond
or similar to only use the result in the case where the shift is
in range.)

> - I've been implementing an instruction scheduler(list scheduler, with
> priority given to most successors) for TCG and currently if I replace
> instructions in s->ops(the TCG context) I get a crash later in
> tcg_reg_alloc_op, even if the instruction stream is identical. Is there
> anything else I need to move when I do this?

This one's out of my field of knowledge; Richard might know.

> - Is insn_start necessary to have in order(and what does it do?)? These
> currently are serializing instructions in my scheduler and significantly
> limit my reordering as they create lots of dependencies every few
> instructions.

The primary purpose of insn_start is to save information about the
current instruction in a metadata area associated with the generated
TB, so that if the guest takes an unexpected exception (typically,
because a guest load or store faults) then we can restore the CPU
state to what it should be at the point of the fault. This is more
efficient than if we had to emit a "write new PC value to CPU state"
operation at the end of every guest insn. At runtime, at the point
when we determine that we need to generate a guest exception, we know
the host PC (inside the generated code) where the fault occurred.
We look up that PC value in the metadata to determine which guest
PC value that corresponds to [in accel/tcg/translate-all.c
cpu_restore_state_from_tb()], and use that to correctly set the
guest PC value and sometimes other state [by calling the frontend's
restore_state_to_opc() function, passing it the data that the
frontend handed to the insn_start opcode. (Eg on 32-bit arm we use
this mechanism to fix up the condexec bits, and to provide correct
values for the exception syndrome register.) If you wanted to be
able to reorder TCG ops across guest insn boundaries you'd need to
make the "guest-PC-to-insn-start-data" lookup handle that. You'd
also need to ensure that you don't reorder anything that updates
information visible to the guest (a visible effect of the insn)
before anything that could generate an exception in a preceding insn.

insn_start is also handy simply for debugging purposes -- in debug
disassembly output we indicate where instruction boundaries are in
the host generated code and in the TCG IR opcode dump, which makes
it easier to figure out what the generated code was trying to do.

Finally, I haven't checked, but I suspect the new TCG plugin APIs
implicitly assume that code for each insn is generated serially,
ie that a plugin can do "for each instruction" type work on a
callback that hangs off the insn_start op.

> - Is it "okay" to use g2h and h2g directly in code in syscall.c?
> Currently it seems like TYPE_PTRVOID doesn't do this conversion, and as
> such, most of the calls made over the guest-host barrier made by DRM
> seem to fail spectacularly across bittedness lines. I think a more ideal
> solution would be implementing types that do this automatically, so I
> don't have to deal with the difference in struct size using macros, but
> in the short term I don't really have another option.

Usually syscall.c code wants to use lock_user/lock_user_string/
lock_user_struct rather than directly using g2h/h2g. This is
important because the lock_user functions will check whether the
guest can actually access the memory, so that you can return a
suitable error (usually TARGET_EFAULT). Otherwise if the guest
hands you a buffer it doesn't have access to then your g2h-then-access
will probably make QEMU crash.

I think that the reason TYPE_PTRVOID is not converted within a struct
is that almost always the syscall.c wrapper for the ioctl will
need to actively do something to convert the data being pointed
to, which means it will need to call lock_user() on it, which means
it wants the guest address, not the host address. So an IOCTL
definition that uses TYPE_PTRVOID will be an IOCTL_SPECIAL()
which provides a C function to handle that conversion.
If the buffer being pointed to is literally just a byte buffer
of data that needs no conversion, I think you can specify that
with MK_PTR(TYPE_CHAR). If it's anything more complicated then
it's going to need an IOCTL_SPECIAL and a conversion function.

thanks
-- PMM



reply via email to

[Prev in Thread] Current Thread [Next in Thread]