From: Arnabjyoti Kalita
Subject: Re: How does QEMU in TCG mode handle interrupts ?
Date: Sun, 13 Sep 2020 09:41:06 +0530
On Tue, 25 Aug 2020 at 06:53, Arnabjyoti Kalita
<akalita@cs.stonybrook.edu> wrote:
> This makes sense. In this scenario, when QEMU takes an interrupt at the end of a TB, I understand that the TB execution will not happen.
When QEMU takes an interrupt at the *end* of a TB, that TB
has already executed. (Really we take interrupts between
TBs, so at the end of one and the start of the next.)
To be a bit more precise, the way this works is that there's
a loop in cpu_exec():
    while (!cpu_handle_exception(...)) {
        while (!cpu_handle_interrupt(...)) {
            tb = tb_find(...);
            cpu_loop_exec_tb(...);
        }
    }
where cpu_loop_exec_tb() will start the TB, which might then
chain forward to the next TB without returning to this C loop.
So the first bit of generated code in each TB checks a flag
to see if there's a pending interrupt that means it should
just exit back to the C code loop without executing the TB
(we'll then call cpu_handle_interrupt() to process it).
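To make the "check a flag at the start of each TB" idea concrete, here is a
toy model of it. This is only a sketch and not QEMU code: ToyCPU, the toy_*
functions, the interrupt vector and the fake "device" that raises an IRQ
after a fixed number of TBs are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy CPU state; far simpler than QEMU's real CPU state. */
typedef struct {
    unsigned pc;              /* guest PC of the next TB to run */
    bool irq_pending;         /* flag checked at the start of every TB */
    unsigned raise_irq_after; /* toy "device": raise an IRQ once this many TBs ran */
    unsigned tbs_executed;
    unsigned irqs_taken;
    unsigned pc_at_irq;       /* guest PC that was interrupted */
} ToyCPU;

enum { TOY_IRQ_VECTOR = 0x100, TOY_TB_BYTES = 4 };

/* Stand-in for cpu_handle_interrupt(): deliver a pending IRQ between TBs. */
static void toy_handle_interrupt(ToyCPU *cpu)
{
    if (!cpu->irq_pending) {
        return;
    }
    cpu->irq_pending = false;
    cpu->irqs_taken++;
    cpu->pc_at_irq = cpu->pc;   /* always a TB boundary, never mid-TB */
    cpu->pc = TOY_IRQ_VECTOR;
}

/* Stand-in for cpu_loop_exec_tb(): run TBs back to back ("chained"). */
static void toy_exec_tb_chain(ToyCPU *cpu, unsigned tb_budget)
{
    while (cpu->tbs_executed < tb_budget) {
        /* First thing in each TB: leave without executing if the flag is set. */
        if (cpu->irq_pending) {
            return;
        }
        /* TB body: once started, it runs to completion. */
        cpu->tbs_executed++;
        cpu->pc += TOY_TB_BYTES;
        /* The toy device fires "asynchronously" while this TB was running. */
        if (cpu->tbs_executed == cpu->raise_irq_after) {
            cpu->irq_pending = true;
        }
    }
}

/* Stand-in for the inner loop of cpu_exec(); returns the number of IRQs taken. */
unsigned toy_cpu_exec(ToyCPU *cpu, unsigned tb_budget)
{
    assert(cpu != NULL);
    while (cpu->tbs_executed < tb_budget) {
        toy_handle_interrupt(cpu);
        toy_exec_tb_chain(cpu, tb_budget);
    }
    return cpu->irqs_taken;
}
```

Starting from pc = 0x1000 with raise_irq_after = 3 and a budget of 5 TBs,
the third TB runs to completion, the fourth one bails out at its entry
check, and the interrupt is taken with an interrupted PC of 0x100c, i.e.
exactly on a TB boundary.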
> The interrupt will be taken and then the same TB will be
> re-translated again and later executed, right ?
We won't re-translate a TB just because it returned without
executing. We will likely re-execute it later after the guest
returns from the interrupt, but that depends on guest behaviour.
(There is no special casing for this: we just implement "return
from interrupt means we set the guest PC and other registers
as the guest CPU architecture specifies"; then when we find
the first TB after that, it will turn out to be the same TB
we abandoned execution of, if the guest made the return-from-interrupt
go back to the same place it left off. But if the guest did
something clever, e.g. an OS context switch after the interrupt
handler runs, then the guest PC will be something else and we'll
execute from there.)
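As a rough illustration of why no re-translation is needed, here is a toy
TB cache. It is a sketch under simplifying assumptions: ToyTB, ToyTBCache
and toy_tb_find() are invented names, and the lookup is keyed only on the
guest PC, whereas a real lookup has to consider more CPU state than that.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_CACHE_SIZE = 8 };

/* Hypothetical translated block: only remembers which guest PC it is for. */
typedef struct {
    unsigned guest_pc;
} ToyTB;

/* Hypothetical TB cache with a counter so translations can be observed. */
typedef struct {
    ToyTB tbs[TOY_CACHE_SIZE];
    size_t count;
    unsigned translations;
} ToyTBCache;

/* Return the cached TB for this guest PC, "translating" only on a miss. */
ToyTB *toy_tb_find(ToyTBCache *cache, unsigned guest_pc)
{
    for (size_t i = 0; i < cache->count; i++) {
        if (cache->tbs[i].guest_pc == guest_pc) {
            return &cache->tbs[i];      /* hit: reuse the existing TB */
        }
    }
    /* A real cache would flush or evict; the toy one just has a fixed size. */
    assert(cache->count < TOY_CACHE_SIZE);
    ToyTB *tb = &cache->tbs[cache->count++];
    tb->guest_pc = guest_pc;
    cache->translations++;
    return tb;
}
```

If the abandoned TB was at 0x1000 and the handler lives at 0x100, then
looking up 0x1000, then 0x100, then 0x1000 again performs two translations
and the third lookup returns the same TB as the first. A context switch
that resumes somewhere else (say 0x2000) is simply a lookup for a different
PC.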
> If so, does this methodology apply for all kinds of interrupts,
> hardware/software/faults/traps/vmexits etc. ?
You can divide these things into three categories:
* things which affect the emulated CPU and which are
asynchronously triggered by other parts of the system --
most notably, device interrupts. These are handled as I
describe above, with flags that tell the guest CPU to stop
executing at the next convenient point so it can deal with
the interrupt. QEMU calls all these things "interrupts"
in its generic code.
* "expected" exceptions: things like software traps where
a syscall instruction, for instance, will always cause an
exception. These we handle by having the generated code
for the instruction just be "raise an exception". (This is
generally 'write the current PC etc back to the CPU state
struct, call a C helper which sets some state about what
kind of exception this is and then calls cpu_loop_exit(),
which does a siglongjmp() out to cpu_exec() so it can do
the "cpu_handle_exception()" call immediately'.) We also
handle cases like "undefined instruction" and "tried to
exec an FPU instruction but the FPU is disabled" this way.
You can deal with "usually happens" exceptions
like this too -- if a helper you call turns out not to
raise an exception, that's fine, it just means you updated
the PC unnecessarily, which is slightly inefficient but not
a big deal in most cases.
* "unexpected" exceptions: the primary example is a fault on
a data load/store insn. This is similar to the "expected"
case, except that we want to avoid the overhead of updating
the CPU state struct with the current PC because almost always
loads and stores succeed and don't take exceptions, and
loads and stores are really common so we care about avoiding
the extra insns that update the PC. So instead, when QEMU
detects that a load/store failed it will call cpu_restore_state().
This uses the *host* PC where the fault was detected (which
will be somewhere inside the generated code for the TB),
plus metadata that was generated and stored alongside the TB
when we first translated it, to identify "for this host PC,
what is the guest PC (plus any other per-target info we need)",
and then we can fix up the CPU state using that info. Once
the CPU state has been fixed up we call cpu_loop_exit() and
the rest is like the "expected" exception case.
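The difference between the "expected" and "unexpected" cases can be
sketched in a toy model like the one below. Everything here is an
assumption made for illustration: the Toy* types, the per-insn table of
host code offsets (the real metadata format is different), and plain
setjmp()/longjmp() standing in for the sigsetjmp()/siglongjmp() pair.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { TOY_EXCP_NONE = -1, TOY_EXCP_SYSCALL = 1, TOY_EXCP_DATA_FAULT = 2 };
enum { TOY_MAX_INSNS = 4, TOY_INSN_BYTES = 4, TOY_FAULT_HOST_OFFSET = 20 };

/* Hypothetical CPU state struct. */
typedef struct {
    unsigned pc;            /* guest PC as stored in the state struct */
    int exception_index;
    jmp_buf jmp_env;        /* where the toy cpu_loop_exit() jumps back to */
} ToyExcCPU;

/* Hypothetical per-TB metadata recorded at translation time:
 * host_end_offset[i] is the first host code offset past guest insn i. */
typedef struct {
    unsigned guest_pc_start;
    size_t num_insns;
    size_t host_end_offset[TOY_MAX_INSNS];
} ToyTB;

/* Toy cpu_loop_exit(): abandon the TB and return to the main loop. */
static void toy_loop_exit(ToyExcCPU *cpu)
{
    longjmp(cpu->jmp_env, 1);
}

/* Toy cpu_restore_state(): map a host code offset back to a guest PC. */
static unsigned toy_guest_pc_for_host_offset(const ToyTB *tb, size_t host_offset)
{
    assert(tb->num_insns > 0 && tb->num_insns <= TOY_MAX_INSNS);
    assert(host_offset < tb->host_end_offset[tb->num_insns - 1]);
    size_t i = 0;
    while (i + 1 < tb->num_insns && host_offset >= tb->host_end_offset[i]) {
        i++;
    }
    return tb->guest_pc_start + (unsigned)i * TOY_INSN_BYTES;
}

/* Run one toy TB; returns the exception index seen by the main loop. */
int toy_run(ToyExcCPU *cpu, const ToyTB *tb, int kind)
{
    cpu->exception_index = TOY_EXCP_NONE;
    if (setjmp(cpu->jmp_env) == 0) {
        if (kind == TOY_EXCP_SYSCALL) {
            /* Expected: generated code writes the PC back, then raises. */
            cpu->pc = tb->guest_pc_start + 2 * TOY_INSN_BYTES;
            cpu->exception_index = TOY_EXCP_SYSCALL;
            toy_loop_exit(cpu);
        } else if (kind == TOY_EXCP_DATA_FAULT) {
            /* Unexpected: cpu->pc is stale, so recover it from metadata. */
            cpu->pc = toy_guest_pc_for_host_offset(tb, TOY_FAULT_HOST_OFFSET);
            cpu->exception_index = TOY_EXCP_DATA_FAULT;
            toy_loop_exit(cpu);
        }
        /* No exception: the TB ran to its end. */
        cpu->pc = tb->guest_pc_start + (unsigned)tb->num_insns * TOY_INSN_BYTES;
    }
    return cpu->exception_index;
}
```

With a three-insn TB at 0x4000 whose insns end at host offsets 12, 30 and
44, a fault detected at host offset 20 falls inside the second insn, so the
recovered guest PC is 0x4004. The load/store fast path pays nothing for
this; the table lookup only happens when a fault actually occurs.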
In all this I have been ignoring the 'icount' feature, which
when enabled does cause us sometimes to stop 'halfway' through
a TB and re-translate just the first part of it, in order
to cope with I/O. If you care about the details of that you can
look at docs/devel/tcg-icount.rst.
thanks
-- PMM