Suggestions for TCG performance improvements

From: Vasilev Oleg
Subject: Suggestions for TCG performance improvements
Date: Thu, 2 Dec 2021 09:47:13 +0000

Hi everyone,

I've recently been tasked with improving QEMU performance and would like
to discuss several possible optimizations which we could implement and
later upstream.

We ran the sysbench[1] tool in threads mode on a Linux system installed
as an aarch64 guest on an x86_64 host. The QEMU profile flamegraph is
attached to this message. One of the conclusions is that refilling the
TLB takes a substantial amount of time, and we are thinking of ways to
avoid refilling the TLB so often.

I've found some MMU-related suggestions in the 2018 letter[2], and
those still seem to be unimplemented (flush still uses memset[3]).
Do you think we should go forward with implementing those?
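
For illustration only, one common way to avoid a memset-based flush is to
tag every entry with a generation number, so that a flush becomes a single
counter increment. This is a minimal sketch under that assumption; it is
not necessarily what [2] proposes, and ToyTLB, toy_tlb_flush and the other
names are hypothetical, not QEMU's CPUTLB structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_BITS 12
#define TOY_TLB_SIZE  256u   /* hypothetical size, must be a power of two */

typedef struct {
    uint64_t vpage;    /* virtual page number that owns this slot */
    uint64_t payload;  /* stand-in for the cached translation */
    uint32_t gen;      /* generation in which the slot was filled */
} ToyTLBEntry;

typedef struct {
    ToyTLBEntry entry[TOY_TLB_SIZE];
    uint32_t gen;      /* current generation, never 0 */
} ToyTLB;

static void toy_tlb_init(ToyTLB *tlb)
{
    assert((TOY_TLB_SIZE & (TOY_TLB_SIZE - 1)) == 0);
    memset(tlb, 0, sizeof(*tlb));   /* zeroed slots carry gen 0: invalid */
    tlb->gen = 1;
}

/* Flush without touching the entries: stale slots fail the gen check. */
static void toy_tlb_flush(ToyTLB *tlb)
{
    if (++tlb->gen == 0) {
        toy_tlb_init(tlb);          /* counter wrapped: do a real clear */
    }
}

static void toy_tlb_fill(ToyTLB *tlb, uint64_t vaddr, uint64_t payload)
{
    uint64_t vpage = vaddr >> TOY_PAGE_BITS;
    ToyTLBEntry *e = &tlb->entry[vpage & (TOY_TLB_SIZE - 1)];

    e->vpage = vpage;
    e->payload = payload;
    e->gen = tlb->gen;
}

static bool toy_tlb_lookup(const ToyTLB *tlb, uint64_t vaddr,
                           uint64_t *payload)
{
    uint64_t vpage = vaddr >> TOY_PAGE_BITS;
    const ToyTLBEntry *e = &tlb->entry[vpage & (TOY_TLB_SIZE - 1)];

    if (e->gen != tlb->gen || e->vpage != vpage) {
        return false;               /* stale generation or other page */
    }
    *payload = e->payload;
    return true;
}
```

The trade-off in this sketch is that the flush gets cheaper while every
lookup pays for one extra comparison, so whether it helps would have to be
measured on the fast path.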

The mentioned paper[4] also describes other possible improvements.
Some of those are already implemented (such as victim TLB and dynamic
size for TLB), but others are not (e.g. TLB lookup uninlining and
set-associative TLB layer). Do you think those improvements are
worth trying?
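
To make the set-associative idea concrete, here is a minimal sketch of a
4-way set-associative lookup with round-robin replacement. The sizes, the
replacement policy and all names (SATLB, sa_tlb_fill, ...) are assumptions
chosen for illustration; they are taken neither from the paper[4] nor
from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SA_PAGE_BITS 12
#define SA_SETS      64u   /* hypothetical number of sets */
#define SA_WAYS      4u    /* hypothetical associativity */

typedef struct {
    uint64_t vpage;        /* virtual page number */
    uint64_t payload;      /* stand-in for the cached translation */
    bool valid;
} SAEntry;

typedef struct {
    SAEntry way[SA_WAYS];
    uint8_t next_victim;   /* round-robin replacement pointer */
} SASet;

typedef struct {
    SASet set[SA_SETS];
} SATLB;

static void sa_tlb_init(SATLB *tlb)
{
    memset(tlb, 0, sizeof(*tlb));
}

static bool sa_tlb_lookup(const SATLB *tlb, uint64_t vaddr,
                          uint64_t *payload)
{
    uint64_t vpage = vaddr >> SA_PAGE_BITS;
    const SASet *s = &tlb->set[vpage % SA_SETS];

    /* Every way of the selected set has to be checked. */
    for (unsigned w = 0; w < SA_WAYS; w++) {
        if (s->way[w].valid && s->way[w].vpage == vpage) {
            *payload = s->way[w].payload;
            return true;
        }
    }
    return false;
}

static void sa_tlb_fill(SATLB *tlb, uint64_t vaddr, uint64_t payload)
{
    uint64_t vpage = vaddr >> SA_PAGE_BITS;
    SASet *s = &tlb->set[vpage % SA_SETS];
    SAEntry *slot = NULL;

    /* Prefer an existing mapping of the page, then the first free way. */
    for (unsigned w = 0; w < SA_WAYS; w++) {
        if (s->way[w].valid && s->way[w].vpage == vpage) {
            slot = &s->way[w];
            break;
        }
        if (!s->way[w].valid && slot == NULL) {
            slot = &s->way[w];
        }
    }
    if (slot == NULL) {
        /* Set is full: evict in round-robin order. */
        slot = &s->way[s->next_victim];
        s->next_victim = (uint8_t)((s->next_victim + 1) % SA_WAYS);
    }
    slot->vpage = vpage;
    slot->payload = payload;
    slot->valid = true;
}
```

In this sketch, up to four pages that map to the same set can coexist,
whereas a direct-mapped table would evict on every such conflict. The
cost is the loop over the ways on each lookup.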

Another idea for decreasing the occurrence of TLB refills is to make
the TB key in the htable independent of the physical address. I assume
the physical address is only needed to distinguish between different
processes, whose VAs can be the same. Is that assumption correct?
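
The concern behind that question can be sketched as follows. The key
layout below (physical PC, virtual PC, flags) is a simplified assumption,
and ToyTBKey, toy_tb_hash and toy_tb_key_equal are hypothetical names
rather than QEMU's actual hash function or TB comparison. The point is
only that two address spaces which map the same VA to different physical
pages become indistinguishable once the physical address is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t phys_pc;   /* guest physical address of the code */
    uint64_t pc;        /* guest virtual address of the code */
    uint32_t flags;     /* CPU state the translation depends on */
} ToyTBKey;

/* FNV-style mixing step, applied per word rather than per byte. */
static uint64_t toy_mix(uint64_t h, uint64_t v)
{
    h ^= v;
    h *= 0x100000001b3ULL;
    return h;
}

/* Hash a key, optionally ignoring the physical address. */
static uint64_t toy_tb_hash(const ToyTBKey *k, bool use_phys)
{
    uint64_t h = 0xcbf29ce484222325ULL;

    h = toy_mix(h, k->pc);
    h = toy_mix(h, k->flags);
    if (use_phys) {
        h = toy_mix(h, k->phys_pc);
    }
    return h;
}

/* Compare two keys, optionally ignoring the physical address. */
static bool toy_tb_key_equal(const ToyTBKey *a, const ToyTBKey *b,
                             bool use_phys)
{
    if (a->pc != b->pc || a->flags != b->flags) {
        return false;
    }
    return !use_phys || a->phys_pc == b->phys_pc;
}
```

With a VA-only key, something else would have to separate the two address
spaces in this sketch, for example an address-space identifier in the key
or an invalidation when the mapping changes. Both options are assumptions
for discussion, not a description of existing behaviour.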

Do you have any other ideas about which parts of TCG could require our
attention w.r.t. the flamegraph I attached?

I am also CCing my teammates. We are eager to improve the QEMU TCG
performance for our needs and to contribute our patches to upstream.

[1]: https://github.com/akopytov/sysbench
[2]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg562103.html
[4]: https://dl.acm.org/doi/pdf/10.1145/2686034

Attachment: flamegraph.svg
Description: flamegraph.svg

Attachment: callgraph.svg
Description: callgraph.svg
