[Qemu-commits] [qemu/qemu] b255b2: util: add cacheinfo
From: GitHub
Subject: [Qemu-commits] [qemu/qemu] b255b2: util: add cacheinfo
Date: Thu, 22 Jun 2017 03:33:58 -0700
Branch: refs/heads/master
Home: https://github.com/qemu/qemu
Commit: b255b2c8a5484742606e8760870ba3e14d0c9605
https://github.com/qemu/qemu/commit/b255b2c8a5484742606e8760870ba3e14d0c9605
Author: Emilio G. Cota <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M include/qemu/osdep.h
M tcg/ppc/tcg-target.inc.c
M util/Makefile.objs
A util/cacheinfo.c
Log Message:
-----------
util: add cacheinfo
Add helpers to gather cache info from the host at init-time.
For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.
Suggested-by: Richard Henderson <address@hidden>
Suggested-by: Geert Martin Ijewski <address@hidden>
Tested-by: Geert Martin Ijewski <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
[rth: Move all implementations from tcg/ppc/]
Signed-off-by: Richard Henderson <address@hidden>
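As a rough illustration of what gathering the host's I/D cache line sizes at init time can look like, here is a minimal sketch. It is not the code in util/cacheinfo.c: the helper names and the 64-byte fallback are assumptions made for this example, and it only covers hosts whose libc exposes the glibc-style _SC_LEVEL1_*_LINESIZE sysconf names.

```c
#include <unistd.h>

/* Assumed fallback for hosts that cannot report a line size. */
#define SKETCH_FALLBACK_LINESIZE 64

/* Hypothetical helper: query the L1 instruction cache line size. */
static long sketch_icache_linesize(void)
{
#ifdef _SC_LEVEL1_ICACHE_LINESIZE
    long v = sysconf(_SC_LEVEL1_ICACHE_LINESIZE);
    if (v > 0) {
        return v;
    }
#endif
    return SKETCH_FALLBACK_LINESIZE;
}

/* Hypothetical helper: query the L1 data cache line size. */
static long sketch_dcache_linesize(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    long v = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (v > 0) {
        return v;
    }
#endif
    return SKETCH_FALLBACK_LINESIZE;
}
```

A caller would query both values once at startup and cache them, since sysconf may report 0 or -1 when the size is unknown and the fallback then applies.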
Commit: 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2
https://github.com/qemu/qemu/commit/6e3b2bfd6af488a896f7936e99ef160f8f37e6f2
Author: Emilio G. Cota <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M include/exec/tb-context.h
M tcg/tcg.c
M tcg/tcg.h
M translate-all.c
Log Message:
-----------
tcg: allocate TB structs before the corresponding translated code
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.
An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).
Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:
https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
Subject: Re: GSoC 2017 Proposal: TCG performance enhancements
Message-ID: <address@hidden>
Suggested-by: Richard Henderson <address@hidden>
Reviewed-by: Pranith Kumar <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
[rth: Simplify the arithmetic in tcg_tb_alloc]
Signed-off-by: Richard Henderson <address@hidden>
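The placement described in this log message can be sketched as a bump allocator over the code buffer: round the current pointer up to a cache line for the TB struct, then round up again past the struct so the translated code starts on its own line. The type and function names below (SketchTB, SketchCodeBuf, sketch_tb_alloc) are hypothetical stand-ins, not the definitions in tcg/tcg.c, and the line size is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much-reduced stand-in for a TB descriptor. */
typedef struct {
    uintptr_t pc;
    uint32_t flags;
    uint16_t size;
} SketchTB;

/* Hypothetical view of the code buffer. */
typedef struct {
    uint8_t *ptr;     /* next free byte in the buffer */
    uint8_t *end;     /* one past the last usable byte */
    size_t linesize;  /* host cache line size, power of two */
} SketchCodeBuf;

static uintptr_t sketch_round_up(uintptr_t x, uintptr_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Reserve a cache-line-aligned TB struct, then leave buf->ptr at the
 * next cache line so the translated code follows the struct directly.
 * Returns NULL when the buffer is exhausted; a caller would then
 * flush the code cache and retry.
 */
static SketchTB *sketch_tb_alloc(SketchCodeBuf *buf)
{
    uintptr_t tb = sketch_round_up((uintptr_t)buf->ptr, buf->linesize);
    uintptr_t code = sketch_round_up(tb + sizeof(SketchTB), buf->linesize);

    if (code > (uintptr_t)buf->end) {
        return NULL;
    }
    buf->ptr = (uint8_t *)code;
    return (SketchTB *)tb;
}
```

The two round-ups are where the padding mentioned above comes from: each TB costs up to one line of slack before the struct and up to one line after it.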
Commit: 2b48e10f888059a98043b4816769fa2a326a1d2c
https://github.com/qemu/qemu/commit/2b48e10f888059a98043b4816769fa2a326a1d2c
Author: Emilio G. Cota <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M translate-all.c
Log Message:
-----------
translate-all: consolidate tb init in tb_gen_code
We are partially initializing tb in tb_alloc. Instead, fully
initialize it in tb_gen_code, which is tb_alloc's only caller.
This saves an unnecessary write to tb->cflags.
Signed-off-by: Emilio G. Cota <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
Commit: cc74d332ff9a78684374847375ef63fc4bd10436
https://github.com/qemu/qemu/commit/cc74d332ff9a78684374847375ef63fc4bd10436
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M tcg/aarch64/tcg-target.inc.c
Log Message:
-----------
tcg/aarch64: Use ADR in tcg_out_movi
The new placement of the TB means that we can use one insn
to load the return value for exit_tb returning the TB pointer.
Tested-by: Emilio G. Cota <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
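For context, the AArch64 ADR instruction forms a PC-relative address from a 21-bit signed byte offset, so it reaches targets within roughly 1 MiB either side of the instruction. With the TB struct placed just before its code, the TB pointer will normally be in range. A minimal sketch of the range test follows; the function name is hypothetical and this is not the code in tcg/aarch64/tcg-target.inc.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can `target` be materialized by a single ADR
 * located at `insn_addr`? ADR encodes a signed 21-bit byte offset,
 * i.e. a displacement in [-2^20, 2^20).
 */
static bool sketch_fits_adr(uintptr_t insn_addr, uintptr_t target)
{
    intptr_t disp = (intptr_t)(target - insn_addr);
    intptr_t limit = (intptr_t)1 << 20;

    return disp >= -limit && disp < limit;
}
```

When the check fails, a backend would fall back to a longer sequence for loading the constant.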
Commit: 3fb53fb4d12f2e7833bd1659e6013237b130ef20
https://github.com/qemu/qemu/commit/3fb53fb4d12f2e7833bd1659e6013237b130ef20
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M include/exec/exec-all.h
M tcg/arm/tcg-target.inc.c
Log Message:
-----------
tcg/arm: Use indirect branch for goto_tb
Signed-off-by: Richard Henderson <address@hidden>
Commit: acb0b292b6d0f49972dc98f742e79ed53973e438
https://github.com/qemu/qemu/commit/acb0b292b6d0f49972dc98f742e79ed53973e438
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M translate-all.c
Log Message:
-----------
tcg/arm: Remove limit on code buffer size
Since we're no longer using a direct branch, we have no
limit on the branch distance.
Signed-off-by: Richard Henderson <address@hidden>
Commit: 9c39b94f1448770e7e573e9516d2483816785d1b
https://github.com/qemu/qemu/commit/9c39b94f1448770e7e573e9516d2483816785d1b
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M tcg/arm/tcg-target.inc.c
Log Message:
-----------
tcg/arm: Try pc-relative addresses for movi
Signed-off-by: Richard Henderson <address@hidden>
Commit: 308714e6bc945389c64faf1b9213e2c0d3f03391
https://github.com/qemu/qemu/commit/308714e6bc945389c64faf1b9213e2c0d3f03391
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M tcg/arm/tcg-target.inc.c
Log Message:
-----------
tcg/arm: Use ldr (literal) for goto_tb
The new placement of the TB means that we can use one insn
to load the goto_tb destination directly from the TB.
Signed-off-by: Richard Henderson <address@hidden>
Commit: b97a879de980e99452063851597edb98e7e8039c
https://github.com/qemu/qemu/commit/b97a879de980e99452063851597edb98e7e8039c
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M tcg-runtime.c
Log Message:
-----------
tcg: Increase hit rate of lookup_tb_ptr
We can call tb_htable_lookup even when the tb_jmp_cache is completely
empty. Therefore, un-nest most of the code dependent on tb != NULL
from the read from the cache.
This improves the hit rate of lookup_tb_ptr; for instance, when booting
and immediately shutting down debian-arm, the hit rate improves from
93.2% to 99.4%.
Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Emilio G. Cota <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
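The un-nesting described here can be sketched as follows: the slow lookup runs whenever the fast cache misses, including when the cache slot is empty, and the result is stored back into the cache. All names below are hypothetical, and a linear search stands in for tb_htable_lookup; this is an illustration of the control flow, not the code in tcg-runtime.c.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_JMP_CACHE_SIZE 16

/* Hypothetical, much-reduced TB descriptor. */
typedef struct {
    uint64_t pc;   /* guest PC the block was translated from */
    void *code;    /* host address of the translated code */
} SketchTB;

/* Hypothetical per-CPU state. */
typedef struct {
    SketchTB *jmp_cache[SKETCH_JMP_CACHE_SIZE]; /* direct-mapped fast cache */
    SketchTB *all;    /* every known TB; stand-in for the hash table */
    size_t n;         /* number of entries in `all` */
    void *epilogue;   /* returned when no TB matches */
} SketchCPU;

/* Stand-in for the slow hash-table lookup. */
static SketchTB *sketch_slow_lookup(SketchCPU *cpu, uint64_t pc)
{
    for (size_t i = 0; i < cpu->n; i++) {
        if (cpu->all[i].pc == pc) {
            return &cpu->all[i];
        }
    }
    return NULL;
}

static void *sketch_lookup_tb_ptr(SketchCPU *cpu, uint64_t pc)
{
    unsigned idx = pc % SKETCH_JMP_CACHE_SIZE;
    SketchTB *tb = cpu->jmp_cache[idx];

    /* An empty slot and a mismatching slot both go to the slow path. */
    if (tb == NULL || tb->pc != pc) {
        tb = sketch_slow_lookup(cpu, pc);
        if (tb == NULL) {
            return cpu->epilogue;
        }
        cpu->jmp_cache[idx] = tb;
    }
    return tb->code;
}
```

Treating an empty slot as an ordinary miss is what raises the hit rate: a TB that already exists is found and cached instead of forcing a return to the main loop.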
Commit: 54e1d4ed1dcdae27b8c02575c155c26434579485
https://github.com/qemu/qemu/commit/54e1d4ed1dcdae27b8c02575c155c26434579485
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M target/alpha/translate.c
Log Message:
-----------
target/alpha: Use tcg_gen_lookup_and_goto_ptr
Tested-by: Emilio G. Cota <address@hidden>
Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
Commit: 542f70c22edd22367373b4cb34d3c478f1ac7c0f
https://github.com/qemu/qemu/commit/542f70c22edd22367373b4cb34d3c478f1ac7c0f
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M target/s390x/translate.c
Log Message:
-----------
target/s390x: Exit after changing PSW mask
Exit to cpu loop so we reevaluate cpu_s390x_hw_interrupts.
Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
Commit: 8da54b2507c1cabf60c2de904cf0383b23239231
https://github.com/qemu/qemu/commit/8da54b2507c1cabf60c2de904cf0383b23239231
Author: Richard Henderson <address@hidden>
Date: 2017-06-19 (Mon, 19 Jun 2017)
Changed paths:
M target/arm/translate-a64.c
Log Message:
-----------
target/arm: Exit after clearing aarch64 interrupt mask
Exit to cpu loop so we reevaluate cpu_arm_hw_interrupts.
Tested-by: Emilio G. Cota <address@hidden>
Tested-by: Alex Bennée <address@hidden>
Reviewed-by: Emilio G. Cota <address@hidden>
Reviewed-by: Alex Bennée <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>
Commit: db7a99cdc1d0f4d8cbf7c41ce9e570dce04f0a11
https://github.com/qemu/qemu/commit/db7a99cdc1d0f4d8cbf7c41ce9e570dce04f0a11
Author: Peter Maydell <address@hidden>
Date: 2017-06-22 (Thu, 22 Jun 2017)
Changed paths:
M accel/tcg/translate-all.c
M include/exec/exec-all.h
M include/exec/tb-context.h
M include/qemu/osdep.h
M target/alpha/translate.c
M target/arm/translate-a64.c
M target/s390x/translate.c
M tcg/aarch64/tcg-target.inc.c
M tcg/arm/tcg-target.inc.c
M tcg/ppc/tcg-target.inc.c
M tcg/tcg-runtime.c
M tcg/tcg.c
M tcg/tcg.h
M util/Makefile.objs
A util/cacheinfo.c
Log Message:
-----------
Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170619' into staging
Queued TCG patches
# gpg: Signature made Mon 19 Jun 2017 19:12:06 BST
# gpg: using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <address@hidden>"
# gpg: aka "Richard Henderson <address@hidden>"
# gpg: aka "Richard Henderson <address@hidden>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B
* remotes/rth/tags/pull-tcg-20170619:
target/arm: Exit after clearing aarch64 interrupt mask
target/s390x: Exit after changing PSW mask
target/alpha: Use tcg_gen_lookup_and_goto_ptr
tcg: Increase hit rate of lookup_tb_ptr
tcg/arm: Use ldr (literal) for goto_tb
tcg/arm: Try pc-relative addresses for movi
tcg/arm: Remove limit on code buffer size
tcg/arm: Use indirect branch for goto_tb
tcg/aarch64: Use ADR in tcg_out_movi
translate-all: consolidate tb init in tb_gen_code
tcg: allocate TB structs before the corresponding translated code
util: add cacheinfo
Signed-off-by: Peter Maydell <address@hidden>
Compare: https://github.com/qemu/qemu/compare/8dfaf23ae1f2...db7a99cdc1d0