From: Max Chou
Subject: [RFC PATCH v2 0/6] Improve the performance of RISC-V vector unit-stride/whole register ld/st instructions
Date: Sat, 1 Jun 2024 01:44:47 +0800

Hi,

This RFC patch set attempts to fix the issue reported at
https://gitlab.com/qemu-project/qemu/-/issues/2137.

In this new version, we added patches that try to load/store more data
at a time for some of the contiguous vector load/store instructions
(unit-stride and whole register), under some assumptions (e.g. no
masking, no tail agnostic, virtual address resolution performed once for
the entire vector, etc.), as suggested by Richard Henderson.

This version improves the run time of the test case provided in
https://gitlab.com/qemu-project/qemu/-/issues/2137#note_1757501369 (from
~13.5 sec to ~1.5 sec) in QEMU user mode.

PS: This RFC patch set only covers the vle8.v/vse8.v/vl8re8.v/vs8r.v
instructions. The next version will try to cover the remaining
instructions.

Series based on riscv-to-apply.next branch (commit 1806da7).

Max Chou (6):
  target/riscv: Separate vector segment ld/st instructions
  accel/tcg: Avoid unnecessary call overhead from
    qemu_plugin_vcpu_mem_cb
  target/riscv: Inline vext_ldst_us and corresponding function for
    performance
  target/riscv: Add check_probe_[read|write] helper functions
  target/riscv: rvv: Optimize v[l|s]e8.v with limitations
  target/riscv: rvv: Optimize vl8re8.v/vs8r.v with limitations

 accel/tcg/ldst_common.c.inc             |   8 +-
 target/riscv/helper.h                   |   8 +
 target/riscv/insn32.decode              |  11 +-
 target/riscv/insn_trans/trans_rvv.c.inc | 454 +++++++++++++++++++++++-
 target/riscv/vector_helper.c            | 142 ++++++--
 5 files changed, 591 insertions(+), 32 deletions(-)

--
2.34.1