From: Paolo Bonzini
Subject: [PULL 11/48] target/i386: optimize CX handling in repeated string operations
Date: Fri, 24 Jan 2025 10:44:05 +0100
In a repeated string operation, CX/ECX is decremented until it reaches 0
but never underflows. Use this observation to avoid a deposit or
zero-extend operation if the address size of the operation is smaller
than MO_TL.
As in the previous patch, the patch is structured to include some
preparatory work for subsequent changes. In particular, introducing
cx_next prepares for when ECX will be decremented *before* calling
fn(s, ot), and therefore cannot yet be written back to cpu_regs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-11-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/translate.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
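Note (not part of the commit message): the no-underflow argument can be
checked with a small standalone model. This is only a sketch in plain C,
not QEMU code. The names addr_size, writeback_deposit, writeback_fast and
check_case are hypothetical; cx_mask mirrors the shape of the mask computed
in do_gen_rep, and a 64-bit target register is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the address sizes MO_16, MO_32 and MO_64. */
enum addr_size { ADDR_16 = 1, ADDR_32 = 2, ADDR_64 = 3 };

/* Same shape as cx_mask in do_gen_rep: the low (8 << size) bits are set. */
static uint64_t cx_mask(enum addr_size a)
{
    return UINT64_MAX >> (64 - (8u << a));
}

/*
 * Reference write-back: replace only the low bits of the register
 * (a "deposit"), with the x86-64 rule that a 32-bit write clears
 * the upper half of the register.
 */
static uint64_t writeback_deposit(uint64_t rcx, enum addr_size a)
{
    uint64_t mask = cx_mask(a);
    uint64_t low = (rcx - 1) & mask;

    if (a == ADDR_32) {
        return low;
    }
    return (rcx & ~mask) | low;
}

/*
 * Write-back modelled on the patch: subtract on the full register and
 * zero-extend only for 32-bit addressing. This is valid only when the
 * masked count is nonzero, so that no borrow leaves the low bits.
 */
static uint64_t writeback_fast(uint64_t rcx, enum addr_size a)
{
    uint64_t next = rcx - 1;

    if (a == ADDR_32) {
        next = (uint32_t)next;
    }
    return next;
}

/* Both write-backs must agree whenever the masked count is nonzero. */
static void check_case(uint64_t rcx, enum addr_size a)
{
    assert((rcx & cx_mask(a)) != 0);
    assert(writeback_fast(rcx, a) == writeback_deposit(rcx, a));
}
```

With a zero masked count the two functions diverge (for example, rcx =
0x12340000 with ADDR_16 borrows from the upper bits in writeback_fast),
which is why the model relies on the TSTEQ branch to "done" being taken
before the decrement.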
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7a3caf8b996..0a8f3c89514 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1339,6 +1339,7 @@ static void do_gen_rep(DisasContext *s, MemOp ot,
{
TCGLabel *done = gen_new_label();
target_ulong cx_mask = MAKE_64BIT_MASK(0, 8 << s->aflag);
+ TCGv cx_next = tcg_temp_new();
bool had_rf = s->flags & HF_RF_MASK;
/*
@@ -1364,7 +1365,19 @@ static void do_gen_rep(DisasContext *s, MemOp ot,
tcg_gen_brcondi_tl(TCG_COND_TSTEQ, cpu_regs[R_ECX], cx_mask, done);
fn(s, ot);
- gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
+
+ tcg_gen_subi_tl(cx_next, cpu_regs[R_ECX], 1);
+
+ /*
+ * Write back cx_next to CX/ECX/RCX. There can be no carry, so zero
+ * extend if needed but do not do expensive deposit operations.
+ */
+#ifdef TARGET_X86_64
+ if (s->aflag == MO_32) {
+ tcg_gen_ext32u_tl(cx_next, cx_next);
+ }
+#endif
+ tcg_gen_mov_tl(cpu_regs[R_ECX], cx_next);
gen_update_cc_op(s);
/* Leave if REP condition fails. */
--
2.48.1