[PULL 10/18] target/hppa: Optimize UADDCM with no condition

qemu-devel

From:	Richard Henderson
Subject:	[PULL 10/18] target/hppa: Optimize UADDCM with no condition
Date:	Fri, 29 Mar 2024 12:31:03 -1000

UADDCM with r1 as zero is by far the most common usage of the
instruction, as it is the easiest way to invert a register.  The
compiler does occasionally use the addition step as well, and we can
simplify that case to avoid a temp and write directly into the
destination.

Tested-by: Helge Deller <deller@gmx.de>
Reviewed-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
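
Note: the rewrite relies on the two's-complement identity
r1 + ~r2 == r1 - r2 - 1.  The following is a minimal standalone sketch
of that identity in plain C, not part of the patch.  The helper names
(uaddcm_literal, uaddcm_sub_form, uaddcm_not_form, check_uaddcm_forms)
are hypothetical and exist only for illustration; they model the
64-bit unconditional result only, not conditions, nullification, or
the TCG ops used in translate.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the literal form, r1 plus the one's complement of r2. */
static uint64_t uaddcm_literal(uint64_t r1, uint64_t r2)
{
    return r1 + ~r2;
}

/*
 * Hypothetical: the rewritten form.  Since ~r2 == -r2 - 1 in two's
 * complement, r1 + ~r2 == r1 - r2 - 1.  Unsigned arithmetic wraps
 * modulo 2^64, so the identity holds for all inputs.
 */
static uint64_t uaddcm_sub_form(uint64_t r1, uint64_t r2)
{
    return r1 - r2 - 1;
}

/* Hypothetical: the special case where the first operand reads as zero. */
static uint64_t uaddcm_not_form(uint64_t r2)
{
    return ~r2;
}

/* Spot-check a few values, including wraparound at both ends. */
static void check_uaddcm_forms(void)
{
    static const uint64_t v[] = { 0, 1, 3, 5, 0x1234, UINT64_MAX - 1, UINT64_MAX };
    const unsigned n = sizeof(v) / sizeof(v[0]);

    for (unsigned i = 0; i < n; i++) {
        /* With a zero first operand, all three forms agree. */
        assert(uaddcm_literal(0, v[i]) == uaddcm_not_form(v[i]));
        for (unsigned j = 0; j < n; j++) {
            assert(uaddcm_literal(v[i], v[j]) == uaddcm_sub_form(v[i], v[j]));
        }
    }
}
```

Calling check_uaddcm_forms() from a test harness should pass silently.
The zero case in the sketch is keyed on the operand value, whereas the
patch keys it on the register number (a->r1 == 0).
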
 target/hppa/translate.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index a3f425d861..3fc3e7754c 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -2763,9 +2763,29 @@ static bool do_uaddcm(DisasContext *ctx, arg_rrr_cf_d *a, bool is_tc)
 {
     TCGv_i64 tcg_r1, tcg_r2, tmp;
 
-    if (a->cf) {
-        nullify_over(ctx);
+    if (a->cf == 0) {
+        tcg_r2 = load_gpr(ctx, a->r2);
+        tmp = dest_gpr(ctx, a->t);
+
+        if (a->r1 == 0) {
+            /* UADDCM r0,src,dst is the common idiom for dst = ~src. */
+            tcg_gen_not_i64(tmp, tcg_r2);
+        } else {
+            /*
+             * Recall that r1 - r2 == r1 + ~r2 + 1.
+             * Thus r1 + ~r2 == r1 - r2 - 1,
+             * which does not require an extra temporary.
+             */
+            tcg_r1 = load_gpr(ctx, a->r1);
+            tcg_gen_sub_i64(tmp, tcg_r1, tcg_r2);
+            tcg_gen_subi_i64(tmp, tmp, 1);
+        }
+        save_gpr(ctx, a->t, tmp);
+        cond_free(&ctx->null_cond);
+        return true;
     }
+
+    nullify_over(ctx);
     tcg_r1 = load_gpr(ctx, a->r1);
     tcg_r2 = load_gpr(ctx, a->r2);
     tmp = tcg_temp_new_i64();
-- 
2.34.1
