[PATCH 05/17] target/arm: Simplify do_reduction_op

From:	Richard Henderson
Subject:	[PATCH 05/17] target/arm: Simplify do_reduction_op
Date:	Wed, 17 Jul 2024 16:08:51 +1000

Use simple shift and add instead of ctpop, ctz, shift and mask.
Unlike SVE, there is no predicate to disable elements.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-a64.c | 40 +++++++++++-----------------------
 1 file changed, 13 insertions(+), 27 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index e0314a1253..6d2e1a2d80 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -8986,34 +8986,23 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
  * important for correct NaN propagation that we do these
  * operations in exactly the order specified by the pseudocode.
  *
- * This is a recursive function, TCG temps should be freed by the
- * calling function once it is done with the values.
+ * This is a recursive function.
  */
 static TCGv_i32 do_reduction_op(DisasContext *s, int fpopcode, int rn,
-                                int esize, int size, int vmap, TCGv_ptr fpst)
+                                MemOp esz, int ebase, int ecount, TCGv_ptr fpst)
 {
-    if (esize == size) {
-        int element;
-        MemOp msize = esize == 16 ? MO_16 : MO_32;
-        TCGv_i32 tcg_elem;
-
-        /* We should have one register left here */
-        assert(ctpop8(vmap) == 1);
-        element = ctz32(vmap);
-        assert(element < 8);
-
-        tcg_elem = tcg_temp_new_i32();
-        read_vec_element_i32(s, tcg_elem, rn, element, msize);
+    if (ecount == 1) {
+        TCGv_i32 tcg_elem = tcg_temp_new_i32();
+        read_vec_element_i32(s, tcg_elem, rn, ebase, esz);
         return tcg_elem;
     } else {
-        int bits = size / 2;
-        int shift = ctpop8(vmap) / 2;
-        int vmap_lo = (vmap >> shift) & vmap;
-        int vmap_hi = (vmap & ~vmap_lo);
+        int half = ecount >> 1;
         TCGv_i32 tcg_hi, tcg_lo, tcg_res;
 
-        tcg_hi = do_reduction_op(s, fpopcode, rn, esize, bits, vmap_hi, fpst);
-        tcg_lo = do_reduction_op(s, fpopcode, rn, esize, bits, vmap_lo, fpst);
+        tcg_hi = do_reduction_op(s, fpopcode, rn, esz,
+                                 ebase + half, half, fpst);
+        tcg_lo = do_reduction_op(s, fpopcode, rn, esz,
+                                 ebase, half, fpst);
         tcg_res = tcg_temp_new_i32();
 
         switch (fpopcode) {
@@ -9064,7 +9053,6 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
     bool is_u = extract32(insn, 29, 1);
     bool is_fp = false;
     bool is_min = false;
-    int esize;
     int elements;
     int i;
     TCGv_i64 tcg_res, tcg_elt;
@@ -9111,8 +9099,7 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
         return;
     }
 
-    esize = 8 << size;
-    elements = (is_q ? 128 : 64) / esize;
+    elements = (is_q ? 16 : 8) >> size;
 
     tcg_res = tcg_temp_new_i64();
     tcg_elt = tcg_temp_new_i64();
@@ -9167,9 +9154,8 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
          */
         TCGv_ptr fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR);
         int fpopcode = opcode | is_min << 4 | is_u << 5;
-        int vmap = (1 << elements) - 1;
-        TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, esize,
-                                             (is_q ? 128 : 64), vmap, fpst);
+        TCGv_i32 tcg_res32 = do_reduction_op(s, fpopcode, rn, size,
+                                             0, elements, fpst);
         tcg_gen_extu_i32_i64(tcg_res, tcg_res32);
     }
 
-- 
2.43.0
