Re: [PATCH 37/76] target/arm: Define and use new write_fp_*reg

qemu-devel


From:	Richard Henderson
Subject:	Re: [PATCH 37/76] target/arm: Define and use new write_fp_*reg_merging() functions
Date:	Sat, 25 Jan 2025 09:52:48 -0800
User-agent:	Mozilla Thunderbird

On 1/24/25 08:27, Peter Maydell wrote:

For FEAT_AFP's FPCR.NEP bit, we need to programmatically change the
behaviour of the writeback of the result for most SIMD scalar
operations, so that instead of zeroing the upper part of the result
register it merges the upper elements from one of the input
registers.

Provide new functions write_fp_*reg_merging() which can be used
instead of the existing write_fp_*reg() functions when we want this
"merge the result with one of the input registers if FPCR.NEP is
enabled" handling, and use them in do_fp3_scalar_with_fpsttype().

Note that (as documented in the description of the FPCR.NEP bit)
which input register to use as the merge source varies by
instruction: for these 2-input scalar operations, the comparison
instructions take from Rm, not Rn.

We'll extend this to also provide the merging behaviour for
the remaining scalar insns in subsequent commits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
  target/arm/tcg/translate-a64.c | 117 +++++++++++++++++++++++++--------
  1 file changed, 91 insertions(+), 26 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index d34672a8ba6..19a4ae14c15 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -665,6 +665,68 @@ static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v)
      write_fp_dreg(s, reg, tmp);
  }

+/*
+ * Write a double result to 128 bit vector register reg, honouring FPCR.NEP:
+ * - if FPCR.NEP == 0, clear the high elements of reg
+ * - if FPCR.NEP == 1, set the high elements of reg from mergereg
+ *   (i.e. merge the result with those high elements)
+ * In either case, SVE register bits above 128 are zeroed (per R_WKYLB).
+ */
+static void write_fp_dreg_merging(DisasContext *s, int reg, int mergereg,
+                                  TCGv_i64 v)
+{
+    if (!s->fpcr_nep) {
+        write_fp_dreg(s, reg, v);
+        return;
+    }
+
+    /*
+     * Move from mergereg to reg; this sets the high elements and
+     * clears the bits above 128 as a side effect.
+     */
+    tcg_gen_gvec_mov(MO_64, fp_reg_offset(s, reg, MO_64),
+                     fp_reg_offset(s, mergereg, MO_64),
+                     16, vec_full_reg_size(s));

I think this would be clearer with vec_full_reg_offset(), though the result is correct either way.
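
For anyone following the thread, the write-back behaviour described in the comment above can be modelled outside of QEMU. The following is only a rough standalone sketch, not the patch's code: VReg, VREG_BYTES, NUM_VREGS and model_write_dreg_merging() are hypothetical names invented for illustration, the 256-bit register size is an assumption, and the NEP state is passed as a plain bool rather than read from a DisasContext.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: 32 registers, each 256 bits (an assumed SVE length). */
#define NUM_VREGS  32
#define VREG_BYTES 32

/* Hypothetical register model: d[0] is bits [63:0], d[1] is bits [127:64]. */
typedef struct {
    uint64_t d[VREG_BYTES / 8];
} VReg;

/*
 * Model of writing a 64-bit scalar result v to register reg:
 *  - nep == false: bits [127:64] of reg are cleared
 *  - nep == true:  bits [127:64] of reg are copied from mergereg
 * In both cases every bit above 128 is zeroed.
 */
static void model_write_dreg_merging(VReg *regs, int reg, int mergereg,
                                     uint64_t v, bool nep)
{
    assert(reg >= 0 && reg < NUM_VREGS);
    assert(mergereg >= 0 && mergereg < NUM_VREGS);

    /* Read the merge source first so that reg == mergereg works. */
    uint64_t high = nep ? regs[mergereg].d[1] : 0;

    memset(&regs[reg], 0, sizeof(VReg));
    regs[reg].d[0] = v;
    regs[reg].d[1] = high;
}
```

The model reads the high element of mergereg before clearing the destination, which keeps it correct when the destination and the merge source are the same register.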


Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
