qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 37/76] target/arm: Define and use new write_fp_*reg_merging()


From: Richard Henderson
Subject: Re: [PATCH 37/76] target/arm: Define and use new write_fp_*reg_merging() functions
Date: Sat, 25 Jan 2025 09:52:48 -0800
User-agent: Mozilla Thunderbird

On 1/24/25 08:27, Peter Maydell wrote:
For FEAT_AFP's FPCR.NEP bit, we need to programmatically change the
behaviour of the writeback of the result for most SIMD scalar
operations, so that instead of zeroing the upper part of the result
register it merges the upper elements from one of the input
registers.

Provide new functions write_fp_*reg_merging() which can be used
instead of the existing write_fp_*reg() functions when we want this
"merge the result with one of the input registers if FPCR.NEP is
enabled" handling, and use them in do_fp3_scalar_with_fpsttype().

Note that (as documented in the description of the FPCR.NEP bit)
which input register to use as the merge source varies by
instruction: for these 2-input scalar operations, the comparison
instructions take from Rm, not Rn.

We'll extend this to also provide the merging behaviour for
the remaining scalar insns in subsequent commits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
  target/arm/tcg/translate-a64.c | 117 +++++++++++++++++++++++++--------
  1 file changed, 91 insertions(+), 26 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index d34672a8ba6..19a4ae14c15 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -665,6 +665,68 @@ static void write_fp_sreg(DisasContext *s, int reg, 
TCGv_i32 v)
      write_fp_dreg(s, reg, tmp);
  }
+/*
+ * Write a double result to 128 bit vector register reg, honouring FPCR.NEP:
+ * - if FPCR.NEP == 0, clear the high elements of reg
+ * - if FPCR.NEP == 1, set the high elements of reg from mergereg
+ *   (i.e. merge the result with those high elements)
+ * In either case, SVE register bits above 128 are zeroed (per R_WKYLB).
+ */
+static void write_fp_dreg_merging(DisasContext *s, int reg, int mergereg,
+                                  TCGv_i64 v)
+{
+    if (!s->fpcr_nep) {
+        write_fp_dreg(s, reg, v);
+        return;
+    }
+
+    /*
+     * Move from mergereg to reg; this sets the high elements and
+     * clears the bits above 128 as a side effect.
+     */
+    tcg_gen_gvec_mov(MO_64, fp_reg_offset(s, reg, MO_64),
+                     fp_reg_offset(s, mergereg, MO_64),
+                     16, vec_full_reg_size(s));

I think this would be clearer with vec_full_reg_offset(), though the result is correct either way.

Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]