[PULL 26/27] target/arm: Implement FPST_STD

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PULL 26/27] target/arm: Implement FPST_STD_F16 fpstatus

From:	Peter Maydell
Subject:	[PULL 26/27] target/arm: Implement FPST_STD_F16 fpstatus
Date:	Mon, 24 Aug 2020 10:48:10 +0100

Architecturally, Neon FP16 operations use the "standard FPSCR" like
all other Neon operations.  However, this is defined in the Arm ARM
pseudocode as "a fixed value, except that FZ16 (and AHP) follow the
FPSCR bits". In QEMU, the softfloat float_status doesn't include
separate flush-to-zero for FP16 operations, so we must keep separate
fp_status for "Neon non-FP16" and "Neon fp16" operations, in the
same way we do already for the non-Neon "fp_status" vs "fp_status_f16".

Add the extra float_status field to the CPU state structure,
ensure it is correctly initialized and updated on FPSCR writes,
and make fpstatus_ptr(FPST_STD_F16) return a pointer to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200806104453.30393-4-peter.maydell@linaro.org
---
 target/arm/cpu.h        | 9 ++++++++-
 target/arm/translate.h  | 3 ++-
 target/arm/cpu.c        | 3 +++
 target/arm/vfp_helper.c | 5 +++++
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9d2845c1797..ac857bdc2c1 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -609,6 +609,8 @@ typedef struct CPUARMState {
          *  fp_status: is the "normal" fp status.
          *  fp_status_fp16: used for half-precision calculations
          *  standard_fp_status : the ARM "Standard FPSCR Value"
+         *  standard_fp_status_fp16 : used for half-precision
+         *       calculations with the ARM "Standard FPSCR Value"
          *
          * Half-precision operations are governed by a separate
          * flush-to-zero control bit in FPSCR:FZ16. We pass a separate
@@ -619,15 +621,20 @@ typedef struct CPUARMState {
          * Neon) which the architecture defines as controlled by the
          * standard FPSCR value rather than the FPSCR.
          *
+         * The "standard FPSCR but for fp16 ops" is needed because
+         * the "standard FPSCR" tracks the FPSCR.FZ16 bit rather than
+         * using a fixed value for it.
+         *
          * To avoid having to transfer exception bits around, we simply
          * say that the FPSCR cumulative exception flags are the logical
-         * OR of the flags in the three fp statuses. This relies on the
+         * OR of the flags in the four fp statuses. This relies on the
          * only thing which needs to read the exception flags being
          * an explicit FPSCR read.
          */
         float_status fp_status;
         float_status fp_status_f16;
         float_status standard_fp_status;
+        float_status standard_fp_status_f16;
 
         /* ZCR_EL[1-3] */
         uint64_t zcr_el[4];
diff --git a/target/arm/translate.h b/target/arm/translate.h
index e3680e65479..6d6d4c0f425 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -436,7 +436,8 @@ static inline TCGv_ptr fpstatus_ptr(ARMFPStatusFlavour 
flavour)
         offset = offsetof(CPUARMState, vfp.standard_fp_status);
         break;
     case FPST_STD_F16:
-        /* Not yet used or implemented: fall through to assert */
+        offset = offsetof(CPUARMState, vfp.standard_fp_status_f16);
+        break;
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 111579554fb..6b382fcd60e 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -391,12 +391,15 @@ static void arm_cpu_reset(DeviceState *dev)
     set_flush_to_zero(1, &env->vfp.standard_fp_status);
     set_flush_inputs_to_zero(1, &env->vfp.standard_fp_status);
     set_default_nan_mode(1, &env->vfp.standard_fp_status);
+    set_default_nan_mode(1, &env->vfp.standard_fp_status_f16);
     set_float_detect_tininess(float_tininess_before_rounding,
                               &env->vfp.fp_status);
     set_float_detect_tininess(float_tininess_before_rounding,
                               &env->vfp.standard_fp_status);
     set_float_detect_tininess(float_tininess_before_rounding,
                               &env->vfp.fp_status_f16);
+    set_float_detect_tininess(float_tininess_before_rounding,
+                              &env->vfp.standard_fp_status_f16);
 #ifndef CONFIG_USER_ONLY
     if (kvm_enabled()) {
         kvm_arm_reset_vcpu(cpu);
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index 60dcd4bf145..64266ece620 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -93,6 +93,8 @@ static uint32_t vfp_get_fpscr_from_host(CPUARMState *env)
     /* FZ16 does not generate an input denormal exception.  */
     i |= (get_float_exception_flags(&env->vfp.fp_status_f16)
           & ~float_flag_input_denormal);
+    i |= (get_float_exception_flags(&env->vfp.standard_fp_status_f16)
+          & ~float_flag_input_denormal);
     return vfp_exceptbits_from_host(i);
 }
 
@@ -124,7 +126,9 @@ static void vfp_set_fpscr_to_host(CPUARMState *env, 
uint32_t val)
     if (changed & FPCR_FZ16) {
         bool ftz_enabled = val & FPCR_FZ16;
         set_flush_to_zero(ftz_enabled, &env->vfp.fp_status_f16);
+        set_flush_to_zero(ftz_enabled, &env->vfp.standard_fp_status_f16);
         set_flush_inputs_to_zero(ftz_enabled, &env->vfp.fp_status_f16);
+        set_flush_inputs_to_zero(ftz_enabled, 
&env->vfp.standard_fp_status_f16);
     }
     if (changed & FPCR_FZ) {
         bool ftz_enabled = val & FPCR_FZ;
@@ -146,6 +150,7 @@ static void vfp_set_fpscr_to_host(CPUARMState *env, 
uint32_t val)
     set_float_exception_flags(i, &env->vfp.fp_status);
     set_float_exception_flags(0, &env->vfp.fp_status_f16);
     set_float_exception_flags(0, &env->vfp.standard_fp_status);
+    set_float_exception_flags(0, &env->vfp.standard_fp_status_f16);
 }
 
 #else
-- 
2.20.1

[Prev in Thread]

Current Thread

[Next in Thread]

[PULL 17/27] target/arm: Tidy up disas_arm_insn(), (continued)
- [PULL 17/27] target/arm: Tidy up disas_arm_insn(), Peter Maydell, 2020/08/24
- [PULL 18/27] target/arm: Do M-profile NOCP checks early and via decodetree, Peter Maydell, 2020/08/24
- [PULL 20/27] target/arm: Remove ARCH macro, Peter Maydell, 2020/08/24
- [PULL 19/27] target/arm: Convert T32 coprocessor insns to decodetree, Peter Maydell, 2020/08/24
- [PULL 21/27] target/arm: Delete unused VFP_DREG macros, Peter Maydell, 2020/08/24
- [PULL 22/27] target/arm/translate.c: Delete/amend incorrect comments, Peter Maydell, 2020/08/24
- [PULL 23/27] target/arm: Delete unused ARM_FEATURE_CRC, Peter Maydell, 2020/08/24
- [PULL 25/27] target/arm: Make A32/T32 use new fpstatus_ptr() API, Peter Maydell, 2020/08/24
- [PULL 24/27] target/arm: Replace A64 get_fpstatus_ptr() with generic fpstatus_ptr(), Peter Maydell, 2020/08/24
- [PULL 27/27] target/arm: Use correct FPST for VCMLA, VCADD on fp16, Peter Maydell, 2020/08/24
- [PULL 26/27] target/arm: Implement FPST_STD_F16 fpstatus, Peter Maydell <=
- Re: [PULL 00/27] target-arm queue, Peter Maydell, 2020/08/24

Prev by Date: [PULL 27/27] target/arm: Use correct FPST for VCMLA, VCADD on fp16
Next by Date: Re: [External] Re: [PATCH] virtiofsd: add -o allow_directio|no_directio option
Previous by thread: [PULL 27/27] target/arm: Use correct FPST for VCMLA, VCADD on fp16
Next by thread: Re: [PULL 00/27] target-arm queue
Index(es):
- Date
- Thread