From: Richard Henderson
Subject: Re: [Qemu-arm] [PATCH 42/42] target/arm: Fix short-vector increment behaviour
Date: Sat, 8 Jun 2019 14:26:13 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0

On 6/6/19 12:46 PM, Peter Maydell wrote:
> For VFP short vectors, the VFP registers are divided into a
> series of banks: for single-precision these are s0-s7, s8-s15,
> s16-s23 and s24-s31; for double-precision they are d0-d3,
> d4-d7, ... d28-d31. Some banks are "scalar" meaning that
> use of a register within them triggers a pure-scalar or
> mixed vector-scalar operation rather than a full vector
> operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
> When using a bank as part of a vector operation, we
> iterate through it, increasing the register number by
> the specified stride each time, and wrapping around to
> the beginning of the bank.
>
> Unfortunately our calculation of the "increment" part of this
> was incorrect:
>     vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
> will only do the intended thing if bank_mask has exactly
> one set high bit. For instance for doubles (bank_mask = 0xc),
> if we start with vd = 6 and delta_d = 2 then vd is updated
> to 12 rather than the intended 4.
>
> This only causes problems in the unlikely case that the
> starting register is not the first in its bank: if the
> register number doesn't have to wrap around then the
> expression happens to give the right answer.
>
> Fix this bug by abstracting out the "check whether register
> is in a scalar bank" and "advance register within bank"
> operations to utility functions which use the right
> bit masking operations.
>
> Signed-off-by: Peter Maydell <address@hidden>
> ---
> target/arm/translate-vfp.inc.c | 100 ++++++++++++++++++++-------------
> 1 file changed, 60 insertions(+), 40 deletions(-)
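
For anyone following the arithmetic in the commit message, here is a minimal sketch of the old calculation next to one possible corrected form. The function names are illustrative and are not taken from the patch; the masks assume the bank layout described above (8 registers per single-precision bank, 4 per double-precision bank).

```c
#include <stdbool.h>

/* Illustrative helpers only; names are hypothetical, not from the patch. */

/* Old calculation as quoted in the commit message. It is only right
 * when bank_mask has a single set bit, so that bank_mask - 1 covers
 * exactly the index bits below it. */
static int advance_old(int reg, int delta, int bank_mask)
{
    return ((reg + delta) & (bank_mask - 1)) | (reg & bank_mask);
}

/* Single precision: banks of 8, so the low 3 bits are the index
 * within the bank and everything above selects the bank. */
static int advance_sreg(int reg, int delta)
{
    return ((reg + delta) & 0x7) | (reg & ~0x7);
}

/* Double precision: banks of 4, so the low 2 bits are the index. */
static int advance_dreg(int reg, int delta)
{
    return ((reg + delta) & 0x3) | (reg & ~0x3);
}

/* Scalar banks: s0-s7 for single precision. */
static bool sreg_is_scalar(int reg)
{
    return (reg & 0x18) == 0;
}

/* Scalar banks: d0-d3 and d16-d19 for double precision. */
static bool dreg_is_scalar(int reg)
{
    return (reg & 0xc) == 0;
}
```

With these definitions, advance_old(6, 2, 0xc) gives 12, matching the failure described above, while advance_dreg(6, 2) wraps to 4. When no wrap is needed, for example starting at d4 with a stride of 2, both forms give 6.
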
Reviewed-by: Richard Henderson <address@hidden>
r~