qemu-arm
From: Richard Henderson
Subject: Re: [PATCH v2 13/17] target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree
Date: Wed, 13 May 2020 13:28:54 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/12/20 9:39 AM, Peter Maydell wrote:
> Convert the Neon fp VMUL, VMLA, and VMLS 3-reg-same insns to
> decodetree.
> 
> We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
> need a loop function do_3same_fp().  This takes a reads_vd parameter
> which tells it to load the old value into vd before
> calling the callback function, in the same way that the do_vfp_3op_sp()
> and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
> only uses in this patch pass reads_vd == true, but later commits
> will use reads_vd == false.)
> 
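
For anyone following along, the reads_vd idea described above can be modelled in plain C roughly as follows. This is only a sketch of the per-element loop, not the TCG code in the patch: model_3same_fp(), fp_3op_fn, model_vmla() and model_vmls() are made-up names, and ordinary C floats stand in for the softfloat helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: combines the (optional) old vd with vn and vm. */
typedef float fp_3op_fn(float vd, float vn, float vm);

/*
 * Plain-C model of a per-element loop with a reads_vd flag.
 * When reads_vd is true the old destination element is passed to the
 * callback; otherwise the callback gets a don't-care value (0 here).
 */
static void model_3same_fp(float *vd, const float *vn, const float *vm,
                           size_t n_elts, fp_3op_fn *fn, bool reads_vd)
{
    assert(n_elts <= 4); /* a 128-bit Q register holds four float32 lanes */
    for (size_t i = 0; i < n_elts; i++) {
        float old = reads_vd ? vd[i] : 0.0f;
        vd[i] = fn(old, vn[i], vm[i]);
    }
}

/* Multiply-accumulate and multiply-subtract, both of which need the old vd. */
static float model_vmla(float vd, float vn, float vm)
{
    float prod = vn * vm;
    return vd + prod;
}

static float model_vmls(float vd, float vn, float vm)
{
    float prod = vn * vm;
    return vd - prod;
}
```

An operation that only writes vd would pass reads_vd == false and ignore the first callback argument.
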
> This conversion fixes in passing an underdecoding for VMUL
> (originally reported by Fredrik Strupe <address@hidden>): bit 1
> of the 'size' field must be 0.  The old decoder didn't enforce this,
> but the decodetree pattern does.
> 
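
To illustrate the underdecoding fix in isolation: assuming the 3-reg-same 'size' field sits in bits [21:20] of the A32 encoding, the extra constraint amounts to the check below. The helper names are hypothetical, and in the patch itself the constraint is expressed by the decodetree pattern rather than by hand-written C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 2-bit 'size' field, assumed here to be insn bits [21:20]. */
static inline uint32_t size_field(uint32_t insn)
{
    return (insn >> 20) & 3;
}

/* For the fp VMUL encoding, bit 1 of 'size' must be 0. */
static bool fp_vmul_size_ok(uint32_t insn)
{
    return (size_field(insn) & 2) == 0;
}
```

A decoder that skips this check would accept encodings with bit 1 of 'size' set as VMUL, which is the behaviour the new pattern rules out.
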
> The gen_VMLA_fp_reg() function performs the addition operation
> with the operands in the opposite order to the old decoder:
> since Neon sets 'default NaN mode', float32_add operations are
> commutative so there is no behaviour difference, but putting
> them this way around matches the Arm ARM pseudocode and the
> required operation order for the subtraction in gen_VMLS_fp_reg().
> 
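
The commutativity argument can be checked with a small host-side model. This is a sketch under stated assumptions, not QEMU's softfloat: add_default_nan() is a made-up helper that uses the host's float addition and replaces any NaN result with the Arm single-precision default NaN (0x7fc00000), which is what makes operand order irrelevant even when both inputs are NaNs with different payloads.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

#define DEFAULT_NAN_BITS 0x7fc00000u /* Arm single-precision default NaN */

/* Reinterpret between float and its 32-bit pattern without aliasing issues. */
static uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

static float f32_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof(f));
    return f;
}

/*
 * Model of an add in default NaN mode: a NaN result never carries an
 * input payload, so the result does not depend on operand order.
 */
static float add_default_nan(float a, float b)
{
    float r = a + b;
    return isnan(r) ? f32_from_bits(DEFAULT_NAN_BITS) : r;
}
```

Without default NaN mode the payload of the result can depend on which NaN operand comes first, so the same swap would not be safe in general.
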
> Signed-off-by: Peter Maydell <address@hidden>
> ---
>  target/arm/neon-dp.decode       |  3 ++
>  target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
>  target/arm/translate.c          | 17 +------
>  3 files changed, 85 insertions(+), 16 deletions(-)

Reviewed-by: Richard Henderson <address@hidden>


r~