From: Peter Maydell
Subject: Re: [PATCH v6 59/82] target/arm: Implement SVE mixed sign dot product (indexed)
Date: Thu, 13 May 2021 13:57:59 +0100

On Fri, 30 Apr 2021 at 22:04, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/cpu.h           |  5 +++
>  target/arm/helper.h        |  4 +++
>  target/arm/sve.decode      |  4 +++
>  target/arm/translate-sve.c | 16 +++++++++
>  target/arm/vec_helper.c    | 68 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 97 insertions(+)
> diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
> index 8b7269d8e1..98b707f4f5 100644
> --- a/target/arm/vec_helper.c
> +++ b/target/arm/vec_helper.c
> @@ -677,6 +677,74 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm,
>      clear_tail(d, opr_sz, simd_maxsz(desc));
>  }
>
> +void HELPER(gvec_sudot_idx_b)(void *vd, void *vn, void *vm,
> +                              void *va, uint32_t desc)
> +{
> +    intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
> +    intptr_t index = simd_data(desc);
> +    int32_t *d = vd, *a = va;
> +    int8_t *n = vn;
> +    uint8_t *m_indexed = (uint8_t *)vm + index * 4;
> +
> +    /*
> +     * Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
> +     * Otherwise opr_sz is a multiple of 16.
> +     */

These are only used by SVE, aren't they? I guess maintaining
the parallelism with the helpers that are shared is worthwhile.

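For reference, the effect of the opr_sz == 8 special case on the segment
loop can be seen in isolation. This is only a sketch: segment_starts() is
a hypothetical stand-alone function, not part of QEMU, and it mirrors just
the loop structure of the quoted helper, recording for each 32-bit element
the element index at which its segment's indexed bytes would be loaded.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not QEMU code): mirrors only the loop structure
 * of the quoted function. seg_start[i] receives the element index at
 * which the 128-bit segment containing element i begins.
 * Returns the number of 32-bit elements visited.
 */
size_t segment_starts(size_t opr_sz, size_t *seg_start)
{
    size_t i = 0, opr_sz_4 = opr_sz / 4;
    /* Clamp the first segment so an 8-byte operand visits only 2 elements. */
    size_t segend = opr_sz_4 < 4 ? opr_sz_4 : 4;

    do {
        size_t base = i;            /* where m0..m3 would be loaded */
        do {
            seg_start[i] = base;
        } while (++i < segend);
        segend = i + 4;             /* next 128-bit segment */
    } while (i < opr_sz_4);

    return i;
}
```

Without the clamp, an 8-byte operand would run the inner loop for four
elements and touch data beyond the operand. If the helper can only be
reached with SVE sizes (multiples of 16), the clamp never takes effect.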
> +    segend = MIN(4, opr_sz_4);
> +    i = 0;
> +    do {
> +        uint8_t m0 = m_indexed[i * 4 + 0];
> +        uint8_t m1 = m_indexed[i * 4 + 1];
> +        uint8_t m2 = m_indexed[i * 4 + 2];
> +        uint8_t m3 = m_indexed[i * 4 + 3];
> +
> +        do {
> +            d[i] = (a[i] +
> +                    n[i * 4 + 0] * m0 +
> +                    n[i * 4 + 1] * m1 +
> +                    n[i * 4 + 2] * m2 +
> +                    n[i * 4 + 3] * m3);
> +        } while (++i < segend);
> +        segend = i + 4;
> +    } while (i < opr_sz_4);
> +
> +    clear_tail(d, opr_sz, simd_maxsz(desc));
> +}
> +
> +void HELPER(gvec_usdot_idx_b)(void *vd, void *vn, void *vm,
> +                              void *va, uint32_t desc)
> +{
> +    intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
> +    intptr_t index = simd_data(desc);
> +    uint32_t *d = vd, *a = va;
> +    uint8_t *n = vn;
> +    int8_t *m_indexed = (int8_t *)vm + index * 4;
> +
> +    /*
> +     * Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
> +     * Otherwise opr_sz is a multiple of 16.
> +     */
> +    segend = MIN(4, opr_sz_4);
> +    i = 0;
> +    do {
> +        int8_t m0 = m_indexed[i * 4 + 0];
> +        int8_t m1 = m_indexed[i * 4 + 1];
> +        int8_t m2 = m_indexed[i * 4 + 2];
> +        int8_t m3 = m_indexed[i * 4 + 3];
> +
> +        do {
> +            d[i] = (a[i] +
> +                    n[i * 4 + 0] * m0 +
> +                    n[i * 4 + 1] * m1 +
> +                    n[i * 4 + 2] * m2 +
> +                    n[i * 4 + 3] * m3);
> +        } while (++i < segend);
> +        segend = i + 4;
> +    } while (i < opr_sz_4);
> +
> +    clear_tail(d, opr_sz, simd_maxsz(desc));
> +}

Maybe we should macroify this: unless I'm misreading them,
gvec_sdot_idx_b, gvec_udot_idx_b, gvec_sudot_idx_b and gvec_usdot_idx_b
differ only in the types of the index and the data.

But if you'd rather not, you can have a
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
for this version.

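A minimal sketch of what such a macro might look like, written as
stand-alone C rather than against QEMU's HELPER()/simd_desc interface.
The macro name DO_DOT_IDX, the plain-function signature and the explicit
opr_sz/index arguments are all assumptions made for illustration. The
sketch omits clear_tail() and any host-endianness handling, and it is not
the form that any later revision of the patch necessarily took.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical macro (illustration only): generates one indexed 4-way
 * byte dot-product function per (accumulator, data, index) type triple.
 * opr_sz is the operand size in bytes; index selects a group of four
 * bytes within each 128-bit segment of m.
 */
#define DO_DOT_IDX(NAME, TYPED, TYPEN, TYPEM)                           \
void NAME(TYPED *d, const TYPEN *n, const TYPEM *m,                     \
          const TYPED *a, size_t opr_sz, size_t index)                  \
{                                                                       \
    size_t i = 0, opr_sz_4 = opr_sz / 4;                                \
    /* Clamp for the 8-byte case, as in the quoted helpers. */          \
    size_t segend = opr_sz_4 < 4 ? opr_sz_4 : 4;                        \
    const TYPEM *m_indexed = m + index * 4;                             \
                                                                        \
    do {                                                                \
        TYPEM m0 = m_indexed[i * 4 + 0];                                \
        TYPEM m1 = m_indexed[i * 4 + 1];                                \
        TYPEM m2 = m_indexed[i * 4 + 2];                                \
        TYPEM m3 = m_indexed[i * 4 + 3];                                \
                                                                        \
        do {                                                            \
            d[i] = (a[i] +                                              \
                    n[i * 4 + 0] * m0 +                                 \
                    n[i * 4 + 1] * m1 +                                 \
                    n[i * 4 + 2] * m2 +                                 \
                    n[i * 4 + 3] * m3);                                 \
        } while (++i < segend);                                         \
        segend = i + 4;                                                 \
    } while (i < opr_sz_4);                                             \
}

/* Hypothetical instantiations; the names are not the QEMU helper names. */
DO_DOT_IDX(sdot_idx_b,  int32_t,  int8_t,  int8_t)
DO_DOT_IDX(udot_idx_b,  uint32_t, uint8_t, uint8_t)
DO_DOT_IDX(sudot_idx_b, int32_t,  int8_t,  uint8_t)
DO_DOT_IDX(usdot_idx_b, uint32_t, uint8_t, int8_t)
```

The accumulator types for the two mixed-sign variants follow the quoted
patch (int32_t for sudot, uint32_t for usdot). With a signed accumulator
the sketch relies on the sums not overflowing, whereas QEMU itself is
built with wrapping signed arithmetic.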
thanks
-- PMM