Re: [PATCH v5 19/45] target/arm: Implement SME MOVA
From: Peter Maydell
Subject: Re: [PATCH v5 19/45] target/arm: Implement SME MOVA
Date: Wed, 6 Jul 2022 17:47:23 +0100
On Wed, 6 Jul 2022 at 10:11, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We can reuse the SVE functions for implementing moves to/from
> horizontal tile slices, but we need new ones for moves to/from
> vertical tile slices.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> +/*
> + * Move Zreg vector to ZArray column.
> + */
> +#define DO_MOVA_C(NAME, TYPE, H) \
> +void HELPER(NAME)(void *za, void *vn, void *vg, uint32_t desc) \
> +{ \
> + int i, oprsz = simd_oprsz(desc); \
> + for (i = 0; i < oprsz; ) { \
> + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
> + do { \
> + if (pg & 1) { \
> + *(TYPE *)(za + tile_vslice_offset(i)) = *(TYPE *)(vn + H(i)); \
> + } \
> + i += sizeof(TYPE); \
> + pg >>= sizeof(TYPE); \
> + } while (i & 15); \
> + } \
> +}
> +
> +DO_MOVA_C(sme_mova_cz_b, uint8_t, H1)
> +DO_MOVA_C(sme_mova_cz_h, uint16_t, H2)
> +DO_MOVA_C(sme_mova_cz_s, uint32_t, H4)
i is a byte offset in this loop, so shouldn't the _h and _s variants be
using H1_2 and H1_4 rather than H2 and H4?
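To illustrate the concern, here is a minimal sketch of how the two macro
families differ on a big-endian host. The BE_ names and the three helper
functions are hypothetical, introduced only for this example; the XOR
constants are the big-endian-host forms of H1_2, H1_4, H2 and H4 as I
understand them from target/arm/vec_internal.h (on a little-endian host
they are all the identity, so the difference would not show up there).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Big-endian-host forms of the element-swizzle macros, renamed with a
 * BE_ prefix (hypothetical names) so they can be inspected on any host.
 */
#define BE_H1_2(x) ((x) ^ 6)  /* takes the byte offset of a 16-bit element */
#define BE_H1_4(x) ((x) ^ 4)  /* takes the byte offset of a 32-bit element */
#define BE_H2(x)   ((x) ^ 3)  /* takes an index into a uint16_t array */
#define BE_H4(x)   ((x) ^ 1)  /* takes an index into a uint32_t array */

/* Byte offset of 16-bit element n, swizzling the element index. */
static inline int h_off_by_index(int n)
{
    return BE_H2(n) * 2;
}

/* Byte offset of 16-bit element n, swizzling the byte offset. */
static inline int h_off_by_byte(int n)
{
    return BE_H1_2(n * 2);
}

/* Index macro applied to a byte offset, as in the quoted loop. */
static inline int h_off_mixed(int n)
{
    return BE_H2(n * 2);
}
```

With these definitions, the first two helpers agree for every element
within a 64-bit unit, while the mixed form yields odd (misaligned) byte
offsets, which is why a loop counting in bytes would want the H1_2 and
H1_4 forms.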
> +/*
> + * Move ZArray column to Zreg vector.
> + */
> +#define DO_MOVA_Z(NAME, TYPE, H) \
> +void HELPER(NAME)(void *vd, void *za, void *vg, uint32_t desc) \
> +{ \
> + int i, oprsz = simd_oprsz(desc); \
> + for (i = 0; i < oprsz; ) { \
> + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
> + do { \
> + if (pg & 1) { \
> + *(TYPE *)(vd + H(i)) = *(TYPE *)(za + tile_vslice_offset(i)); \
> + } \
> + i += sizeof(TYPE); \
> + pg >>= sizeof(TYPE); \
> + } while (i & 15); \
> + } \
> +}
> +
> +DO_MOVA_Z(sme_mova_zc_b, uint8_t, H1)
> +DO_MOVA_Z(sme_mova_zc_h, uint16_t, H2)
> +DO_MOVA_Z(sme_mova_zc_s, uint32_t, H4)
Similarly here?
Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM