Re: [PATCH v3 25/51] target/arm: Implement SME MOVA

qemu-arm

From:	Peter Maydell
Subject:	Re: [PATCH v3 25/51] target/arm: Implement SME MOVA
Date:	Thu, 23 Jun 2022 12:24:48 +0100

On Mon, 20 Jun 2022 at 19:20, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We can reuse the SVE functions for implementing moves to/from
> horizontal tile slices, but we need new ones for moves to/from
> vertical tile slices.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper-sme.h    |  11 ++++
>  target/arm/helper-sve.h    |   2 +
>  target/arm/translate-a64.h |   9 +++
>  target/arm/translate.h     |   5 ++
>  target/arm/sme.decode      |  15 +++++
>  target/arm/sme_helper.c    | 110 ++++++++++++++++++++++++++++++++++++-
>  target/arm/sve_helper.c    |  12 ++++
>  target/arm/translate-a64.c |  19 +++++++
>  target/arm/translate-sme.c | 105 +++++++++++++++++++++++++++++++++++
>  9 files changed, 287 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
> index c4ee1f09e4..600346e08c 100644
> --- a/target/arm/helper-sme.h
> +++ b/target/arm/helper-sme.h
> @@ -21,3 +21,14 @@ DEF_HELPER_FLAGS_2(set_pstate_sm, TCG_CALL_NO_RWG, void, env, i32)
>  DEF_HELPER_FLAGS_2(set_pstate_za, TCG_CALL_NO_RWG, void, env, i32)
>
>  DEF_HELPER_FLAGS_3(sme_zero, TCG_CALL_NO_RWG, void, env, i32, i32)
> +
> +DEF_HELPER_FLAGS_4(sme_mova_avz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
> +DEF_HELPER_FLAGS_4(sme_mova_zav_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

What do 'avz' and 'zav' stand for here? I thought that
'zav' might mean "from the ZA storage to a Vector", but
then what is 'avz'?


> +static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
> +                                int tile_index, bool vertical)
> +{
> +    int tile = tile_index >> (4 - esz);
> +    int index = esz == MO_128 ? 0 : extract32(tile_index, 0, 4 - esz);
> +    int pos, len, offset;
> +    TCGv_i32 t_index;
> +    TCGv_ptr addr;
> +
> +    /* Resolve tile.size[index] to an untyped ZA slice index. */
> +    t_index = tcg_temp_new_i32();
> +    tcg_gen_trunc_tl_i32(t_index, cpu_reg(s, rs));
> +    tcg_gen_addi_i32(t_index, t_index, index);
> +
> +    len = ctz32(s->svl) - esz;
> +    pos = esz;
> +    offset = tile;
> +
> +    /*
> +     * Horizontal slice.  Index row N, column 0.
> +     * The helper will iterate by the element size.
> +     */
> +    if (!vertical) {
> +        pos += ctz32(sizeof(ARMVectorReg));
> +        offset *= sizeof(ARMVectorReg);
> +    }
> +    offset += offsetof(CPUARMState, zarray);
> +
> +    tcg_gen_deposit_z_i32(t_index, t_index, pos, len);
> +    tcg_gen_addi_i32(t_index, t_index, offset);
> +
> +    /*
> +     * Vertical tile slice.  Index row 0, column N.
> +     * The helper will iterate by the row spacing in the array.
> +     * Need to adjust addressing for elements smaller than uint64_t for BE.
> +     */
> +    if (HOST_BIG_ENDIAN && vertical && esz < MO_64) {
> +        tcg_gen_xori_i32(t_index, t_index, 8 - (1 << esz));
> +    }
> +
> +    addr = tcg_temp_new_ptr();
> +    tcg_gen_ext_i32_ptr(addr, t_index);
> +    tcg_temp_free_i32(t_index);
> +    tcg_gen_add_ptr(addr, addr, cpu_env);
> +
> +    return addr;
> +}

This is too confusing -- I spent half an hour looking at it and
couldn't figure out whether it was correct. I can see roughly
what it's supposed to be doing, but I don't really want to
reverse engineer the details from the sequence of operations.
E.g. the way we sometimes add in just the tile number and sometimes
the tile number * the size of a vector reg looks very strange.
I figured out that the deposit op is doing the equivalent of the
pseudocode's "MOD dim" on the slice index, but the code doesn't
say so, and the calculation of len and pos is kind of obscure to me.
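
For reference, a minimal standalone sketch of the "MOD dim" reading
of the deposit op. It is not QEMU code: deposit_z32() and log2_pow2()
are hypothetical stand-ins for tcg_gen_deposit_z_i32() and ctz32(),
and it assumes svl is the streaming vector length in bytes, so that
dim = svl >> esz is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a "deposit into zero": take the low `len`
 * bits of `val` and place them at bit position `pos` of a zero word.
 */
static uint32_t deposit_z32(uint32_t val, unsigned pos, unsigned len)
{
    assert(len > 0 && len < 32 && pos + len <= 32);
    return (val & ((1u << len) - 1)) << pos;
}

/* log2 of a nonzero power of two; stands in for ctz32() here. */
static unsigned log2_pow2(uint32_t x)
{
    unsigned n = 0;

    assert(x != 0 && (x & (x - 1)) == 0);
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * The same value written the way the pseudocode reads:
 * (slice MOD dim), then scaled by 2^pos.
 * Assumes svl_bytes is a power of two and esz < log2(svl_bytes).
 */
static uint32_t slice_mod_dim_scaled(uint32_t slice, uint32_t svl_bytes,
                                     unsigned esz, unsigned pos)
{
    uint32_t dim = svl_bytes >> esz;   /* elements per slice */

    return (slice % dim) << pos;
}
```

With len = log2_pow2(svl) - esz, extracting the low len bits is the
power-of-two modulo, and the shift by pos is the multiply by the
element (or row) stride.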

Perhaps (a) more commentary and (b) separating out the
horizontal and vertical cases would help?
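
To illustrate (b), here is a rough sketch that only restates the
arithmetic of the quoted hunk with the two cases split apart. It makes
no claim that either formula is correct. The function names are
hypothetical, ZA_ROW_BYTES is an assumed stand-in for
sizeof(ARMVectorReg), svl is assumed to be in bytes, `slice` is the
already-summed register value plus immediate index, and both the
offsetof(CPUARMState, zarray) base and the big-endian adjustment are
left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: one ZA row is 256 bytes. */
#define ZA_ROW_BYTES 256u

/*
 * Horizontal slice, as the quoted hunk computes it:
 *   ((slice MOD dim) << (esz + log2(row))) + tile * row
 * which is the same as
 *   (((slice MOD dim) << esz) + tile) * row
 */
static uint32_t horiz_slice_offset(uint32_t slice, unsigned tile,
                                   unsigned esz, uint32_t svl_bytes)
{
    uint32_t dim = svl_bytes >> esz;

    assert(dim != 0 && (dim & (dim - 1)) == 0);
    return (((slice % dim) << esz) + tile) * ZA_ROW_BYTES;
}

/*
 * Vertical slice, as the quoted hunk computes it:
 *   ((slice MOD dim) << esz) + tile
 * Note that the tile number is added unscaled here, unlike above.
 */
static uint32_t vert_slice_offset(uint32_t slice, unsigned tile,
                                  unsigned esz, uint32_t svl_bytes)
{
    uint32_t dim = svl_bytes >> esz;

    assert(dim != 0 && (dim & (dim - 1)) == 0);
    return ((slice % dim) << esz) + tile;
}
```

Written this way, the difference in how the tile number enters the
two results is visible at a glance, which is where a comment
explaining the intent would help most.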

thanks
-- PMM
