qemu-riscv

Re: [RFC v3 26/71] target/riscv: rvv-1.0: update vext_max_elems() for lo


From: Frank Chang
Subject: Re: [RFC v3 26/71] target/riscv: rvv-1.0: update vext_max_elems() for load/store insns
Date: Sat, 15 Aug 2020 10:52:04 +0800

On Sat, Aug 15, 2020 at 2:36 AM Richard Henderson <richard.henderson@linaro.org> wrote:
On 8/13/20 7:48 PM, Frank Chang wrote:
> esz is passed from e.g. GEN_VEXT_LD_STRIDE() macro:
>
>> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)        \
>> void HELPER(NAME)(void *vd, void * v0, target_ulong base,  \
>>                   target_ulong stride, CPURISCVState *env, \
>>                   uint32_t desc)                           \
>> {                                                          \
>>     uint32_t vm = vext_vm(desc);                           \
>>     vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN, \
>>                      sizeof(ETYPE), GETPC(), MMU_DATA_LOAD);       \
>> }
>>
>> GEN_VEXT_LD_STRIDE(vlse8_v,  int8_t,  lde_b)
>
> which is calculated by sizeof(ETYPE), so the results would be: 1, 2, 4, 8.
> and vext_max_elems() is called by e.g. vext_ldst_stride():

Ah, yes.

>> uint32_t max_elems = vext_max_elems(desc, esz);
>
> I can add another parameter to the macro and pass the hard-coded log2(esz) number
> if it's the better way instead of using ctzl().
> Or if there's another approach to get the log2(esz) number more elegantly?

Using ctzl(sizeof(type)) in the GEN_VEXT_LD_STRIDE macro will work well.  This
will be constant folded by the compiler.
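
A minimal sketch of the constant-folding point, not code from the patch. It assumes GCC or Clang and uses __builtin_ctzl() as a stand-in for QEMU's ctzl(); LOG2_SIZEOF() and max_elems_sketch() are hypothetical names, and the callee is deliberately simplified (it ignores LMUL/EMUL). Because sizeof(ETYPE) is a compile-time constant, an optimizing compiler can reduce the ctz call to the literal 0, 1, 2 or 3 at each macro expansion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for QEMU's ctzl() (assumption: GCC/Clang builtin).
 * The argument must be non-zero; sizeof() of an element type always is.
 */
#define LOG2_SIZEOF(ETYPE) ((uint32_t)__builtin_ctzl(sizeof(ETYPE)))

/*
 * Hypothetical, simplified callee: takes log2(esz) instead of esz and
 * returns how many elements fit in vlenb bytes.
 */
static uint32_t max_elems_sketch(uint32_t vlenb, uint32_t log2_esz)
{
    assert(log2_esz <= 3); /* element sizes are 1, 2, 4 or 8 bytes */
    return vlenb >> log2_esz;
}

/* One wrapper per element type, mirroring how the helper macros expand. */
#define GEN_MAX_ELEMS(NAME, ETYPE)                           \
static uint32_t NAME(uint32_t vlenb)                         \
{                                                            \
    return max_elems_sketch(vlenb, LOG2_SIZEOF(ETYPE));      \
}

GEN_MAX_ELEMS(max_elems_e8,  int8_t)
GEN_MAX_ELEMS(max_elems_e16, int16_t)
GEN_MAX_ELEMS(max_elems_e32, int32_t)
GEN_MAX_ELEMS(max_elems_e64, int64_t)
```

With vlenb = 16, the four wrappers return 16, 8, 4 and 2 respectively.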


r~

I checked the code again.
GEN_VEXT_LD_STRIDE() eventually calls vext_ldst_stride() and passes esz as a parameter.
However, esz is used not only in vext_max_elems() but also in other calculations, e.g.:

    probe_pages(env, base + stride * i, nf * esz, ra, access_type);
and
    target_ulong addr = base + stride * i + k * esz;

If we pass ctzl(sizeof(type)) from GEN_VEXT_LD_STRIDE(),
I would still have to compute (1 << esz) to get the actual element size for the calculations above.
Wouldn't that cancel out the performance gain we get in vext_max_elems()?
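
To make the trade-off concrete, here is a hedged sketch (not the patch's code) of the two calculations above rewritten to take log2(esz) directly. It relies only on the identity x * esz == x << log2(esz) for power-of-two element sizes, so no separate (1 << esz) step is needed. The type target_ulong_sketch and the functions elem_addr() and segment_bytes() are hypothetical stand-ins; whether this is measurably faster or slower than passing esz would have to be checked against the real helpers.

```c
#include <stdint.h>

/* Stand-in for QEMU's target_ulong (assumption: 64-bit target). */
typedef uint64_t target_ulong_sketch;

/*
 * Hypothetical equivalent of:
 *     base + stride * i + k * esz
 * with the multiplication by esz expressed as a shift by log2_esz.
 */
static target_ulong_sketch elem_addr(target_ulong_sketch base,
                                     target_ulong_sketch stride,
                                     uint32_t i, uint32_t k,
                                     uint32_t log2_esz)
{
    return base + stride * i + ((target_ulong_sketch)k << log2_esz);
}

/*
 * Hypothetical equivalent of the "nf * esz" size passed to probe_pages():
 * the number of bytes covered by one segment of nf fields.
 */
static uint32_t segment_bytes(uint32_t nf, uint32_t log2_esz)
{
    return nf << log2_esz;
}
```

For example, with base = 0x1000, stride = 16, i = 2, k = 1 and 4-byte elements (log2_esz = 2), elem_addr() yields 0x1024, the same value as base + stride * i + k * 4.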

Frank Chang
