Re: [RFC v2 58/76] target/riscv: rvv-0.9: slide instructions
From: Richard Henderson
Subject: Re: [RFC v2 58/76] target/riscv: rvv-0.9: slide instructions
Date: Fri, 31 Jul 2020 08:57:24 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
On 7/22/20 2:16 AM, frank.chang@sifive.com wrote:
> -#define GEN_VEXT_VSLIDEDOWN_VX(NAME, ETYPE, H, CLEAR_FN) \
> +#define GEN_VEXT_VSLIDEDOWN_VX(NAME, ETYPE, H) \
> void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \
> CPURISCVState *env, uint32_t desc) \
> { \
> - uint32_t vlmax = env_archcpu(env)->cfg.vlen; \
> + uint32_t vlmax = vext_max_elems(desc, sizeof(ETYPE), false); \
> uint32_t vm = vext_vm(desc); \
> - uint32_t vta = vext_vta(desc); \
> uint32_t vl = env->vl; \
> target_ulong offset = s1, i; \
> \
> for (i = 0; i < vl; ++i) { \
> + /* offset may be a large value, which j may overflow */ \
> target_ulong j = i + offset; \
> + bool is_valid = (offset >= vlmax || j >= vlmax) ? false : true; \
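This is needlessly verbose: (cond) ? false : true is just !(cond).
But also, the test is partially loop invariant (offset >= vlmax does not
depend on i) and entirely predictable (once offset < vlmax, j >= vlmax flips
at most once as i increases), allowing loop fission.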
> if (!vm && !vext_elem_mask(v0, i)) { \
> continue; \
> } \
> - *((ETYPE *)vd + H(i)) = j >= vlmax ? 0 : *((ETYPE *)vs2 + H(j)); \
> + *((ETYPE *)vd + H(i)) = is_valid ? *((ETYPE *)vs2 + H(j)) : 0; \
> } \
> - CLEAR_FN(vd, vta, vl, vl * sizeof(ETYPE), vlmax * sizeof(ETYPE)); \
> }
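E.g.

    /* Elements [0, i_max) have an in-range source; clamp to vl. */
    i_max = s1 < vlmax ? vlmax - s1 : 0;
    if (i_max > vl) {
        i_max = vl;
    }
    for (i = 0; i < i_max; ++i) {
        if (vm || vext_elem_mask(v0, i)) {
            *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i + s1));
        }
    }
    /* Remaining active elements would read past vlmax, so zero them. */
    for (i = i_max; i < vl; ++i) {
        if (vm || vext_elem_mask(v0, i)) {
            *((ETYPE *)vd + H(i)) = 0;
        }
    }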
r~
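
For anyone who wants to check the fission outside QEMU, here is a minimal standalone sketch that compares the per-element form against the two-loop form. It is not the helper itself: VLMAX, the bool mask array, the uint8_t element type, and the function names are all hypothetical stand-ins for vext_max_elems(), vext_elem_mask(), ETYPE, and the generated helpers, and H() is assumed to be the identity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VLMAX 8 /* hypothetical element count, stands in for vext_max_elems() */

/* Per-element form: the bounds test is re-evaluated on every iteration. */
static void slidedown_per_element(uint8_t *vd, const uint8_t *vs2,
                                  const bool *mask, bool vm,
                                  uint64_t offset, uint32_t vl)
{
    assert(vl <= VLMAX);
    for (uint32_t i = 0; i < vl; ++i) {
        uint64_t j = i + offset; /* may wrap when offset is huge */
        bool is_valid = !(offset >= VLMAX || j >= VLMAX);
        if (!vm && !mask[i]) {
            continue; /* inactive element: leave vd[i] untouched */
        }
        vd[i] = is_valid ? vs2[j] : 0;
    }
}

/* Fissioned form: the split point is computed once, before the loops. */
static void slidedown_fissioned(uint8_t *vd, const uint8_t *vs2,
                                const bool *mask, bool vm,
                                uint64_t offset, uint32_t vl)
{
    assert(vl <= VLMAX);
    uint64_t i_max = offset < VLMAX ? VLMAX - offset : 0;
    uint32_t i;

    if (i_max > vl) {
        i_max = vl; /* never process elements at or beyond vl */
    }
    for (i = 0; i < i_max; ++i) { /* source index is in range */
        if (vm || mask[i]) {
            vd[i] = vs2[i + offset];
        }
    }
    for (; i < vl; ++i) { /* source index is out of range */
        if (vm || mask[i]) {
            vd[i] = 0;
        }
    }
}

/* Returns true when both forms leave identical destination contents. */
static bool slidedown_forms_agree(uint64_t offset, uint32_t vl, bool vm)
{
    const uint8_t vs2[VLMAX] = {10, 11, 12, 13, 14, 15, 16, 17};
    const bool mask[VLMAX] = {true, false, true, true,
                              false, true, false, true};
    uint8_t a[VLMAX], b[VLMAX];

    memset(a, 0xAA, sizeof(a)); /* sentinel shows untouched elements */
    memset(b, 0xAA, sizeof(b));
    slidedown_per_element(a, vs2, mask, vm, offset, vl);
    slidedown_fissioned(b, vs2, mask, vm, offset, vl);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling slidedown_forms_agree() with offsets below, at, and far above VLMAX, and with vl smaller than VLMAX - offset, exercises both the clamp and the unmasked (vm) path.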