From: Richard Henderson
Subject: Re: [RFC v2 28/76] target/riscv: rvv-0.9: update vext_max_elems() for load/store insns
Date: Thu, 30 Jul 2020 05:44:36 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 7/22/20 2:15 AM, frank.chang@sifive.com wrote:
> -static inline uint32_t vext_maxsz(uint32_t desc)
> +static inline uint32_t vext_max_elems(uint32_t desc, uint32_t esz, bool is_ldst)
>  {
> -    return simd_maxsz(desc) << vext_lmul(desc);
> +    /*
> +     * As simd_desc support at most 256, the max vlen is 512 bits,
> +     * so vlen in bytes (vlenb) is encoded as maxsz.
> +     */
> +    uint32_t vlenb = simd_maxsz(desc);
> +
> +    if (is_ldst) {
> +        /*
> +         * Vector load/store instructions have the EEW encoded
> +         * directly in the instructions. The maximum vector size is
> +         * calculated with EMUL rather than LMUL.
> +         */
> +        uint32_t eew = esz << 3;
> +        uint32_t sew = vext_sew(desc);
> +        float flmul = vext_vflmul(desc);
> +        float emul = (float)eew / sew * flmul;
> +        uint32_t emul_r = emul < 1 ? 1 : emul;
> +        return vlenb * emul_r / esz;
> +    } else {
> +        /* Return VLMAX */
> +        return vlenb * vext_vflmul(desc) / esz;
> +    }
>  }

We do not want to be doing all of this arithmetic at runtime. We want to be
doing it at translation time and pass the result to the helper.
If we must do any arithmetic at runtime, we would very much prefer to pass
log2(esz) so that we can use shifts instead of full integer division.
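
As a rough illustration of the shift-based form: the function name and
parameters below are hypothetical and do not come from the patch or from
QEMU. It only assumes that esz is a power of two, so that dividing by esz is
the same as shifting right by log2(esz).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch.
 *   vlenb:    VLEN in bytes
 *   log2_esz: log2 of the element size in bytes (0..3)
 */
static inline uint32_t max_elems_by_shift(uint32_t vlenb, uint32_t log2_esz)
{
    /* vlenb / esz, with esz a power of two, is a plain right shift. */
    return vlenb >> log2_esz;
}
```
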
We really really want to avoid a bunch of floating-point conversions and
operations.
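
One possible integer-only shape for the load/store case, as a sketch: since
EEW, SEW and LMUL are all powers of two, EMUL = EEW / SEW * LMUL can be
computed as a sum of exponents. All names here are hypothetical, and the
clamp of EMUL < 1 to 1 simply mirrors the quoted patch.

```c
#include <stdint.h>

/*
 * Integer-only sketch; names are made up for illustration.
 *   vlenb:     VLEN in bytes
 *   log2_esz:  log2 of the element size in bytes (0..3),
 *              so log2(EEW in bits) = log2_esz + 3
 *   log2_sew:  log2 of SEW in bits (3..6)
 *   log2_lmul: log2 of LMUL, negative for fractional LMUL (-3..3)
 */
static inline uint32_t ldst_max_elems(uint32_t vlenb, int log2_esz,
                                      int log2_sew, int log2_lmul)
{
    /* EMUL = EEW / SEW * LMUL becomes an add/subtract of exponents. */
    int log2_emul = (log2_esz + 3) - log2_sew + log2_lmul;

    /* Same clamp as the quoted patch: treat EMUL < 1 as 1. */
    if (log2_emul < 0) {
        log2_emul = 0;
    }

    /* vlenb * EMUL / esz, using shifts only. */
    return (vlenb << log2_emul) >> log2_esz;
}
```
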
If you need to adjust the vext descriptor to make this happen, do so. Do not
feel that load/store needs to pass the *same* descriptor to the helpers as
everything else.
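
A minimal sketch of what a load/store-specific descriptor could look like,
assuming the translator has already computed log2(EMUL) and log2(esz). The
field layout and every name below are invented for illustration; this is not
the layout QEMU uses for simd_desc or the vext descriptor.

```c
#include <stdint.h>

/*
 * Hypothetical load/store-only descriptor fields:
 *   bits [2:0]  log2_emul, already clamped to >= 0 (0..6 fits in 3 bits)
 *   bits [4:3]  log2_esz (0..3)
 */
#define LDST_LOG2_EMUL_SHIFT 0
#define LDST_LOG2_EMUL_MASK  0x7u
#define LDST_LOG2_ESZ_SHIFT  3
#define LDST_LOG2_ESZ_MASK   0x3u

/* Translation time: encode the precomputed exponents once. */
static inline uint32_t ldst_desc_encode(uint32_t log2_emul, uint32_t log2_esz)
{
    return ((log2_emul & LDST_LOG2_EMUL_MASK) << LDST_LOG2_EMUL_SHIFT) |
           ((log2_esz & LDST_LOG2_ESZ_MASK) << LDST_LOG2_ESZ_SHIFT);
}

/* Run time: two field extractions and two shifts, no division or floats. */
static inline uint32_t ldst_desc_max_elems(uint32_t desc, uint32_t vlenb)
{
    uint32_t log2_emul = (desc >> LDST_LOG2_EMUL_SHIFT) & LDST_LOG2_EMUL_MASK;
    uint32_t log2_esz = (desc >> LDST_LOG2_ESZ_SHIFT) & LDST_LOG2_ESZ_MASK;

    return (vlenb << log2_emul) >> log2_esz;
}
```
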
r~