qemu-arm

Re: [PATCH 13/55] target/arm: Implement MVE VCLZ


From: Peter Maydell
Subject: Re: [PATCH 13/55] target/arm: Implement MVE VCLZ
Date: Thu, 10 Jun 2021 13:40:02 +0100

On Tue, 8 Jun 2021 at 23:10, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 6/7/21 9:57 AM, Peter Maydell wrote:
> > Implement the MVE VCLZ insn (and the necessary machinery
> > for MVE 1-input vector ops).
> >
> > Note that for non-load instructions predication is always performed
> > at a byte level granularity regardless of element size (R_ZLSJ),
> > and so the masking logic here differs from that used in the VLDR
> > and VSTR helpers.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

> > +
> > +/*
> > + * Take the bottom bits of mask (which is 1 bit per lane) and
> > + * convert to a mask which has 1s in each byte which is predicated.
> > + */
> > +static uint8_t mask_to_bytemask1(uint16_t mask)
> > +{
> > +    return (mask & 1) ? 0xff : 0;
> > +}
> > +
> > +static uint16_t mask_to_bytemask2(uint16_t mask)
> > +{
> > +    static const uint16_t masks[] = { 0x0000, 0x00ff, 0xff00, 0xffff };
> > +    return masks[mask & 3];
> > +}
> > +
> > +static uint32_t mask_to_bytemask4(uint16_t mask)
> > +{
> > +    static const uint32_t masks[] = {
> > +        0x00000000, 0x000000ff, 0x0000ff00, 0x0000ffff,
> > +        0x00ff0000, 0x00ff00ff, 0x00ffff00, 0x00ffffff,
> > +        0xff000000, 0xff0000ff, 0xff00ff00, 0xff00ffff,
> > +        0xffff0000, 0xffff00ff, 0xffffff00, 0xffffffff,
> > +    };
>
> I'll note that
>
> (1) the values for the mask_to_bytemask2 array overlap the first 4 values of
> the mask_to_bytemask4 array, and
>
> (2) both of these overlap with the larger
>
> static inline uint64_t expand_pred_b(uint8_t byte)
>
> from SVE.  It'd be nice to share the storage, whatever the actual functional
> interface into the array.

Yeah, I guess so. I didn't really feel like trying to
abstract that out...
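
For reference, a minimal sketch of what sharing the storage could look
like, assuming a single 16-entry table. The name pred_bytemask_tab is
hypothetical and not part of the patch; the three accessors keep the
signatures quoted above and simply index the common table with a
narrower mask. The same idea would presumably extend to a wider table
of the kind expand_pred_b() uses, but that is not shown here.

```c
#include <stdint.h>

/*
 * Hypothetical shared table (not in the patch): entry i has 0xff in
 * byte n if and only if bit n of i is set.
 */
static const uint32_t pred_bytemask_tab[16] = {
    0x00000000, 0x000000ff, 0x0000ff00, 0x0000ffff,
    0x00ff0000, 0x00ff00ff, 0x00ffff00, 0x00ffffff,
    0xff000000, 0xff0000ff, 0xff00ff00, 0xff00ffff,
    0xffff0000, 0xffff00ff, 0xffffff00, 0xffffffff,
};

/* Only entries 0 and 1 are reachable: one predicate bit, one byte. */
static uint8_t mask_to_bytemask1(uint16_t mask)
{
    return (uint8_t)pred_bytemask_tab[mask & 1];
}

/* Entries 0..3 match the standalone 4-entry table. */
static uint16_t mask_to_bytemask2(uint16_t mask)
{
    return (uint16_t)pred_bytemask_tab[mask & 3];
}

/* All 16 entries: four predicate bits expand to four bytes. */
static uint32_t mask_to_bytemask4(uint16_t mask)
{
    return pred_bytemask_tab[mask & 0xf];
}
```

The truncating casts are safe because the low entries of the table
never set bits above the width being returned.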

> > +#define DO_1OP(OP, ESIZE, TYPE, H, FN)                                  \
> > +    void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm)         \
> > +    {                                                                   \
> > +        TYPE *d = vd, *m = vm;                                          \
> > +        uint16_t mask = mve_element_mask(env);                          \
> > +        unsigned e;                                                     \
> > +        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {              \
> > +            TYPE r = FN(m[H(e)]);                                       \
> > +            uint64_t bytemask = mask_to_bytemask##ESIZE(mask);          \
>
> Why uint64_t and not TYPE?  Or uint32_t?

A later patch adds the mask_to_bytemask8(), so I wanted
a type that was definitely unsigned (so TYPE isn't any good)
and which was definitely big enough for 64 bits.
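
To illustrate the signedness point, here is a small sketch. Both
function names are hypothetical, and the merge shown is only an
assumption about how a byte mask would typically be applied, not a
copy of the helper body. It assumes a two's-complement target, as
QEMU does.

```c
#include <stdint.h>

/*
 * Hypothetical: what happens if a byte mask is held in a signed
 * element type and later used in a wider expression. The value is
 * sign-extended, so the upper bytes become all-ones.
 */
static uint64_t widen_signed16(uint16_t bytemask)
{
    int16_t as_signed = (int16_t)bytemask;
    return (uint64_t)as_signed;
}

/*
 * Hypothetical merge: take result bytes where the mask byte is 0xff,
 * keep the old destination bytes elsewhere. With an unsigned 64-bit
 * mask this behaves the same for 1, 2, 4 and 8 byte elements.
 */
static uint64_t merge_predicated(uint64_t old, uint64_t result,
                                 uint64_t bytemask)
{
    return (old & ~bytemask) | (result & bytemask);
}
```

With a uint64_t mask the complement and the AND are done at a known
unsigned width, so there is no dependence on how TYPE promotes.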

> > +    if (!mve_eci_check(s)) {
> > +        return true;
> > +    }
> > +
> > +    if (!vfp_access_check(s)) {
> > +        return true;
> > +    }
>
> Not the first instance, but is it worth saving 4 lines per and combining these
> into one IF?

Yes, I think so.
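
Roughly like this, as a sketch only. The DisasContext here is a
stand-in struct with made-up fields, the two check functions are
stubs with the same names as the real ones, and trans_example() is a
hypothetical trans function. The point is just that || short-circuits,
so vfp_access_check() is still only called when mve_eci_check()
passes, same as with the two separate ifs.

```c
#include <stdbool.h>

/* Stand-in for the real DisasContext; fields are for illustration. */
typedef struct DisasContext {
    bool eci_ok;
    bool vfp_ok;
    int vfp_calls;      /* counts calls to the stubbed vfp check */
} DisasContext;

static bool mve_eci_check(DisasContext *s)
{
    return s->eci_ok;
}

static bool vfp_access_check(DisasContext *s)
{
    s->vfp_calls++;
    return s->vfp_ok;
}

/* Hypothetical trans function using the combined check. */
static bool trans_example(DisasContext *s, int *emitted)
{
    if (!mve_eci_check(s) || !vfp_access_check(s)) {
        return true;
    }
    (*emitted)++;       /* stands in for emitting the helper call */
    return true;
}
```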

-- PMM


