qemu-arm

Re: [PATCH 32/55] target/arm: Implement MVE VRMLALDAVH, VRMLSLDAVH


From: Peter Maydell
Subject: Re: [PATCH 32/55] target/arm: Implement MVE VRMLALDAVH, VRMLSLDAVH
Date: Mon, 14 Jun 2021 11:19:43 +0100

On Wed, 9 Jun 2021 at 02:05, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 6/7/21 9:57 AM, Peter Maydell wrote:
> > +#define DO_LDAVH(OP, ESIZE, TYPE, H, XCHG, EVENACC, ODDACC, TO128)      \
> > +    uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn,         \
> > +                                    void *vm, uint64_t a)               \
> > +    {                                                                   \
> > +        uint16_t mask = mve_element_mask(env);                          \
> > +        unsigned e;                                                     \
> > +        TYPE *n = vn, *m = vm;                                          \
> > +        Int128 acc = TO128(a);                                          \
>
> This seems to miss the << 8.

Oops, yes it does.

> Which suggests that the whole thing can be done without Int128:
>
> > +        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {              \
> > +            if (mask & 1) {                                             \
> > +                if (e & 1) {                                            \
> > +                    acc = ODDACC(acc, TO128(n[H(e - 1 * XCHG)] * m[H(e)])); \
>
>    tmp = n * m;
>    tmp = (tmp >> 8) + ((tmp >> 7) & 1);
>    acc ODDACC tmp;

I'm not sure about this suggestion though. It throws away all
of the bottom 7 bits of the product, but because we're iterating through
this 4 times and adding (potentially) four of these products together,
those bottom 7 bits in the 4 products might be able to add together
to become significant enough to affect the final result.

-- PMM
