Re: [PATCH 13/55] target/arm: Implement MVE VCLZ

qemu-arm

From:	Richard Henderson
Subject:	Re: [PATCH 13/55] target/arm: Implement MVE VCLZ
Date:	Thu, 10 Jun 2021 07:03:20 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

On 6/10/21 5:40 AM, Peter Maydell wrote:

>>> +#define DO_1OP(OP, ESIZE, TYPE, H, FN)                                  \
>>> +    void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm)         \
>>> +    {                                                                   \
>>> +        TYPE *d = vd, *m = vm;                                          \
>>> +        uint16_t mask = mve_element_mask(env);                          \
>>> +        unsigned e;                                                     \
>>> +        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {              \
>>> +            TYPE r = FN(m[H(e)]);                                       \
>>> +            uint64_t bytemask = mask_to_bytemask##ESIZE(mask);          \
>>
>> Why uint64_t and not TYPE?  Or uint32_t?
>
> A later patch adds the mask_to_bytemask8(), so I wanted
> a type that was definitely unsigned (so TYPE isn't any good)
> and which was definitely big enough for 64 bits.


Hmm.  I was just concerned about the unnecessary type extension involved.

What about changing the interface? Not to return a mask as you do here, but to perform the entire merge operation. E.g.


static uint8_t mergemask1(uint8_t d, uint8_t r, uint16_t mask)
{
    return mask & 1 ? r : d;
}

static uint16_t mergemask2(uint16_t d, uint16_t r, uint16_t mask)
{
    uint16_t bmask = array_whotsit[mask & 3];
    return (d & ~bmask) | (r & bmask);
}

etc.
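
For concreteness, here is a minimal standalone sketch of the by-value variant. It assumes the MVE predicate mask carries one bit per byte, with bit 0 covering the lowest byte of the element. The table name `bytemask2_table` is a hypothetical stand-in for the "array_whotsit" placeholder above, and `mergemask4` is an illustrative extension that the original mail only implies with "etc."

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for "array_whotsit": expands two predicate bits
 * (one per byte of a 16-bit element) into a 16-bit byte mask.
 */
static const uint16_t bytemask2_table[4] = {
    0x0000, 0x00ff, 0xff00, 0xffff,
};

/* 8-bit element: a single predicate bit selects the new or old value. */
static uint8_t mergemask1(uint8_t d, uint8_t r, uint16_t mask)
{
    return (mask & 1) ? r : d;
}

/* 16-bit element: each of the two low mask bits guards one byte. */
static uint16_t mergemask2(uint16_t d, uint16_t r, uint16_t mask)
{
    uint16_t bmask = bytemask2_table[mask & 3];
    return (d & ~bmask) | (r & bmask);
}

/* 32-bit element (illustrative): build the byte mask bit by bit. */
static uint32_t mergemask4(uint32_t d, uint32_t r, uint16_t mask)
{
    uint32_t bmask = 0;
    for (unsigned i = 0; i < 4; i++) {
        if (mask & (1u << i)) {
            bmask |= 0xffu << (8 * i);
        }
    }
    return (d & ~bmask) | (r & bmask);
}
```

With this shape the caller never sees the byte mask, so the question of which integer type should hold it goes away.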

Or maybe with a pointer argument for D, so that the load+store is done there as well. In which case you could use QEMU_GENERIC to select the function invoked, instead of using token pasting everywhere. E.g.


static void mergemask_ub(uint8_t *d, uint8_t r, uint16_t mask)
{
    if (mask & 1) {
        *d = r;
    }
}

static void mergemask_sb(int8_t *d, int8_t r, uint16_t mask)
{
    mergemask_ub((uint8_t *)d, r, mask);
}

static void mergemask_uh(uint16_t *d, uint16_t r, uint16_t mask)
{
    uint16_t bmask = array_whotsit[mask & 3];
    *d = (*d & ~bmask) | (r & bmask);
}

...

#define mergemask(D, R, M) \
    QEMU_GENERIC(D, (uint8_t *, mergemask_ub), \
                    (int8_t *,  mergemask_sb), \
                    ... )

BTW, now that we're at minimal gcc 7, I think we can shift to -std=gnu11 so that we can drop QEMU_GENERIC and just use _Generic, which is much easier to read than the above, and will give better error messages for missing cases. Anyway...
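
As a rough illustration of the C11 `_Generic` form, the dispatch macro might look like the sketch below. This is not the code from the series: `bytemask2_table` is a hypothetical name for the lookup table, `mergemask_sh` is added here only to round out the example, and only the 8-bit and 16-bit cases are covered.

```c
#include <stdint.h>

/* Hypothetical table: two predicate bits -> 16-bit byte mask. */
static const uint16_t bytemask2_table[4] = {
    0x0000, 0x00ff, 0xff00, 0xffff,
};

static void mergemask_ub(uint8_t *d, uint8_t r, uint16_t mask)
{
    if (mask & 1) {
        *d = r;
    }
}

static void mergemask_sb(int8_t *d, int8_t r, uint16_t mask)
{
    mergemask_ub((uint8_t *)d, (uint8_t)r, mask);
}

static void mergemask_uh(uint16_t *d, uint16_t r, uint16_t mask)
{
    uint16_t bmask = bytemask2_table[mask & 3];
    *d = (*d & ~bmask) | (r & bmask);
}

static void mergemask_sh(int16_t *d, int16_t r, uint16_t mask)
{
    mergemask_uh((uint16_t *)d, (uint16_t)r, mask);
}

/*
 * Select the helper from the static type of the destination pointer.
 * A pointer type with no matching association is a compile-time error.
 */
#define mergemask(D, R, M)                      \
    _Generic((D),                               \
             uint8_t *:  mergemask_ub,          \
             int8_t *:   mergemask_sb,          \
             uint16_t *: mergemask_uh,          \
             int16_t *:  mergemask_sh)(D, R, M)
```

A call such as `mergemask(&d[i], value, mask)` then picks the right helper without any `##ESIZE` token pasting at the call site.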


Which takes your boilerplate down to

+#define DO_1OP(OP, ESIZE, TYPE, H, FN)                                  \
+    void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm)         \
+    {                                                                   \
+        TYPE *d = vd, *m = vm;                                          \
+        uint16_t mask = mve_element_mask(env);                          \
+        for (unsigned e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {     \
+            mergemask(&d[H(e)], FN(m[H(e)]), mask);                     \
+        }                                                               \
+        mve_advance_vpt(env);                                           \
+    }


which looks pretty tidy to me.
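
To show how the simplified loop behaves under predication, here is a toy, standalone version for 16-bit elements. It is only a sketch under stated assumptions: there is no CPUARMState, the predicate mask is passed in directly instead of coming from mve_element_mask(), the host is treated as little-endian so H(e) is the identity, and `bytemask2_table`, `clz16` and `toy_vclzh` are hypothetical names used for illustration.

```c
#include <stdint.h>

/* Hypothetical table: two predicate bits -> 16-bit byte mask. */
static const uint16_t bytemask2_table[4] = {
    0x0000, 0x00ff, 0xff00, 0xffff,
};

static void mergemask_uh(uint16_t *d, uint16_t r, uint16_t mask)
{
    uint16_t bmask = bytemask2_table[mask & 3];
    *d = (*d & ~bmask) | (r & bmask);
}

/* Count leading zeros in a 16-bit value; yields 16 for an input of zero. */
static uint16_t clz16(uint16_t x)
{
    uint16_t n = 0;
    for (uint16_t bit = 0x8000; bit && !(x & bit); bit >>= 1) {
        n++;
    }
    return n;
}

/*
 * Toy equivalent of the simplified loop with ESIZE == 2: eight 16-bit
 * elements in a 16-byte vector, two predicate bits consumed per element.
 */
static void toy_vclzh(uint16_t d[8], const uint16_t m[8], uint16_t mask)
{
    for (unsigned e = 0; e < 16 / 2; e++, mask >>= 2) {
        mergemask_uh(&d[e], clz16(m[e]), mask);
    }
}
```

With a mask of 0x013f, elements 0 to 2 are written in full, element 3 is left untouched, and only the low byte of element 4 is updated, which is the per-byte merge behaviour the helper is meant to hide from the loop body.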


r~
