Re: [Qemu-arm] [RFC PATCH v1 2/2] target-arm: Use Neon for zero checking
From: Peter Maydell
Subject: Re: [Qemu-arm] [RFC PATCH v1 2/2] target-arm: Use Neon for zero checking
Date: Tue, 5 Apr 2016 17:01:17 +0100
On 5 April 2016 at 16:21, Paolo Bonzini <address@hidden> wrote:
> But in theory it should be enough to add a new #elif branch like this:
>
> #include "arm_neon.h"
> #define VECTYPE uint64x2_t
> #define VEC_OR(a, b) ((a) | (b))
> #define ALL_EQ(a, b) /* ??? :) */
#define ALL_EQ(a, b) (vgetq_lane_u64(a, 0) == vgetq_lane_u64(b, 0) && \
                      vgetq_lane_u64(a, 1) == vgetq_lane_u64(b, 1))
will do, I think. It's probably suboptimal as a true vector compare, but
it works OK here because we're only interested in comparing against
constant zero; the compiler generates "load 64-bit value from vector
register to integer; cbnz" for each half of the vector.
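For illustration, here is a minimal standalone sketch of how those three
macros could fit together in a zero check. This is not the actual QEMU
code: the function name sketch_buffer_is_zero, the sketch_u64x2 typedef,
and the non-NEON fallback (which relies on GCC/Clang vector extensions so
the sketch also builds on other hosts) are all assumptions made for the
example. It also ORs the whole buffer before comparing, so unlike an
unrolled loop with periodic checks it has no early exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
/* NEON branch, using the macros proposed in this thread. */
#include <arm_neon.h>
#define VECTYPE uint64x2_t
#define VEC_OR(a, b) ((a) | (b))
#define ALL_EQ(a, b) (vgetq_lane_u64(a, 0) == vgetq_lane_u64(b, 0) && \
                      vgetq_lane_u64(a, 1) == vgetq_lane_u64(b, 1))
#else
/* Hypothetical fallback (assumption): GCC/Clang generic vector type. */
typedef uint64_t sketch_u64x2 __attribute__((vector_size(16)));
#define VECTYPE sketch_u64x2
#define VEC_OR(a, b) ((a) | (b))
#define ALL_EQ(a, b) ((a)[0] == (b)[0] && (a)[1] == (b)[1])
#endif

/*
 * Hypothetical helper: returns 1 if all len bytes at buf are zero,
 * 0 otherwise. len must be a multiple of sizeof(VECTYPE) (16 bytes).
 */
int sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    VECTYPE zero, acc;
    size_t i;

    assert(len % sizeof(VECTYPE) == 0);

    memset(&zero, 0, sizeof(zero));
    acc = zero;

    for (i = 0; i < len; i += sizeof(VECTYPE)) {
        VECTYPE v;
        memcpy(&v, p + i, sizeof(v)); /* alignment-safe vector load */
        acc = VEC_OR(acc, v);         /* accumulate any set bits */
    }

    /* One compare against constant zero at the end. */
    return ALL_EQ(acc, zero);
}
```

Calling sketch_buffer_is_zero(buf, 64) on a zero-filled 64-byte buffer
should return 1, and setting any single byte should make it return 0.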
It would be worth benchmarking that (and the variant where we keep the
plain C code but raise the loop unroll factor to 16) against the
handwritten intrinsics version.
thanks
-- PMM