From: Paolo Bonzini
Subject: Re: [Qemu-arm] [PATCH v3 1/1] target-arm: Use Neon for zero checking
Date: Wed, 29 Jun 2016 14:53:36 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1

On 29/06/2016 10:47, address@hidden wrote:
> From: Vijay <address@hidden>
>
> Use Neon instructions to perform zero checking of a
> buffer. This helps reduce total migration time.
>
> Use case: live migration of an idle VM with 4 vCPUs and 8 GB of RAM
> running CentOS 7.
>
> Without Neon, the total migration time is 3.56 seconds:
>
> Migration status: completed
> total time: 3560 milliseconds
> downtime: 33 milliseconds
> setup: 5 milliseconds
> transferred ram: 297907 kbytes
> throughput: 685.76 mbps
> remaining ram: 0 kbytes
> total ram: 8519872 kbytes
> duplicate: 2062760 pages
> skipped: 0 pages
> normal: 69808 pages
> normal bytes: 279232 kbytes
> dirty sync count: 3
>
> With Neon, the total migration time is 2.96 seconds:
>
> Migration status: completed
> total time: 2960 milliseconds
> downtime: 65 milliseconds
> setup: 4 milliseconds
> transferred ram: 299869 kbytes
> throughput: 830.19 mbps
> remaining ram: 0 kbytes
> total ram: 8519872 kbytes
> duplicate: 2064313 pages
> skipped: 0 pages
> normal: 70294 pages
> normal bytes: 281176 kbytes
> dirty sync count: 3
>
> Signed-off-by: Vijaya Kumar K <address@hidden>
> Signed-off-by: Suresh <address@hidden>
> ---
> util/cutils.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/util/cutils.c b/util/cutils.c
> index 5830a68..4779403 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -184,6 +184,13 @@ int qemu_fdatasync(int fd)
> #define SPLAT(p) _mm_set1_epi8(*(p))
> #define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
> #define VEC_OR(v1, v2) (_mm_or_si128(v1, v2))
> +#elif __aarch64__
> +#include "arm_neon.h"
> +#define VECTYPE uint64x2_t
> +#define ALL_EQ(v1, v2) \
> + ((vgetq_lane_u64(v1, 0) == vgetq_lane_u64(v2, 0)) && \
> + (vgetq_lane_u64(v1, 1) == vgetq_lane_u64(v2, 1)))
> +#define VEC_OR(v1, v2) ((v1) | (v2))
> #else
> #define VECTYPE unsigned long
> #define SPLAT(p) (*(p) * (~0UL / 255))
>
Acked-by: Paolo Bonzini <address@hidden>
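
For readers following the thread, here is a rough stand-alone sketch of how
macros like the ones in this hunk can drive a zero check. It is not the code
in util/cutils.c: the function name sketch_buffer_is_zero, the four-vector
unrolling, and the memcpy-based loads are assumptions made for illustration.
The aarch64 branch copies the macros from the patch; the fallback branch uses
a plain unsigned long so the sketch also builds on other hosts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>
/* Macros as added by the patch: one 128-bit vector of two 64-bit lanes. */
#define VECTYPE        uint64x2_t
#define ALL_EQ(v1, v2) \
    ((vgetq_lane_u64(v1, 0) == vgetq_lane_u64(v2, 0)) && \
     (vgetq_lane_u64(v1, 1) == vgetq_lane_u64(v2, 1)))
#define VEC_OR(v1, v2) ((v1) | (v2))
#else
/* Scalar fallback so the sketch compiles on non-aarch64 hosts. */
#define VECTYPE        unsigned long
#define ALL_EQ(v1, v2) ((v1) == (v2))
#define VEC_OR(v1, v2) ((v1) | (v2))
#endif

/*
 * Hypothetical helper (not a QEMU function): return true if the first
 * len bytes of buf are all zero.
 */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *bytes = buf;
    const size_t step = 4 * sizeof(VECTYPE);
    VECTYPE zero;
    size_t i = 0;

    assert(buf != NULL || len == 0);
    memset(&zero, 0, sizeof(zero));

    /* Main loop: OR four vectors together, then compare against zero once. */
    for (; i + step <= len; i += step) {
        VECTYPE v[4];
        VECTYPE acc;

        /* memcpy avoids any alignment requirement on buf. */
        memcpy(v, bytes + i, sizeof(v));
        acc = VEC_OR(VEC_OR(v[0], v[1]), VEC_OR(v[2], v[3]));
        if (!ALL_EQ(acc, zero)) {
            return false;
        }
    }

    /* Tail: check any remaining bytes one at a time. */
    for (; i < len; i++) {
        if (bytes[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Calling sketch_buffer_is_zero(page, 4096) on a guest page would return true
only for an all-zero page, which is the kind of page the migration statistics
above count as "duplicate". ORing several vectors before a single comparison
keeps the number of lane extractions low, which is presumably why the patch
defines VEC_OR alongside ALL_EQ.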