qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH v4 10/10] util/bufferiszero: Add sve acceleration for aar


From: Alex Bennée
Subject: Re: [RFC PATCH v4 10/10] util/bufferiszero: Add sve acceleration for aarch64
Date: Fri, 16 Feb 2024 11:05:24 +0000
User-agent: mu4e 1.11.28; emacs 29.1

Richard Henderson <richard.henderson@linaro.org> writes:

> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>
> RFC because I've not benchmarked this on real hw, only run it
> through qemu for validation.
>
<snip>
>  
> +#ifdef CONFIG_SVE_OPT
> +static unsigned accel_index;
> +static void __attribute__((constructor)) init_accel(void)
> +{
> +    accel_index = (cpuinfo & CPUINFO_SVE ? 2 : 1);
> +    buffer_is_zero_accel = accel_table[accel_index];
> +}

This really needs to be:

  -    accel_index = (cpuinfo & CPUINFO_SVE ? 2 : 1);
  +    unsigned info = cpuinfo_init();
  +    accel_index = (info & CPUINFO_SVE ? 2 : 1);

because otherwise you are relying on constructor initialisation order
and on the Graviton 3 I built on it wasn't detecting the SVE. With that I
get this from "perf record ./tests/unit/test-bufferiszero -m thorough"

  51.17%  test-bufferisze  test-bufferiszero      [.] buffer_is_zero_sve
  18.92%  test-bufferisze  test-bufferiszero      [.] buffer_is_zero_simd
  18.02%  test-bufferisze  test-bufferiszero      [.] buffer_is_zero_int_ge256
   7.67%  test-bufferisze  test-bufferiszero      [.] buffer_is_zero_ool
   4.09%  test-bufferisze  test-bufferiszero      [.] test_1

but as I mentioned before it would be nice to have a proper benchmark
for the buffer utils as I'm sure the unit test would be prone to noise.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



reply via email to

[Prev in Thread] Current Thread [Next in Thread]