guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[address@hidden: Bug#956418: src:glibc: Please provide optimized builds


From: Efraim Flashner
Subject: [address@hidden: Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1]
Date: Sat, 11 Apr 2020 22:21:02 +0300

A copy of a bug report sent to debian-arm. I'll try to keep an eye on it
to see if this is something we want to do for our aarch64 builds. I
don't have any armv-8.1 boards to test this out on.


-- 
Efraim Flashner   <address@hidden>   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
--- Begin Message --- Subject: Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1 Date: Fri, 10 Apr 2020 13:16:31 -0700 User-agent: Mutt/1.9.4 (2018-02-28)
Package: src:glibc
Version: 2.30-4
Severity: wishlist
X-Debbugs-CC: address@hidden

The ARMv8.1 spec, as implemented by the ARM Neoverse N1 processor,
introduces a set of instructions [1] that result in significant performance
improvements for multithreaded applications.  Sample code demonstrating the
performance improvements is attached.  When run on a 16-core Neoverse N1
host with glibc 2.30-4, runtimes vary significantly, ranging from lows
around 250ms to highs around 15 seconds.  When linked against glibc rebuilt
with support for these instructions, runtimes are consistently <50ms.
Significant performance impact has also been observed in less contrived
cases (MariaDB and Postgres), but I don't have a repro to share.

Gcc provides two ways to enable support for these instructions at build
time.  The simplest, and least disruptive, is to enable -moutline-atomics
globally in the arm64 glibc build.  As described at [2], this option enables
runtime checks for the availability of the atomic instructions.  If found,
they are used, otherwise ARMv8.0 compatible code is used.  The drawback of
this option is that the check happens at runtime, thus introducing some
overhead on all arm64 installations.

The second option is to provide libraries built with explicit support for
the ARM v8.1a spec via the -march=armv8.1-a flag.  This option is also
described at [2].  This build would be incompatible with earlier versions of
the spec, so it would need to be provided in a location where the linker
will automatically discover it if it is usable (e.g.
/lib/aarch64-linux-gnu/atomics/).  This does not incur any runtime overhead,
but obviously involves an additional libc build, and the corresponding
complixity and disk space utilization.  I'm not sure if this is an option
that the glibc maintainers are interested in pursuing.

I've tested both options and found them to be acceptable on v8.1a (Neoverse
N1) and v8a (Cortex A72) CPUs.  I can provide bulk test run data of the
various different configuration permutations if you'd like to see additional
data.

I can provide patches or merge requests implementing either option, at least
for a starting point, if you'd like to see them.

Thanks!
noah

1. https://static.docs.arm.com/ddi0557/a/DDI0557A_b_armv8_1_supplement.pdf
   Section B1
2. https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

Attachment: a.c
Description: Text Data


--- End Message ---

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]