

From: Gustavo Serra Scalet
Subject: RE: Bug#854053: coreutils: improve 2x-3x sha256sum performance on ppc64le due to current gcc optimization bug
Date: Fri, 3 Feb 2017 18:37:57 +0000


> -----Original Message-----
> From: Michael Stone [mailto:address@hidden]
> Sent: Friday, February 3, 2017 11:38
> To: Gustavo Serra Scalet <address@hidden>
> Cc: address@hidden; address@hidden
> Subject: Re: Bug#854053: coreutils: improve 2x-3x sha256sum performance
> on ppc64le due to current gcc optimization bug
> 
> On Fri, Feb 03, 2017 at 11:22:28AM -0200, Gustavo Serra Scalet wrote:
> >The sha256sum provided by coreutils (without openssl) performs poorly
> >when built with gcc versions >= 4.9 and < 7.0 (7.0 is currently under
> >development). The reason is the -fschedule-insns optimization that is
> >enabled by -O2. Simply deactivating it gives a performance improvement
> >of 2 to 3 times.
> >
> >I'm attaching a patch that demonstrates that behavior but it lacks
> >these conditions:
> >* If ppc64le
> >* If gcc being used is >= 4.9 and < 7.0
> >
> >Notes:
> >1) gcc-7 is not affected by this bug (verified on 20170129 snapshot).
> >2) clang is not affected by this bug (verified on v3.8 and v3.9).
> >3) strangely the sha512 is not affected by this.
> 
> Sounds good in theory, just needs conditionals. :) For debian we can
> just assume the compiler version and remove the patch when a newer
> compiler is available as a dependency, but making it conditional on ppc
> seems essential. Upstream would probably want some kind of compiler
> based test.

Seems good to me.
 
> My understanding is that just adding this option to cflags
> when building on the appropriate architecture wouldn't work, because it
> would negatively impact other code, is that correct?

Yes, I wouldn't add this CFLAG for the whole package, but only for the 
lib/sha256.o target (that's what my patch is doing).
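
To make the "only for this target" idea concrete, here is a minimal sketch of
the selection logic as a standalone POSIX shell function. The function name
extra_cflags_for is hypothetical and not part of the patch; it uses `case`
rather than `==` inside `[`, since `==` is not specified by POSIX for `test`.

```shell
# Hypothetical helper (not in the patch): print the extra compiler flags
# for one object file. Only lib/sha256.o gets -fno-schedule-insns; every
# other target prints nothing, so its compile line is unchanged.
extra_cflags_for() {
  case "$1" in
    lib/sha256.o) printf '%s\n' "-fno-schedule-insns" ;;
    *) : ;;
  esac
}

# Usage: the second call prints an empty string between the brackets.
printf 'sha256: [%s]\n' "$(extra_cflags_for lib/sha256.o)"
printf 'sha512: [%s]\n' "$(extra_cflags_for lib/sha512.o)"
```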

For that target, I see better performance without that optimization than with
it. Normally nobody turns off an optimization to improve performance but, as I
said, this is a bug that affects this target with gcc versions >= 4.9 and
< 7.0.

If you see a different way out, or a neater approach, please advise.
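
As a rough illustration of the two missing conditions (ppc64le, and gcc >= 4.9
and < 7.0), the check could be sketched in POSIX shell as below. The function
name gcc_needs_workaround is hypothetical, and I am assuming the machine name
comes from something like `uname -m` and the version string from something
like `gcc -dumpversion`. This is only a sketch of the condition, not a
proposal for how configure should do it.

```shell
# Hypothetical helper: succeed (exit status 0) only when the workaround
# applies, i.e. machine is ppc64le and 4.9 <= gcc version < 7.0.
#   $1 = machine name (assumed to come from e.g. `uname -m`)
#   $2 = gcc version  (assumed to come from e.g. `gcc -dumpversion`)
gcc_needs_workaround() {
  [ "$1" = "ppc64le" ] || return 1

  # Split "MAJOR.MINOR[.PATCH]"; a bare "MAJOR" is treated as MAJOR.0.
  _major=${2%%.*}
  case "$2" in
    *.*) _rest=${2#*.}; _minor=${_rest%%.*} ;;
    *)   _minor=0 ;;
  esac

  # Reject anything that is not a plain non-negative integer.
  case "$_major" in ''|*[!0-9]*) return 1 ;; esac
  case "$_minor" in ''|*[!0-9]*) return 1 ;; esac

  if [ "$_major" -eq 4 ]; then
    [ "$_minor" -ge 9 ]
  else
    [ "$_major" -ge 5 ] && [ "$_major" -lt 7 ]
  fi
}

# Usage: only add the flag when the check succeeds.
if gcc_needs_workaround ppc64le 6.3.0; then
  echo "would add -fno-schedule-insns for lib/sha256.o"
fi
```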

> 
> Mike Stone
> 
> (additional context quoted below)
> 
> >Below a demonstration of how it performs:
> >
> >===================================================
> >$ (./configure && make -j9) > /dev/null && time src/sha256sum ~/ubuntu-16.10-server-ppc64el.iso
> >configure: WARNING: libacl development library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without ACL support.
> >configure: WARNING: libattr development library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without xattr support.
> >configure: WARNING: libcap library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without capability support.
> >configure: WARNING: libgmp development library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without GMP support.
> >src/who.c: In function 'print_user':
> >src/who.c:454:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
> >         int   *a = utmp_ent->ut_addr_v6;
> >                    ^~~~~~~~
> >d14bdb413ea6cdc8d9354fcbc37a834b7de0c23f992deb0c6764d0fd5d65408e  /home/gut/ubuntu-16.10-server-ppc64el.iso
> >
> >real    0m18.670s
> >user    0m16.566s
> >sys     0m0.745s
> >
> >$ # now with the following patch:
> >$ diff Makefile.in ../Makefile.in
> >8989c8989
> >< @am__fastdepCC_TRUE@  $(COMPILE) -MT $@ -MD -MP -MF $$depbase.Tpo -c -o $@ $< &&\
> >---
> >> @am__fastdepCC_TRUE@  $(COMPILE) $$([ "$@" == "lib/sha256.o" ] && echo "-fno-schedule-insns") -MT $@ -MD -MP -MF $$depbase.Tpo -c -o $@ $< &&\
> >$ cp ../Makefile.in Makefile.in
> >$ (./configure && make -j9) > /dev/null && time src/sha256sum ~/ubuntu-16.10-server-ppc64el.iso
> >configure: WARNING: libacl development library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without ACL support.
> >configure: WARNING: libattr development library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without xattr support.
> >configure: WARNING: libcap library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without capability support.
> >configure: WARNING: libgmp development library was not found or not usable.
> >configure: WARNING: GNU coreutils will be built without GMP support.
> >src/who.c: In function 'print_user':
> >src/who.c:454:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
> >         int   *a = utmp_ent->ut_addr_v6;
> >                    ^~~~~~~~
> >d14bdb413ea6cdc8d9354fcbc37a834b7de0c23f992deb0c6764d0fd5d65408e  /home/gut/ubuntu-16.10-server-ppc64el.iso
> >
> >real    0m5.903s
> >user    0m5.560s
> >sys     0m0.255s
> 
> >--- Makefile.in      2016-11-30 16:34:55.000000000 -0200
> >+++ ../Makefile.in   2017-02-03 09:33:17.936000000 -0200
> >@@ -8986,7 +8986,7 @@
> >
> > .c.o:
> > @am__fastdepCC_TRUE@        $(AM_V_CC)depbase=`echo $@ | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.o$$||'`;\
> >-@am__fastdepCC_TRUE@        $(COMPILE) -MT $@ -MD -MP -MF $$depbase.Tpo -c -o $@ $< &&\
> >+@am__fastdepCC_TRUE@  $(COMPILE) $$([ "$@" == "lib/sha256.o" ] && echo "-fno-schedule-insns") -MT $@ -MD -MP -MF $$depbase.Tpo -c -o $@ $< &&\
> > @am__fastdepCC_TRUE@        $(am__mv) $$depbase.Tpo $$depbase.Po
> > @AMDEP_TRUE@@am__fastdepCC_FALSE@   $(AM_V_CC)source='$<' object='$@' libtool=no @AMDEPBACKSLASH@
> > @AMDEP_TRUE@@am__fastdepCC_FALSE@   DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
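
For reference, the ratio between the two quoted runs can be worked out from
the `time` output above. This is a small sketch; the helper name speedup is
hypothetical, and the numbers are copied from the quoted transcript.

```shell
# Hypothetical helper: print the ratio before/after with two decimals.
# LC_ALL=C keeps awk using "." as the decimal separator.
speedup() {
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.2f\n", before / after }'
}

# Timings copied from the two runs quoted above (in seconds).
echo "real: $(speedup 18.670 5.903)x"   # prints "real: 3.16x"
echo "user: $(speedup 16.566 5.560)x"   # prints "user: 2.98x"
```

So for this one file on this one machine the patched build is roughly 3 times
faster.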
