Re: Parallelization of shell scripts for 'configure' etc.
From: Alex Ameen
Subject: Re: Parallelization of shell scripts for 'configure' etc.
Date: Thu, 16 Jun 2022 21:44:27 -0500
Python and Perl carry a massive dependency closure; notably, this closure
depends on `autoconf` itself, so using Python or Perl in `autoconf`
creates a very large bootstrap paradox. Projects that aren't members of
the Perl/Python closure could still take advantage of those tools, though.
We recently hit a bootstrap paradox like this in `libtool` with the
`file(1)` command. It was not irreconcilable, but distro maintainers were
understandably concerned about its impact on reproducibility.
On Thu, Jun 16, 2022, 6:08 PM Demi Marie Obenour <demiobenour@gmail.com>
wrote:
> On 6/14/22 16:36, Richard Purdie wrote:
> > On Tue, 2022-06-14 at 13:11 -0400, Nick Bowler wrote:
> >> The resulting config.h is correct but pa.sh took almost 1 minute to run
> >> the configure script, about ten times longer than dash takes to run the
> >> same script. More than half of that time appears to be spent just
> >> loading the program into pa.sh, before a single shell command is
> >> actually executed.
> >
> > Thanks for sharing that, it saves me looking into it!
> >
> > I work on a cross compiling build environment (Yocto Project) and we
> > find that a large percentage of our build times (20%?) are in the
> > configure stage, either running autoreconf or configure with a 50/50
> > split between the two. We autoreconf since we change the macros in some
> > cases, e.g. libtool.
> >
> > I would love to find a way to be more efficient about this part of our
> > builds. We do already provide some cached values for some macros to try
> > and be a little more efficient.
> >
> > When I've profiled things, most of the time seems to be "fork" overhead
> > of builds having to fork new processes to run shell command pipelines.
> > I have sometimes wondered if we couldn't make code which was more
> > optimised to the common case and didn't have so much forking going on.
>
> I wonder if one could implement a shell that only created a new
> process when it absolutely had to, and which implemented many of the
> common text processing tools as builtin commands. Subshells would
> be implemented via user-level copy-on-write, rather than relying on
> OS support for fork().
>
> Another approach would be to generate Python or Perl scripts
> in addition to shell scripts, allowing the use of the respective
> interpreters when available. In my experience that is basically all
> the time.
>
> Finally, a small but probably noticeable improvement would come
> from dropping support for ancient platforms, such as Ultrix. A much
> bigger win would be to use Bash or Zsh if they are installed, as that
> allows using modern shell tricks (such as [[ "$a" =~ [0-9]+ ]] and
> "${a//a/b}") that do not require forking new processes.
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
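
For reference, the two Bash constructs mentioned in the quoted message can
be sketched as below. This is only an illustration, assuming Bash is the
running shell; the variable names and the sample value are made up for the
example and do not come from any real `configure` script. The commented-out
lines show the kind of portable pipeline each construct would stand in for.

```bash
#!/usr/bin/env bash
# Sketch: in-process string handling with Bash builtins.
# `version` and its value are hypothetical sample data.
version="gcc-12.3.0"

# Portable form (forks a subshell and runs sed):
#   first_number=$(echo "$version" | sed 's/[^0-9]*\([0-9][0-9]*\).*/\1/')
# Bash form: [[ =~ ]] matches inside the shell process, and the matched
# text is stored in the BASH_REMATCH array.
if [[ "$version" =~ [0-9]+ ]]; then
    first_number=${BASH_REMATCH[0]}
else
    first_number=
fi

# Portable form (forks a subshell and runs sed):
#   underscored=$(echo "$version" | sed 's/\./_/g')
# Bash form: ${var//pattern/replacement} replaces every match in-process.
underscored=${version//./_}

printf '%s\n' "$first_number" "$underscored"
```

Neither Bash form starts an external program, which is the saving the
quoted message is after. Both are Bash-specific as written (`BASH_REMATCH`
in particular), so a script using them would still need a portable
fallback for other shells.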