autoconf-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: parallel autotest [2/3]: Implement 'testsuite --jobs'.


From: Ralf Wildenhues
Subject: Re: parallel autotest [2/3]: Implement 'testsuite --jobs'.
Date: Thu, 2 Oct 2008 08:21:16 +0200
User-agent: Mutt/1.5.18 (2008-05-17)

Hi Eric,

* Eric Blake wrote on Tue, Sep 30, 2008 at 04:16:33AM CEST:
> According to Ralf Wildenhues on 9/29/2008 3:35 PM:
> >>> address@hidden address@hidden@address@hidden
> >>> address@hidden address@hidden
> >>> address@hidden -j
> >> This second line is redundant, since you used @ovar on the line above.
> > 
> > The first line addresses --jobs=N, the second line addresses -jN.
> > What is redundant there?  (see below at --help output, for more info)
> 
> I guess I didn't phrase that well.  I meant that the second -j line:
> 
> - -j[N]
> - -j
> 
> looks redundant;

D'oh.  I simply misunderstood you.

> I think it could be better rendered as:
> 
> - --jobs[=N]
> - -jN
> - -j
> 
> in the manual (@var, not @ovar), and

Or just
- --jobs[=N]
- -j[N]

omitting the third line.  That's how I did it now, it's consistent with
how other manuals do it (the make manual for example).

> - -j, --jobs[=N]
> 
> in the --help.  But I can also see why you are using -j[N] in the --help,
> so I could still be swayed to go your way.

Naah, I'm fine with '-j, --jobs[=N]'.  Users will find out, I guess.

> >>> +      . "$at_test_source" # AT_JOB_FIFO_FD<&-
> > 
> > that would be nice to have without the '#' but doesn't work this way?
> 
> I did wonder about that comment.  Closing the fd would mean one less
> leaked fd into the test; what went wrong?

It's not portable IIRC.  Unfortunately, I didn't take a note about this
when testing, and I don't remember any more.  I'll try to retest at some
point.

> > Hmm.  I'm starting to think this whole test is a bit too fragile; if the
> > load is too high (for example because you run ./testsuite -jN), then it
> > can fail sometimes.
> 
> Is there some other sort of test, even at -j2, where we can kick off two
> tests that do some handshaking interleaved by sleeps?  Perhaps not; tests
> have to be independent because the testsuite must gracefully fall back on
> a serialized run on deficient platforms.

That's not really a problem: except maybe for DOS, every system I know
of will let some other process run while 'sleep 1' is waiting for its
completion.  No, it's really high load systems that can make trouble.
And high load is not unusual if you run, say,
  make check TESTSUITEFLAGS=-j3

;-)

> I'm certainly not coming up with any bright ideas, which means this
> feature will not get as much test coverage on the platforms where it is
> most needed.  On the other hand, using a parallel test is so enticing that
> I still think it is worth checking in, even if we can't figure out a
> robust test for it.

OK, here's a hopefully more robust test with 16 individual test groups.
I rewrote the comments to contain a lot of the details mentioned before.
Below is the diff over the previous patch.

Some more testing exposed more issues with the patch:
- './testsuite 42 -j' would produce wrongly formatted output
  (one test, and -j without argument)
- './testsuite -j42 1' would hang (more jobs than tests).

These two are fixed and tested below.

However, there are still races in the code.  I found two separate
issues:
- sometimes, a parallel run would fail to output all test results on
  stdout; they would appear in the log file, however.  This happens
  maybe 20% of the time on one host, but never occurred on another.
- I found one hang (among many many test runs) at the end where the
  master process hangs waiting for a token.

All tested on GNU/Linux systems with bash 3.2.39(1).

Unfortunately, I haven't found the time to get to the bottom of either.
The root of both may be the same: output from a subprocess getting lost,
due to a race in the shell's job management, or a bug in libc or the
kernel.  I don't know.  At least I haven't been able to find a bug in
the shell code yet.

I suspect the code will turn up other, similar issues elsewhere.
At some point we may have to decide whether it's worth to go this way
(at least fixing the open source shells seems like a good idea anyway),
or trash it and think about using 'make'.

Anyway, for now I've committed the resulting patch, so that people can
try it out and play with it.  We can still rip it out later, if it turns
out to be too much hassle, but for now there is little chance of
regressing the non-parallel code path.

The full, combined patch should be available here soon:
<http://lists.gnu.org/archive/html/autoconf-commit/2008-10/msg00000.html>

Cheers,
Ralf

diff --git a/NEWS b/NEWS
index a380cc2..c34bedb 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,9 @@ GNU Autoconf NEWS - User visible changes.
 
 ** AC_LANG_ERLANG works once again (regression introduced in 2.61a).
 
+** Autotest testsuites accept an option --jobs[=N] for parallel testing.
+
+
 * Major changes in Autoconf 2.63 (2008-09-09) [stable]
   Released by Eric Blake, based on git versions 2.62.*.
 
@@ -70,8 +73,6 @@ GNU Autoconf NEWS - User visible changes.
 ** Config header templates `#undef UNDEFINED /* comment */' do not lead to
    nested comments any more; regression introduced in 2.62.
 
-** Autotest testsuites accept an option --jobs[=N] for parallel testing.
-
 
 * Major changes in Autoconf 2.62 (2008-04-05) [stable]
   Released by Eric Blake, based on git versions 2.61a.*.
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 4969475..e515a87 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -20983,7 +20983,6 @@ Makefile.
 
 @item address@hidden@address@hidden
 @itemx address@hidden
address@hidden -j
 Run @var{n} tests in parallel, if possible.  If @var{n} is not given,
 run all given tests in parallel.  Note that there should be no space
 before the argument to @option{-j}, as @option{-j @var{number}} denotes
diff --git a/lib/autotest/general.m4 b/lib/autotest/general.m4
index 12bb396..4683df4 100644
--- a/lib/autotest/general.m4
+++ b/lib/autotest/general.m4
@@ -583,7 +583,8 @@ do
          at_jobs=`expr X$at_option : 'X-j\(.*\)'`
        fi
        case $at_jobs in *[[!0-9]]*)
-         AS_ERROR([non-numeric argument to -j/--jobs: $at_jobs]) ;;
+         at_optname=`echo " $at_option" | sed 's/^ //; s/[[0-9=]].*//'`
+         AS_ERROR([non-numeric argument to $at_optname: $at_jobs]) ;;
        esac
        ;;
 
@@ -691,8 +692,8 @@ dnl extra quoting prevents emacs whitespace mode from 
putting tabs in output
 Execution tuning:
   -C, --directory=DIR
 [                 change to directory DIR before starting]
-  -j[[N]], --jobs[[=N]]
-[                 Allow N jobs at once; infinite jobs with no arg]
+  -j, --jobs[[=N]]
+[                 Allow N jobs at once; infinite jobs with no arg (default 1)]
   -k, --keywords=KEYWORDS
 [                 select the tests matching all the comma-separated KEYWORDS]
 [                 multiple \`-k' accumulate; prefixed \`!' negates a KEYWORD]
@@ -1015,6 +1016,12 @@ BEGIN { FS="" }
   AS_ERROR([cannot create test line number cache])
 rm -f "$at_suite_dir/at-source-lines"
 
+# Set number of jobs for `-j'; avoid more jobs than test groups.
+set X $at_groups; shift; address@hidden:@]
+if test $at_jobs -eq 0 || test $at_jobs -gt $at_max_jobs; then
+  at_jobs=$at_max_jobs
+fi
+
 # If parallel mode, don't output banners, don't split summary lines.
 if test $at_jobs -ne 1; then
   at_print_banners=false
@@ -1194,12 +1201,9 @@ if test $at_jobs -ne 1 &&
 then
   # FIFO job dispatcher.
   echo
-  if test $at_jobs -eq 0; then
-    set X $at_groups; shift; address@hidden:@]
-  fi
   # Turn jobs into a list of numbers, starting from 1.
   at_joblist=`AS_ECHO([" $at_groups_all "]) | \
-    sed -e 's/\( '$at_jobs'\) .*/\1/'`
+    sed 's/\( '$at_jobs'\) .*/\1/'`
 
   set X $at_joblist
   shift
diff --git a/tests/autotest.at b/tests/autotest.at
index f5764ac..a20c4ac 100644
--- a/tests/autotest.at
+++ b/tests/autotest.at
@@ -779,20 +779,50 @@ AT_CLEANUP
 
 AT_SETUP([parallel test execution])
 
-# The total number of tests for the parallel test micro-suite,
-# the number of tests to run concurrently.
+# This test tries to ensure that -j runs tests in parallel.
+# Such a test is inherently racy, because there are no real-time
+# guarantees about scheduling delays.  So we try to minimize
+# the chance to lose the race.
+
+# The time needed for a micro-suite consisting of NTESTS tests each
+# sleeping for a second is estimated by
+#   startup + ntests * (serial_overhead + 1 / njobs)
 #
-# The total number should not be too high, to not slow down
-# the testsuite unnecessarily.  If the number of concurrent
-# jobs is too low, the race is lost more easily (see below),
-# if it is too high, fork may fail on tightly limited systems.
-m4_define([AT_PARALLEL_TESTS_TOTAL], [8])
-m4_define([AT_PARALLEL_TESTS_RUN], [4])
+# in absence of major scheduling delays.  This leads to side conditions:
+# - NTESTS should be high, so the STARTUP time is small compared to the
+#   test run time, and scheduling delays can even out; it should not be
+#   too high, to not slow down the testsuite unnecessarily,
+# - the number of concurrent jobs NJOBS should not be too low, so the
+#   race is not lost so easily; it should not be too high, to avoid fork
+#   failures on tightly limited systems.  4 seems a good compromise
+#   here, considering that Autotest spawns several other processes.
+# - STARTUP is assumed to be the same for parallel and serial runs, so
+#   the latter can estimate the former.
+# - To avoid unportable output from time measurement commands, spawn
+#   both a parallel and a serial testsuite run; check that the former
+#   completes before the latter has completed a fraction SERIAL_NTESTS
+#   of the tests (the serial run is executed in a subdirectory), plus
+#   some additional time to allow for compensation of SERIAL_OVERHEAD.
+# - when adding this time to the serial test execution, an initial delay
+#   SERIAL_DELAY of the serial test helps to avoid unreliable scheduling
+#   due to the startup burst of the suites.
+
+dnl total number of tests.
+m4_define([AT_PARALLEL_NTESTS], [16])
+dnl number of jobs to run in parallel.
+m4_define([AT_PARALLEL_NJOBS], [4])
+dnl number of tests to run serially, as comparison.
+m4_define([AT_PARALLEL_SERIAL_NTESTS],
+  m4_eval(AT_PARALLEL_NTESTS / AT_PARALLEL_NJOBS))
+dnl initial delay of serial run, to compensate for SERIAL_OVERHEAD.
+dnl This corresponds to 0.67 s of overhead per test.
+m4_define([AT_PARALLEL_SERIAL_DELAY],
+  m4_eval((AT_PARALLEL_NTESTS - AT_PARALLEL_SERIAL_NTESTS + 1) * 2 / 3))
 
 
 AT_CHECK_AT_PREP([micro-suite],
 [[AT_INIT([suite to test parallel execution])
-m4_for([count], [1], ]]AT_PARALLEL_TESTS_TOTAL[[, [],
+m4_for([count], [1], ]]AT_PARALLEL_NTESTS[[, [],
    [AT_SETUP([test number count])
     AT_CHECK([sleep 1])
     AT_CLEANUP
@@ -804,30 +834,32 @@ AT_CHECK([$CONFIG_SHELL ./micro-suite -j2foo], [1], [], 
[stderr])
 AT_CHECK([grep 'non-numeric argument' stderr], [], [ignore])
 AT_CHECK([$CONFIG_SHELL ./micro-suite --jobs=foo], [1], [], [stderr])
 AT_CHECK([grep 'non-numeric argument' stderr], [], [ignore])
-AT_CHECK([$CONFIG_SHELL ./micro-suite -j[]AT_PARALLEL_TESTS_RUN], [], [stdout])
+AT_CHECK([$CONFIG_SHELL ./micro-suite -j[]AT_PARALLEL_NJOBS], [], [stdout])
 # Ensure that all tests run, and lines are not split.
-AT_CHECK([grep -c '^.\{53\}ok' stdout], [], [AT_PARALLEL_TESTS_TOTAL
+AT_CHECK([grep -c '^.\{53\}ok' stdout], [], [AT_PARALLEL_NTESTS
+])
+# Running one test with -j should produce correctly formatted output:
+AT_CHECK([$CONFIG_SHELL ./micro-suite -j 3], [], [stdout])
+AT_CHECK([grep -c '^.\{53\}ok' stdout], [], [1
+])
+# Specifying more jobs than tests should not hang:
+AT_CHECK([$CONFIG_SHELL ./micro-suite -j3 3], [], [stdout])
+AT_CHECK([grep -c '^.\{53\}ok' stdout], [], [1
 ])
-
-# Ensure we really are faster than sequential execution:
-# the testsuite should have completed before we kill it.
-# Unfortunately, the return value of wait is unreliable,
-# so we check that kill fails.
-#
-# Since the testsuite startup time can be high, we compare
-# it against a sequential testsuite run in a subdirectory,
-# where only a subset of tests is run.
 
 # The parallel scheduler requires mkfifo to work.
 AT_CHECK([mkfifo fifo || exit 77])
 mkdir serial
-AT_CHECK([$CONFIG_SHELL ./micro-suite --jobs=[]AT_PARALLEL_TESTS_RUN & ]dnl
-         [sleep 3 && ]dnl
-         [cd serial && $CONFIG_SHELL ../micro-suite -3 >/dev/null && ]dnl
+
+# Unfortunately, the return value of wait is unreliable,
+# so we check that kill fails.
+AT_CHECK([$CONFIG_SHELL ./micro-suite --jobs=[]AT_PARALLEL_NJOBS & ]dnl
+         [sleep AT_PARALLEL_SERIAL_DELAY && ]dnl
+         [cd serial && $CONFIG_SHELL ../micro-suite -AT_PARALLEL_SERIAL_NTESTS 
>/dev/null && ]dnl
          [{ kill $! && exit 1; :; }], [], [stdout], [ignore])
-AT_CHECK([grep -c '^.\{53\}ok' stdout], [], [AT_PARALLEL_TESTS_TOTAL
+AT_CHECK([grep -c '^.\{53\}ok' stdout], [], [AT_PARALLEL_NTESTS
 ])
-AT_CHECK([grep 'AT_PARALLEL_TESTS_TOTAL tests' stdout], [], [ignore])
+AT_CHECK([grep 'AT_PARALLEL_NTESTS tests' stdout], [], [ignore])
 
 AT_CLEANUP
 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]