Re: false-positive failure of the root-removal test
From: Jim Meyering
Subject: Re: false-positive failure of the root-removal test
Date: Wed, 14 Oct 2015 11:40:01 -0700

On Wed, Oct 14, 2015 at 10:43 AM, Jim Meyering <address@hidden> wrote:
> Running a massively parallel "make very-expensive-check"
> (-j73 on a 48-core system), the rm/r-root.sh test would fail
> about 1-in-2 or 1-in-3 trials due to expiration of the 2-second
> timeout here:
>
> diff --git a/tests/rm/r-root.sh b/tests/rm/r-root.sh
> index c06332a..4e645e6 100755
> --- a/tests/rm/r-root.sh
> +++ b/tests/rm/r-root.sh
> @@ -88,7 +88,7 @@ exercise_rm_r_root ()
>      skip_exit='CU_TEST_SKIP_EXIT=1'
>    fi
>
> -  timeout --signal=KILL 2 \
> +  timeout --signal=KILL 5 \
>      env LD_PRELOAD=$LD_PRELOAD:./k.so $skip_exit \
>        rm -rv --one-file-system "$@" > out 2> err
>
> I made the above change and observed that the whole test then
> succeeded 6 times in a row. Then I read the comment above that change:
>
> # exercise_rm_r_root: shell function to test "rm -r '/'"
> # The caller must provide the FILE to remove as well as any options
> # which should be passed to 'rm'.
> # Paranoia mode on:
> # For the worst case where both rm(1) would fail to refuse to process the "/"
> # argument (in the cases without the --no-preserve-root option), and
> # intercepting the unlinkat(1) system call would fail (which actually already
> # has been proven to work above), and the current non root user has
> # write access to "/", limit the damage to the current file system via
> # the --one-file-system option.
> # Furthermore, run rm(1) via timeout(1) that kills that process after
> # a maximum of 2 seconds.
>
> So maybe compromise at 3 seconds (with that, it's passed 4 times so far)?
> Probably better still: I'll remember this and decrease -j's argument from
> 1+3N/2 to something slightly less abusive.
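
For context on what an expired timeout looks like: with GNU timeout, --signal=KILL makes a timed-out run exit with status 137 (128+9) rather than the usual 124. A minimal sketch, where "sleep 5" is only a stand-in for an rm that is too slow on an overloaded machine and the 1-second limit is chosen purely for illustration:

```shell
# Stand-in for a command that cannot finish within its time limit.
# Capture the status without tripping "set -e" in a calling script.
status=0
timeout --signal=KILL 1 sleep 5 || status=$?
echo "exit status: $status"
```

A test that treats any nonzero status as a failure cannot tell this case apart from a real bug, which is why raising the limit makes the false positives go away.
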
FYI, while trying to confirm that a 3-second timeout is sufficient, I hit another
failure, this time in a different race-susceptible test:
+ diff -u exp out
--- exp 2015-10-14 11:26:05.424685178 -0700
+++ out 2015-10-14 11:26:05.424685178 -0700
@@ -1 +0,0 @@
-line
+ fail=1
+ Exit 1
+ set +e
+ exit 1
+ exit 1
+ remove_tmp_
+ __st=1
+ cleanup_
+ :
+ cd /data/users/meyering/w/co/cu
+ chmod -R u+rwx /data/users/meyering/w/co/cu/gt-follow-stdin.sh.y0sA
+ rm -rf /data/users/meyering/w/co/cu/gt-follow-stdin.sh.y0sA
+ exit 1
FAIL tests/tail-2/follow-stdin.sh (exit status: 1)
So I'll just remember to use reduced parallelism for this task.
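
For the record, -j73 is just 1+3N/2 with N=48. A rough sketch of that arithmetic next to a tamer setting; the N+1 choice and both function names are illustrative assumptions, not something proposed in this thread:

```shell
# Job count used in the failing runs: 1 + 3N/2 (integer arithmetic).
aggressive_jobs() { echo $(( 1 + 3 * $1 / 2 )); }

# Hypothetical gentler alternative: one job per core, plus one.
modest_jobs() { echo $(( $1 + 1 )); }

cores=48   # on a live system this could come from nproc(1)
echo "aggressive: -j$(aggressive_jobs "$cores")"
echo "modest:     -j$(modest_jobs "$cores")"
```

The result would then be passed to make, for example as make -j"$(modest_jobs "$cores")" very-expensive-check.
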