Re: [PATCH] circular buffer + hash for jobs.c:bgpids
From: John Fremlin
Subject: Re: [PATCH] circular buffer + hash for jobs.c:bgpids
Date: Fri, 17 Apr 2015 20:55:45 +0000
User-agent: Microsoft-MacOutlook/14.4.8.150116
Did some benchmarks: for the simple while true; do (:) & (:); done example,
the patch takes throughput from 215 to 313 iterations/s, and sys+user CPU
drops from 152% to 45%.
Any long-running bash script will tend to exhibit this issue.
On 4/15/15, 5:59 PM, "John Fremlin" <john@fb.com> wrote:
>Over time, a long-running bash process with ulimit -u set high (e.g.
>100k) will gradually use more and more CPU. My last patch did not
>actually fix the problem in all cases; it just stored fewer things in
>this structure.
>
>This second patch changes the bgpids structure to use a hash table
>pointing into a contiguous circular buffer, and frees it after fork.
>
>To see the slowdown and improvement set ulimit -u 30000 (many production
>systems have this over 100k) and run something like
>
>while true; do (:) & (:); done
>
>And look at top.
>
>Alternatively, this command shows it clearly
>
>/usr/bin/time -p bash -c 'start=$(date +%s); end=$(($start+2000));
>now=$start; while test $now -le $end; do count=0;
>while test $(date +%s) = $now; do (:) & (:); count=$((count+1)); done;
>now=$(date +%s); echo $(($now - $start)) $count; done'
>
>In both cases the output shows real 2000 (elapsed seconds).
>
>With patch it is dominated by copying page table entries on fork
>
>user 94.14
>sys 657.74
>
>Without patch most time is spent in bgp_* functions
>
>user 1637.16
>sys 1337.58
>
>The number of iterations of this busy loop is much higher with the
>patch too :)
Here is some benchmark data over 10k seconds (two+ hours)
> circular_total
real user kernel iterations
1 10000.15 669.06 4227.53 3147086
2 10000.97 620.34 4233.99 3121982
3 10000.98 678.59 4348.51 3111725
4 10000.97 460.06 3479.10 3177468
5 10000.98 488.54 3712.60 3145473
6 10000.98 463.15 3489.42 3174081
7 10000.97 459.57 3489.19 3162874
8 10000.98 482.22 3628.21 3148002
9 10000.98 686.72 4134.55 2949696
10 10000.98 680.75 4254.89 3134325
11 10000.98 678.37 4224.12 3143196
> unpatched_total
real user kernel iterations
1 10000.98 8594.61 6584.06 2142822
2 10000.98 8596.48 6565.17 2142218
3 10000.97 8467.03 6567.20 2132762
4 10000.97 8674.81 6574.01 2161381
5 10000.98 8670.63 6560.84 2158351
6 10000.98 8646.22 6555.38 2161951
7 10000.97 8631.14 6563.89 2150517
8 10000.97 8735.51 6525.91 2156905
9 10000.92 8748.48 6472.00 2165005
10 10000.97 8748.66 6498.08 2159130
Note that the asymptotic slowdown over time (regressing iterations/s
against elapsed time) is now under 10% of what it was: slope -1.219e-04
with the patch vs -1.823e-03 without.
> unpatched_after_start <- unpatched_long[which(unpatched_long$V1 > 2000),]
> lm(unpatched_after_start$V2 ~ unpatched_after_start$V1)
Call:
lm(formula = unpatched_after_start$V2 ~ unpatched_after_start$V1)
Coefficients:
(Intercept) unpatched_after_start$V1
223.087392 -0.001823
> circular_after_start <- circular_long[which(circular_long$V1 > 2000),]
> lm(circular_after_start$V2 ~ circular_after_start$V1)
Call:
lm(formula = circular_after_start$V2 ~ circular_after_start$V1)
Coefficients:
(Intercept) circular_after_start$V1
3.068e+02 -1.219e-04