Re: [PATCH v4 10/11] 9pfs: T_readdir latency optimization

qemu-devel

From:	Christian Schoenebeck
Subject:	Re: [PATCH v4 10/11] 9pfs: T_readdir latency optimization
Date:	Thu, 23 Jan 2020 13:57:23 +0100

On Thursday, 23 January 2020 12:33:42 CET Greg Kurz wrote:
> On Tue, 21 Jan 2020 01:30:10 +0100
> 
> Christian Schoenebeck <address@hidden> wrote:
> > Make top half really top half and bottom half really bottom half:
> > 
> > Each T_readdir request handling is hopping between threads (main
> > I/O thread and background I/O driver threads) several times for
> > every individual directory entry, which sums up to huge latencies
> > for handling just a single T_readdir request.
> > 
> > Instead of doing that, collect now all required directory entries
> > (including all potentially required stat buffers for each entry) in
> > one rush on a background I/O thread from fs driver, then assemble
> > the entire resulting network response message for the readdir
> > request on main I/O thread. The fs driver is still aborting the
> > directory entry retrieval loop (on the background I/O thread) as
> > soon as it would exceed the client's requested maximum R_readdir
> > response size. So we should not have any performance penalty by
> > doing this.
> > 
> > Signed-off-by: Christian Schoenebeck <address@hidden>
> > ---
> 
> Ok so this is it. Not reviewed this huge patch yet but I could at
> least give a try. The gain is impressive indeed:

Tseses, so much scepticism. :)

> [greg@bahia qemu-9p]$ (cd .mbuild-$(stg branch)/obj ; export
> QTEST_QEMU_BINARY='x86_64-softmmu/qemu-system-x86_64'; make all
> tests/qtest/qos-test && for i in {1..100}; do tests/qtest/qos-test -p
> $(tests/qtest/qos-test -l | grep readdir/basic); done) |& awk '/IMPORTANT/
> { print $10 }' | sed -e 's/s//' -e 's/^/n+=1;x+=/;$ascale=6;x/n' | bc
> .009806
> 
> instead of .055654, i.e. nearly 6 times faster ! This sounds promising :)

As mentioned in the other email, the performance improvement from this patch
is actually far more than a factor of 6, since you probably just dropped the
n-square driver hack in your benchmarks (which tainted your results):

Unoptimized readdir, with n-square correction hack:
Time client spent for waiting for reply from server: 0.082539s [MOST IMPORTANT]

Optimized readdir, with n-square correction hack:
Time 9p server spent on synth_readdir() I/O only (synth driver): 0.001576s
Time 9p server spent on entire T_readdir request: 0.002244s [IMPORTANT]
Time client spent for waiting for reply from server: 0.002566s [MOST IMPORTANT]

So in this particular test run the performance improvement is around a factor
of 32 (0.082539s vs. 0.002566s client wait time), but I have also observed
factors around 40 in earlier tests.
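
For reference, the factor follows directly from the two client-side timings
quoted above. A minimal sketch in Python that extracts the seconds value from
those log lines and divides them; the helper names `seconds()` and `speedup()`
are made up for illustration, and the regular expression only assumes the line
format as it appears in this message:

```python
import re

# Client-side timing lines as quoted in this message.
UNOPTIMIZED = ("Time client spent for waiting for reply from server: "
               "0.082539s [MOST IMPORTANT]")
OPTIMIZED = ("Time client spent for waiting for reply from server: "
             "0.002566s [MOST IMPORTANT]")

# Matches the "<number>s" value that follows the colon in such a line.
_TIME = re.compile(r":\s*([0-9]*\.[0-9]+)s\b")


def seconds(line: str) -> float:
    """Return the timing (in seconds) contained in one benchmark line."""
    match = _TIME.search(line)
    if match is None:
        raise ValueError(f"no timing found in: {line!r}")
    return float(match.group(1))


def speedup(before: float, after: float) -> float:
    """Return how many times faster 'after' is compared to 'before'."""
    return before / after


# Both runs with the n-square correction hack applied.
factor = speedup(seconds(UNOPTIMIZED), seconds(OPTIMIZED))
print(f"with n-square hack in both runs: {factor:.1f}x")

# The averaged values from the quoted benchmark run, for comparison.
print(f"quoted averages: {speedup(0.055654, 0.009806):.1f}x")
```

The second figure is the "nearly 6 times" from the quoted averages; the first
is the factor of around 32 from this particular test run.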

> Now I need to find time to do a decent review... :-\

Sure, take your time! But as you can see, it is really worth it.

And it's not just the performance improvement. This patch also reduces program
flow complexity significantly: e.g. there is just one lock and one unlock, each
entry name allocation is freed right away without any potential branch in
between, and much more. In other words: it adds safety.
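
To make the control flow more tangible, here is a rough sketch in Python. It
is not the C code of the patch, just an illustration of the idea from the
commit log: one hop to a background thread, one lock and one unlock around the
whole retrieval loop, and the loop stops as soon as the next entry would
exceed the client's requested maximum response size. `FakeDir`,
`collect_entries()`, `handle_readdir()` and the seek-back on overflow are
hypothetical, and the per-entry size is an assumption based on the 9P2000.L
readdir entry layout (qid[13] offset[8] type[1] name[s]).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def entry_size(name: str) -> int:
    """Assumed wire size of one entry: qid, offset, type, 2-byte length, name."""
    return 13 + 8 + 1 + 2 + len(name.encode())


class FakeDir:
    """Hypothetical stand-in for a directory stream of an fs driver."""

    def __init__(self, names):
        self._names = list(names)
        self._pos = 0
        self.lock = threading.Lock()
        self.lock_count = 0  # how often the lock was taken (for illustration)

    def tell(self) -> int:
        return self._pos

    def seek(self, pos: int) -> None:
        self._pos = pos

    def readdir(self):
        """Return the next entry name, or None at end of directory."""
        if self._pos >= len(self._names):
            return None
        name = self._names[self._pos]
        self._pos += 1
        return name


def collect_entries(d: FakeDir, max_size: int):
    """Runs on the background thread: gather all entries that fit, in one go."""
    entries, size = [], 0
    with d.lock:  # one lock, one unlock for the whole loop
        d.lock_count += 1
        while True:
            saved = d.tell()
            name = d.readdir()
            if name is None:
                break
            if size + entry_size(name) > max_size:
                d.seek(saved)  # leave this entry for the next request
                break
            entries.append(name)
            size += entry_size(name)
    return entries, size


def handle_readdir(d: FakeDir, max_size: int, pool: ThreadPoolExecutor):
    """Runs on the 'main' thread: dispatch once, then assemble the response."""
    return pool.submit(collect_entries, d, max_size).result()


with ThreadPoolExecutor(max_workers=1) as pool:
    demo = FakeDir(f"file{i:02d}" for i in range(100))
    names, used = handle_readdir(demo, 256, pool)
    print(len(names), used, demo.lock_count)
```

With 6-character names each entry takes 30 bytes under the assumed layout, so
a 256-byte limit yields 8 entries (240 bytes) from a single dispatch and a
single lock/unlock pair.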

Best regards,
Christian Schoenebeck

