
Re: virtio-fs performance


From: Derek Su
Subject: Re: virtio-fs performance
Date: Tue, 4 Aug 2020 15:37:26 +0800

Hello,

With cache=none set in virtiofsd and direct=1 in fio,
here are the results and the kvm-exit counts over a 5-second window.

--thread-pool-size=64 (default)
    seq read: 307 MB/s (kvm-exit count=1076463)
    seq write: 430 MB/s (kvm-exit count=1302493)
    rand 4KB read: 65.2k IOPS (kvm-exit count=1322899)
    rand 4KB write: 97.2k IOPS (kvm-exit count=1568618)

--thread-pool-size=1
    seq read: 303 MB/s (kvm-exit count=1034614)
    seq write: 358 MB/s (kvm-exit count=1537735)
    rand 4KB read: 7995 IOPS (kvm-exit count=438348)
    rand 4KB write: 97.7k IOPS (kvm-exit count=1907585)

Setting thread-pool-size=64 improves the rand 4KB read performance significantly,
but does not increase the kvm-exit count by much.

In addition, the fio average clat values for rand 4K write are 960us with
thread-pool-size=64 and 7700us with thread-pool-size=1.
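
For reference, a minimal sketch of the adjusted commands behind these numbers,
assuming the same paths and mount point as in the configuration quoted below;
the writeback option is dropped here because it only makes sense with a caching
mode, and the perf invocation is just one way to count exits over a 5-second
window (<qemu-pid> is a placeholder):

```
# virtiofsd with the shared-directory cache disabled (cache=none)
./virtiofsd -o source=/mnt/ssd/virtiofs,cache=none,flock,posix_lock,xattr \
    --thread-pool-size=64 --socket-path=/tmp/vhostqemu

# fio with direct=1 so every I/O turns into a virtio-fs request
fio --name=test --rw=randread --bs=4k --numjobs=1 --ioengine=libaio \
    --runtime=60 --direct=1 --iodepth=64 --size=10g \
    --filename=/mnt/virtiofs/testfile

# count KVM exits over 5 seconds (<qemu-pid> is a placeholder)
perf stat -e kvm:kvm_exit -p <qemu-pid> -- sleep 5
```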

Regards,
Derek

Stefan Hajnoczi <stefanha@redhat.com> wrote on Tue, Jul 28, 2020 at 9:49 PM:
>
> > I'm trying and testing the virtio-fs feature in QEMU v5.0.0.
> > My host and guest OS are both Ubuntu 18.04 with kernel 5.4, and the
> > underlying storage is a single SSD.
> >
> > The configurations are:
> > (1) virtiofsd
> > ./virtiofsd -o
> > source=/mnt/ssd/virtiofs,cache=auto,flock,posix_lock,writeback,xattr
> > --thread-pool-size=1 --socket-path=/tmp/vhostqemu
> >
> > (2) qemu
> > qemu-system-x86_64 \
> > -enable-kvm \
> > -name ubuntu \
> > -cpu Westmere \
> > -m 4096 \
> > -global kvm-apic.vapic=false \
> > -netdev tap,id=hn0,vhost=off,br=br0,helper=/usr/local/libexec/qemu-bridge-helper \
> > -device e1000,id=e0,netdev=hn0 \
> > -blockdev '{"node-name": "disk0", "driver": "qcow2",
> > "refcount-cache-size": 1638400, "l2-cache-size": 6553600, "file": {
> > "driver": "file", "filename": "'${imagefolder}\/ubuntu.qcow2'"}}' \
> > -device virtio-blk,drive=disk0,id=disk0 \
> > -chardev socket,id=ch0,path=/tmp/vhostqemu \
> > -device vhost-user-fs-pci,chardev=ch0,tag=myfs \
> > -object memory-backend-memfd,id=mem,size=4G,share=on \
> > -numa node,memdev=mem \
> > -qmp stdio \
> > -vnc :0
> >
> > (3) guest
> > mount -t virtiofs myfs /mnt/virtiofs
> >
> > I tried changing virtiofsd's --thread-pool-size value and tested the
> > storage performance with fio.
> > Before each read/write/randread/randwrite test, the page caches of the
> > guest and host were dropped.
> >
> > ```
> > RW="read" # or write/randread/randwrite
> > fio --name=test --rw=$RW --bs=4k --numjobs=1 --ioengine=libaio
> > --runtime=60 --direct=0 --iodepth=64 --size=10g
> > --filename=/mnt/virtiofs/testfile
> > done
> > ```
> >
> > --thread-pool-size=64 (default)
> >     seq read: 305 MB/s
> >     seq write: 118 MB/s
> >     rand 4KB read: 2222 IOPS
> >     rand 4KB write: 21100 IOPS
> >
> > --thread-pool-size=1
> >     seq read: 387 MB/s
> >     seq write: 160 MB/s
> >     rand 4KB read: 2622 IOPS
> >     rand 4KB write: 30400 IOPS
> >
> > The results show that the performance with the default thread-pool-size (64)
> > is worse than with a single thread.
> > Is this due to lock contention among the multiple threads?
> > When can virtio-fs achieve better performance with multiple threads?
> >
> >
> > I also tested the performance when the guest accesses the host's files via
> > an NFSv4/CIFS network filesystem.
> > The "seq read" and "randread" performance of virtio-fs is also worse than
> > that of NFSv4 and CIFS.
> >
> > NFSv4:
> >   seq write: 244 MB/s
> >   rand 4K read: 4086 IOPS
> >
> > I cannot figure out why the performance of NFSv4/CIFS, which go through
> > the network stack, is better than that of virtio-fs.
> > Is this expected, or do I have an incorrect configuration?
>
> No, I remember benchmarking the thread pool and did not see such a big
> difference.
>
> Please use direct=1 so that each I/O results in a virtio-fs request.
> Otherwise the I/O pattern is not directly controlled by the benchmark
> but by the page cache (readahead, etc).
>
> Using numactl(8) or taskset(1) to launch virtiofsd allows you to control
> NUMA and CPU scheduling properties. For example, you could force all 64
> threads to run on the same host CPU using taskset to see if that helps
> this I/O bound workload.
>
> fio can collect detailed statistics on queue depths and a latency
> histogram. It would be interesting to compare the --thread-pool-size=64
> and --thread-pool-size=1 numbers.
>
> Comparing the "perf record -e kvm:kvm_exit" counts between the two might
> also be interesting.
>
> Stefan
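
(A rough sketch, for illustration only, of what the pinning and
latency-collection runs suggested above might look like; the CPU number and
output file name are arbitrary placeholders, not taken from the thread.)

```
# force all virtiofsd threads onto a single host CPU (CPU 0 chosen arbitrarily)
taskset -c 0 ./virtiofsd -o source=/mnt/ssd/virtiofs,cache=none,flock,posix_lock,xattr \
    --thread-pool-size=64 --socket-path=/tmp/vhostqemu

# collect fio's queue-depth and latency statistics in JSON for comparison
fio --name=test --rw=randread --bs=4k --numjobs=1 --ioengine=libaio \
    --runtime=60 --direct=1 --iodepth=64 --size=10g \
    --filename=/mnt/virtiofs/testfile \
    --output-format=json --output=fio-result.json
```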


