RE: [PATCH v5 0/7] Live Migration With IAA
From: Liu, Yuan1
Subject: RE: [PATCH v5 0/7] Live Migration With IAA
Date: Thu, 28 Mar 2024 03:02:30 +0000
> -----Original Message-----
> From: Peter Xu <peterx@redhat.com>
> Sent: Thursday, March 28, 2024 3:46 AM
> To: Liu, Yuan1 <yuan1.liu@intel.com>
> Cc: farosas@suse.de; qemu-devel@nongnu.org; hao.xiang@bytedance.com;
> bryan.zhang@bytedance.com; Zou, Nanhai <nanhai.zou@intel.com>
> Subject: Re: [PATCH v5 0/7] Live Migration With IAA
>
> On Wed, Mar 27, 2024 at 03:20:19AM +0000, Liu, Yuan1 wrote:
> > > -----Original Message-----
> > > From: Peter Xu <peterx@redhat.com>
> > > Sent: Wednesday, March 27, 2024 4:30 AM
> > > To: Liu, Yuan1 <yuan1.liu@intel.com>
> > > Cc: farosas@suse.de; qemu-devel@nongnu.org; hao.xiang@bytedance.com;
> > > bryan.zhang@bytedance.com; Zou, Nanhai <nanhai.zou@intel.com>
> > > Subject: Re: [PATCH v5 0/7] Live Migration With IAA
> > >
> > > Hi, Yuan,
> > >
> > > On Wed, Mar 20, 2024 at 12:45:20AM +0800, Yuan Liu wrote:
> > > > 1. QPL will be used as an independent compression method like ZLIB and ZSTD,
> > > >    QPL will force the use of the IAA accelerator and will not support software
> > > >    compression. For a summary of Zlib compatibility issues, please refer to
> > > >    docs/devel/migration/qpl-compression.rst
> > >
> > > IIRC our previous discussion is we should provide a software fallback for
> > > the new QEMU paths, right?  Why did the decision change?  Again, such a
> > > fallback can help us make sure qpl won't get broken easily by other changes.
> >
> > Hi Peter
> >
> > Your previous suggestion is below:
> > https://patchew.org/QEMU/PH7PR11MB5941019462E0ADDE231C7295A37C2@PH7PR11MB5941.namprd11.prod.outlook.com/
> >
> > Compression methods: none, zlib, zstd, qpl (describes all the algorithms
> > that might be used; again, qpl enforces HW support).
> > Compression accelerators: auto, none, qat (only applies when zlib/zstd
> > chosen above)
> >
> > Maybe I misunderstood here. What you mean is that if the IAA hardware is
> > unavailable, it will fall back to the software path. This does not need to
> > be specified through live migration parameters; it will automatically
> > determine whether to use the software or hardware path during QPL
> > initialization. Is that right?
>
> I think there are two questions.
>
> Firstly, we definitely want the qpl compressor to be able to run without
> any hardware support.  As I mentioned above, I think that's the only way
> that the qpl code can always get covered by the CI, as CI hosts normally
> don't have such modern hardware.
>
> I think it also means that in the last test patch, instead of detecting
> /dev/iax, we should unconditionally run the qpl test as long as it is
> compiled in, because it should just fall back to the software path when the
> HW is not available?
>
> The second question is whether we'll want a new "compression accelerator"
> option; fundamentally its only use case is to enforce the software fallback
> even if the hardware exists.  I don't remember whether others had any
> opinion before, but to me it seems good to have, though I have no strong
> opinion.  It's less important compared to the other question on CI coverage.
Yes, I will support software fallback to ensure CI testing and to let users
still use qpl compression without IAA hardware.
Although the qpl software path performs better than zlib, I still don't think
it has a clear advantage over zstd, so I don't see a need for a migration
option to select between the qpl software and hardware paths.
Therefore I will keep QPL as an independent compression method in the next
version, with no additional migration options.
I will also add a guide to qpl-compression.rst covering IAA permission issues
and how to determine whether the hardware path is available.
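
To make the fallback concrete, here is a rough sketch of the initialization
flow I have in mind. It is illustration only, not the actual patch code: it
assumes the public Intel QPL C API (qpl_get_job_size()/qpl_init_job() and the
qpl_path_hardware/qpl_path_software values), and the helper name is made up.

    #include <stdlib.h>
    #include "qpl/qpl.h"

    /* Hypothetical helper: try the IAA hardware path first and fall back to
     * the QPL software path if the hardware is missing or inaccessible.
     * The caller releases the job with qpl_fini_job() and free(). */
    static qpl_job *qpl_job_init_with_fallback(void)
    {
        const qpl_path_t paths[] = { qpl_path_hardware, qpl_path_software };

        for (size_t i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
            uint32_t size = 0;
            qpl_job *job;

            if (qpl_get_job_size(paths[i], &size) != QPL_STS_OK) {
                continue;
            }
            job = malloc(size);
            if (!job) {
                return NULL;
            }
            if (qpl_init_job(paths[i], job) == QPL_STS_OK) {
                return job;   /* hardware path if usable, otherwise software */
            }
            free(job);        /* this path failed, try the next one */
        }
        return NULL;          /* neither path could be initialized */
    }

With this shape, CI machines without /dev/iax simply end up on the software
path, so the qpl code stays covered by the tests.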
> > > > 2. Compression accelerator related patches are removed from this patch set and
> > > >    will be added to the QAT patch set, we will submit separate patches to use
> > > >    QAT to accelerate ZLIB and ZSTD.
> > > >
> > > > 3. Advantages of using IAA accelerator include:
> > > >    a. Compared with the non-compression method, it can improve downtime
> > > >       performance without adding additional host resources (both CPU and
> > > >       network).
> > > >    b. Compared with using software compression methods (ZSTD/ZLIB), it can
> > > >       provide high data compression ratio and save a lot of CPU resources
> > > >       used for compression.
> > > >
> > > > Test conditions:
> > > >   1. Host CPUs are based on Sapphire Rapids
> > > >   2. VM type, 16 vCPU and 64G memory
> > > >   3. The source and destination respectively use 4 IAA devices.
> > > >   4. The workload in the VM
> > > >      a. all vCPUs are in the idle state
> > > >      b. 90% of the virtual machine's memory is used; Silesia is used to
> > > >         fill the memory.
> > > >         The introduction of Silesia:
> > > >         https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia
> > > >   5. Set the "--mem-prealloc" boot parameter on the destination; this
> > > >      parameter can make IAA performance better and the related introduction
> > > >      is added in docs/devel/migration/qpl-compression.rst
> > > >   6. Source migration configuration commands
> > > >      a. migrate_set_capability multifd on
> > > >      b. migrate_set_parameter multifd-channels 2/4/8
> > > >      c. migrate_set_parameter downtime-limit 300
> > > >      d. migrate_set_parameter max-bandwidth 100G/1G
> > > >      e. migrate_set_parameter multifd-compression none/qpl/zstd
> > > >   7. Destination migration configuration commands
> > > >      a. migrate_set_capability multifd on
> > > >      b. migrate_set_parameter multifd-channels 2/4/8
> > > >      c. migrate_set_parameter multifd-compression none/qpl/zstd
> > > >
> > > > Early migration result, each result is the average of three tests
> > > >
> > > > +--------+-------------+--------+--------+---------+----------+------+
> > > > |        | The number  |total   |downtime|network  |pages per | CPU  |
> > > > | None   | of channels |time(ms)|(ms)    |bandwidth|second    | Util |
> > > > | Comp   |             |        |        |(mbps)   |          |      |
> > > > |        +-------------+--------+--------+---------+----------+------+
> > > > |Network |            2|    8571|      69|    58391|   1896525|  256%|
> > >
> > > Is this the average bandwidth?  I'm surprised that you can hit ~59Gbps
> > > with only 2 channels.  My previous experience is around ~1XGbps per
> > > channel, so no more than 30Gbps for two channels.  Is it because of a
> > > faster processor?
> > > Indeed from the 4/8 results it doesn't look like increasing the number of
> > > channels helped a lot, and it even got worse on the downtime.
> >
> > Yes, I used iperf3 to check the bandwidth for one core; the bandwidth is
> > 60Gbps.
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec  7.00 GBytes  60.1 Gbits/sec    0   2.87 MBytes
> > [  5]   1.00-2.00   sec  7.05 GBytes  60.6 Gbits/sec    0   2.87 MBytes
> >
> > And in the live migration test, a multifd thread's CPU utilization is
> > almost 100%
>
> This 60Gbps per-channel is definitely impressive..
>
> Have you tried migration without multifd on your system?  Would that also
> perform similarly vs. 2-channel multifd?
Simple test results below:
VM type: 16 vCPU, 64G memory
Workload in VM: fill 56G memory with Silesia data; vCPUs are idle
Migration configurations:
1. migrate_set_parameter max-bandwidth 100G
2. migrate_set_parameter downtime-limit 300
3. migrate_set_capability multifd on (multifd test case)
4. migrate_set_parameter multifd-channels 2 (multifd test case)

                 Total time (ms)  Downtime (ms)  Throughput (mbps)  Pages-per-second
Without multifd  23580            307            21221              689588
Multifd 2        7657             198            65410              2221176
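
As a rough cross-check of these numbers (my own arithmetic, not an extra
measurement), both cases move roughly the same amount of data, so with
multifd-compression left at none the difference is purely transfer rate:

    Without multifd: 21221 mbps ≈ 2.65 GB/s x 23.58 s ≈ 62.5 GB
    Multifd 2:       65410 mbps ≈ 8.18 GB/s x  7.66 s ≈ 62.6 GB

Both figures are roughly in line with the VM's memory footprint (about 56G of
populated memory plus migration overhead).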
>
> The whole point of multifd is to scale on bandwidth.  If a single thread can
> already achieve 60Gbps (where in my memory of previous tests, multifd could
> only reach ~70Gbps before..), then either multifd will be less useful with
> the new hardware (especially with a most generic socket no-compression
> setup), or we need to start working on the bottlenecks of multifd to make it
> scale better.  Otherwise multifd will become a pool for compressor loads
> only.
>
> >
> > > What is the rationale behind the "downtime improvement" with the QPL
> > > compressors?  IIUC in this 100Gbps case the bandwidth is never a
> > > limitation, so I don't understand why adding the compression phase can
> > > make the switchover faster.  I can expect many more pages sent in a
> > > NIC-limited env like you described below with 1Gbps, but not when the NIC
> > > has unlimited resources like here.
> >
> > The compression can reduce the network stack overhead (it does not help an
> > RDMA solution): the less data, the smaller the overhead in the network
> > protocol stack. If compression itself has no overhead and network bandwidth
> > is not limited, the last memory copy is faster with compression.
> >
> > The migration hotspot focuses on _sys_sendmsg:
> > _sys_sendmsg
> >    |- tcp_sendmsg
> >      |- copy_user_enhanced_fast_string
> >      |- tcp_push_one
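
For reference, a profile like the call stack above can be collected with
standard Linux perf; the PID and duration below are placeholders, not the
exact commands used for this report:

    perf record -g -p <qemu-pid> -- sleep 10
    perf report --no-children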
>
> Makes sense.  I assume that's indeed the case when the compression ratio is
> high enough, and when the compression work is cheap enough to cost much less
> than sending the extra data that would be needed without it.
>
> Thanks,
>
> --
> Peter Xu