From: Peter Xu
Subject: Re: [PATCH RESEND RFC 03/10] qapi/migration: Introduce periodic CPU throttling parameters
Date: Tue, 10 Sep 2024 09:56:07 -0400
On Tue, Sep 10, 2024 at 01:47:04PM +0800, Yong Huang wrote:
> On Tue, Sep 10, 2024 at 5:30 AM Peter Xu <peterx@redhat.com> wrote:
>
> > On Mon, Sep 09, 2024 at 10:25:36PM +0800, Hyman Huang wrote:
> > > To activate the periodic CPU throttling feature, introduce
> > > the cpu-periodic-throttle.
> > >
> > > To control the frequency of throttling, introduce the
> > > cpu-periodic-throttle-interval.
> > >
> > > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> >
> > Considering that I would still suggest postcopy over auto-converge, IMO we
> >
>
> We are in fact considering a hybrid of precopy and postcopy, and I
> entirely agree with what you are saying: postcopy migration is an
> alternative solution to deal with migrations that refuse to converge, or
> take too long to converge. But enabling this feature may not be easy in
> production since the recovery requires upper apps to interface, and the
> hugepages and spdk/dpdk
Libvirt should support recovery, while vhost-user should also be supported
in general by both qemu/libvirt. Huge page is indeed still the issue,
though.
> scenarios also need to be considered and re-tested.
> Considering auto-converge is the main policy in the industry, the
> optimization may still make sense. We would like to try to optimize
> auto-converge in the huge VM case and, IMHO, it doesn't conflict with
> postcopy.
Yeah, that's OK.
>
>
> > should be cautious on adding more QMP interfaces on top of auto-converge,
> > because that means more maintenance burden everywhere.. and it's against
> > our goal to provide, hopefully, one solution for the long term for
> > convergence issues.
> >
> > Postcopy has a major issue with VFIO, but auto converge isn't any
> > better in that regard.. as we can't throttle a device so far anyway.
> > Throttling of DMA probably means DMA faults, then postcopy might be doable
> > too. Meanwhile we're looking at working out 1G postcopy at some point.
> >
> > So I wonder whether we can make any further optimization for auto-converge
> > (if we still really want that..) to be at least transparent, so that they
> >
>
> Thanks for the advice and of course yes.
> So, at first, we'll try to avoid adding the new periodic throttle parameter
> and make it transparent?
That'll be my take on this, so we can keep relatively focused for hopefully
all migration developers around QEMU in the near future. I wonder whether
this could be a good measure so we at least try to reduce part of the burden.
I don't think it's a published rule, it's just something I thought about
when glancing at your series. So feel free to share your thoughts. Btw I'll
not be able to read into the details in the next few days due to a flooded
inbox.. sorry for that. But I'll come back after I flush the rest.
Thanks,
--
Peter Xu