Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15


From: Fabiano Rosas
Subject: Re: [PATCH 00/33] hw/cpu/arm: Remove one use of qemu_get_cpu() in A7/A15 MPCore priv
Date: Wed, 10 Jan 2024 10:19:44 -0300

Markus Armbruster <armbru@redhat.com> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> On Tue, Jan 09, 2024 at 10:22:31PM +0100, Philippe Mathieu-Daudé wrote:
>>> Hi Fabiano,
>>> 
>>> On 9/1/24 21:21, Fabiano Rosas wrote:
>>> > Cédric Le Goater <clg@kaod.org> writes:
>>> > 
>>> > > On 1/9/24 18:40, Fabiano Rosas wrote:
>>> > > > Cédric Le Goater <clg@kaod.org> writes:
>>> > > > 
>>> > > > > On 1/3/24 20:53, Fabiano Rosas wrote:
>>> > > > > > Philippe Mathieu-Daudé <philmd@linaro.org> writes:
>>> > > > > > 
>>> > > > > > > +Peter/Fabiano
>>> > > > > > > 
>>> > > > > > > On 2/1/24 17:41, Cédric Le Goater wrote:
>>> > > > > > > > On 1/2/24 17:15, Philippe Mathieu-Daudé wrote:
>>> > > > > > > > > Hi Cédric,
>>> > > > > > > > > 
>>> > > > > > > > > On 2/1/24 15:55, Cédric Le Goater wrote:
>>> > > > > > > > > > On 12/12/23 17:29, Philippe Mathieu-Daudé wrote:
>>> > > > > > > > > > > Hi,
>>> > > > > > > > > > > 
>>> > > > > > > > > > > When a MPCore cluster is used, the Cortex-A cores
>>> > > > > > > > > > > belong to the
>>> > > > > > > > > > > cluster container, not to the board/soc layer. This
>>> > > > > > > > > > > series moves
>>> > > > > > > > > > > the creation of vCPUs to the MPCore private container.
>>> > > > > > > > > > > 
>>> > > > > > > > > > > Doing so we consolidate the QOM model, moving common 
>>> > > > > > > > > > > code in a
>>> > > > > > > > > > > central place (abstract MPCore parent).
>>> > > > > > > > > > 
>>> > > > > > > > > > Changing the QOM hierarchy has an impact on the state of 
>>> > > > > > > > > > the machine
>>> > > > > > > > > > and some fixups are then required to maintain migration 
>>> > > > > > > > > > compatibility.
>>> > > > > > > > > > This can become a real headache for KVM machines like 
>>> > > > > > > > > > virt for which
>>> > > > > > > > > > migration compatibility is a feature, less for emulated 
>>> > > > > > > > > > ones.
>>> > > > > > > > > 
>>> > > > > > > > > All changes are either moving properties (which are not 
>>> > > > > > > > > migrated)
>>> > > > > > > > > or moving non-migrated QOM members (i.e. pointers of 
>>> > > > > > > > > ARMCPU, which
>>> > > > > > > > > is still migrated elsewhere). So I don't see any obvious 
>>> > > > > > > > > migration
>>> > > > > > > > > problem, but I might be missing something, so I Cc'ed Juan 
>>> > > > > > > > > :>
>>> > > > > > 
>>> > > > > > FWIW, I didn't spot anything problematic either.
>>> > > > > > 
>>> > > > > > I've run this through my migration compatibility series [1] and it
>>> > > > > > doesn't regress aarch64 migration from/to 8.2. The tests use '-M
>>> > > > > > virt -cpu max', so the cortex-a7 and cortex-a15 are not covered. 
>>> > > > > > I don't
>>> > > > > > think we even support migration of anything non-KVM on arm.
>>> > > > > 
>>> > > > > it happens we do.
>>> > > > > 
>>> > > > 
>>> > > > Oh, sorry, I didn't mean TCG here. Probably meant to say something 
>>> > > > like
>>> > > > non-KVM-capable cpus, as in 32-bit. Never mind.
>>> > > 
>>> > > Theoretically, we should be able to migrate to a TCG guest. Well, this
>>> > > worked in the past for PPC. When I was doing more KVM related changes,
>>> > > this was very useful for dev. Also, some machines are partially 
>>> > > emulated.
>>> > > Anyhow I agree this is not a strong requirement and we often break it.
>>> > > Let's focus on KVM only.
>>> > > 
>>> > > > > > 1- https://gitlab.com/farosas/qemu/-/jobs/5853599533
>>> > > > > 
>>> > > > > yes it depends on the QOM hierarchy and virt seems immune to the 
>>> > > > > changes.
>>> > > > > Good.
>>> > > > > 
>>> > > > > However, changing the QOM topology clearly breaks migration compat,
>>> > > > 
>>> > > > Well, "clearly" is relative =) You've mentioned pseries and aspeed
>>> > > > already, do you have a pointer to one of those cases where we broke
>>> > > > migration?
>>> > > 
>>> > > Regarding pseries, migration compat broke because of 5bc8d26de20c
>>> > > ("spapr: allocate the ICPState object from under sPAPRCPUCore") which
>>> > > is similar to the changes proposed by this series, it impacts the QOM
>>> > > hierarchy. Here is the workaround/fix from Greg : 46f7afa37096
>>> > > ("spapr: fix migration of ICPState objects from/to older QEMU") which
>>> > > is quite a headache and this turned out to raise another problem some
>>> > > months ago ... :/ That's why I sent [1] to prepare removal of old
>>> > > machines and workarounds becoming a burden.
>>> > 
>>> > This feels like something that could be handled by the vmstate code
>>> > somehow. The state is there, just under a different path.
>>> 
>>> What, the QOM path is used in migration? ...
>>
>> Hopefully not..

Unfortunately the original fix doesn't mention _what_ actually broke
with migration. I assumed the QOM path was needed because otherwise I
don't think the fix makes sense. The thread discussing that patch also
directly mentions the QOM path:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg450912.html

But I probably misunderstood something while reading that thread.

>>
>>> 
>>> See recent discussions on "QOM path stability":
>>> https://lore.kernel.org/qemu-devel/ZZfYvlmcxBCiaeWE@redhat.com/
>>> https://lore.kernel.org/qemu-devel/87jzojbxt7.fsf@pond.sub.org/
>>> https://lore.kernel.org/qemu-devel/87v883by34.fsf@pond.sub.org/
>>
>> If I read it right, the commit 46f7afa37096 example is pretty special in that
>> the QOM path more or less decided more than the hierarchy itself but changes
>> the existence of objects.
>
> Let's see whether I got this...
>
> We removed some useless objects, moved the useful ones to another home.
> The move changed their QOM path.
>
> The problem was the removal of useless objects, because this also
> removed their vmstate.

If you check out the removal commit (5bc8d26de20c), the vmstate has
been kept untouched.

>
> The fix was adding the vmstate back as a dummy.

Since the vmstate was kept I don't see why we would need a dummy. The
incoming migration stream would still have the state, only at a
different point in the stream. It's surprising to me that that would
cause an issue, but I'm not well versed in that code.

>
> The QOM path changes are *not* part of the problem.

The only explanation I can come up with is that, after the patch,
migration breaks after a hotplug or similar operation. In such a
situation, the preallocated state would always be present before the
patch, but sometimes not present after the patch in case, say, a
hot-unplug has taken away a cpu + ICPState.
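
To make that guess concrete, here is a toy Python model of the scenario. It
is not QEMU code: the function names, the CPU counts and the rule that every
entry in the incoming stream needs a matching entry on the destination are
all assumptions made for illustration only.

```python
def icp_ids(max_cpus, present_cpus, preallocated):
    """IDs of the ICP state entries one side of the migration has.

    preallocated=True models the layout before the patch (one entry per
    possible CPU); False models the layout after it (present CPUs only).
    """
    return set(range(max_cpus)) if preallocated else set(present_cpus)


def can_load(stream_ids, dest_ids):
    # Assumed toy rule: each entry in the incoming stream needs a matching
    # entry on the destination; extra destination entries are ignored.
    return stream_ids <= dest_ids


max_cpus = 4
present = {0, 1}  # hypothetical guest with CPUs 2 and 3 unplugged

old_side = icp_ids(max_cpus, present, preallocated=True)
new_side = icp_ids(max_cpus, present, preallocated=False)
all_plugged = icp_ids(max_cpus, range(max_cpus), preallocated=False)

print(can_load(old_side, new_side))     # old -> new, some CPUs unplugged
print(can_load(new_side, old_side))     # new -> old, some CPUs unplugged
print(can_load(all_plugged, old_side))  # new -> old, every CPU plugged
```

Under these assumptions only the first case fails: the old side still sends
state for the unplugged CPUs, and the new side has nothing to receive it.
That would at least be consistent with a fix that adds dummy entries, but
whether the real vmstate code behaves like this toy rule is exactly the part
I am unsure about.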


