qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7 1/2] memory: Update inline documentation


From: BALATON Zoltan
Subject: Re: [PATCH v7 1/2] memory: Update inline documentation
Date: Thu, 16 Jan 2025 17:13:00 +0100 (CET)

On Thu, 16 Jan 2025, Peter Maydell wrote:
On Tue, 14 Jan 2025 at 19:12, Peter Xu <peterx@redhat.com> wrote:

On Tue, Jan 14, 2025 at 05:42:57PM +0000, Peter Maydell wrote:
There's at least one test in the arm qtests that will hit this.
I suspect that you'll find that most architectures except x86
(where we don't have models of complex SoCs and the few
machines we do have tend to be old code that is less QOMified)
will hit similar issues. I think there's a general issue here,
this isn't just "some particular ppc device is wrongly coded".

I see.  Do you know how many of them would be important memory leaks that
we should fix immediately?

None of these are important memory leaks, because the device is
almost always present for the lifetime of the simulation. The
only case you'd actually get a visible leak would be if you
could hot-unplug the device, and even then you'd have to
deliberately sit there doing hot-plug-then-unplug cycles to
leak an interesting amount of memory.

The main reason to want to fix them is that it lets us run
"make check" under the sanitizer and catch other, more
interesting leaks.

I mean, we have known memory leaks in QEMU in many places I assume.  I am
curious how important this problem is, and whether such would justify a
memory API change that is not reaching a quorum state (and, imho, add
complexity to memory core and of course that spreads to x86 too even if it
was not affected) to be merged.  Or perhaps we can fix the important ones
first from the device model directly instead.

The problem is generic, and the problem is that we have not actually
nailed down how this is supposed to work, i.e:
* what are the reference counts counting?

It would be very unintuitive if ref counts not counted the number of references to the object that contains this ref count. If I understood correctly that's the reason for this discussion that in this case the ref count of the owner is counting the MRs instead of its own references and the MR's ref count is not used which is confusing.

* if a device has this kind of memory region inside another,
  how is it supposed to be coded so as to not leak memory?

If we can figure out how the lifecycle and memory management
is supposed to work, then yes, we can fix the relevant device
models so that they follow whatever the rules are. But it seems

If looking for rules to follow one proven and relatively simple set is what Cocoa uses on macOS. There's a very short introduction here:
https://www.peachpit.com/articles/article.aspx?p=377302&seqNum=2
and a bit longer more complete here:
https://www.tomdalling.com/blog/cocoa/an-in-depth-look-at-manual-memory-management-in-objective-c/

I think this case is one which is shortly mentioned at the end of the second link which is solved by not retaining the contained objects so this is closer to what PeterX suggests just dropping the refcounting on the owner and consider the MRs owned by the superregions once added there. This means add subregion would pass the reference to the owner and remove subregion passes it back to who called it so the MR's ref count needs no change but then who added the MR should not use it after passing it to the superregion and who called remove subregion should release it. But I don't really understans the problem so maybe it's more complex.

Regards,
BALATON Zoltan

to me that at the moment we have not got a consensus on how
this is supposed to work. Until we have that, there's no way to
fix this at the device model level, because we don't know what
changes we need to make.

thanks
-- PMM





reply via email to

[Prev in Thread] Current Thread [Next in Thread]