[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v7 1/2] memory: Update inline documentation
From: |
Akihiko Odaki |
Subject: |
Re: [PATCH v7 1/2] memory: Update inline documentation |
Date: |
Fri, 17 Jan 2025 15:19:32 +0900 |
User-agent: |
Mozilla Thunderbird |
On 2025/01/17 1:13, BALATON Zoltan wrote:
On Thu, 16 Jan 2025, Peter Maydell wrote:
On Tue, 14 Jan 2025 at 19:12, Peter Xu <peterx@redhat.com> wrote:
On Tue, Jan 14, 2025 at 05:42:57PM +0000, Peter Maydell wrote:
There's at least one test in the arm qtests that will hit this.
I suspect that you'll find that most architectures except x86
(where we don't have models of complex SoCs and the few
machines we do have tend to be old code that is less QOMified)
will hit similar issues. I think there's a general issue here,
this isn't just "some particular ppc device is wrongly coded".
I see. Do you know how many of them would be important memory leaks
that
we should fix immediately?
None of these are important memory leaks, because the device is
almost always present for the lifetime of the simulation. The
only case you'd actually get a visible leak would be if you
could hot-unplug the device, and even then you'd have to
deliberately sit there doing hot-plug-then-unplug cycles to
leak an interesting amount of memory.
The main reason to want to fix them is that it lets us run
"make check" under the sanitizer and catch other, more
interesting leaks.
I mean, we have known memory leaks in QEMU in many places I assume.
I am
curious how important this problem is, and whether such would justify a
memory API change that is not reaching a quorum state (and, imho, add
complexity to memory core and of course that spreads to x86 too even
if it
was not affected) to be merged. Or perhaps we can fix the important
ones
first from the device model directly instead.
The problem is generic, and the problem is that we have not actually
nailed down how this is supposed to work, i.e:
* what are the reference counts counting?
It would be very unintuitive if ref counts not counted the number of
references to the object that contains this ref count. If I understood
correctly that's the reason for this discussion that in this case the
ref count of the owner is counting the MRs instead of its own references
and the MR's ref count is not used which is confusing.
* if a device has this kind of memory region inside another,
how is it supposed to be coded so as to not leak memory?
If we can figure out how the lifecycle and memory management
is supposed to work, then yes, we can fix the relevant device
models so that they follow whatever the rules are. But it seems
If looking for rules to follow one proven and relatively simple set is
what Cocoa uses on macOS. There's a very short introduction here:
https://www.peachpit.com/articles/article.aspx?p=377302&seqNum=2
and a bit longer more complete here:
https://www.tomdalling.com/blog/cocoa/an-in-depth-look-at-manual-memory-
management-in-objective-c/
I think this case is one which is shortly mentioned at the end of the
second link which is solved by not retaining the contained objects so
this is closer to what PeterX suggests just dropping the refcounting on
the owner and consider the MRs owned by the superregions once added
there. This means add subregion would pass the reference to the owner
and remove subregion passes it back to who called it so the MR's ref
count needs no change but then who added the MR should not use it after
passing it to the superregion and who called remove subregion should
release it. But I don't really understans the problem so maybe it's more
complex.
Let me explain my and Peter Xu's solutions by listing object references.
It will abstract the details away and allow applying the analogy of
other reference counting systems like Objective-C.
"Real" reference relationship
-----------------------------
First, I'll introduce one basic rule:
Rule 1. If A needs B, let A refer to B.
Now, There is an owner and a memory region. The owner needs the memory
region to function. The memory region needs the owner to provide to
allocate itself and e.g., to provide MemoryRegionOps for
memory_region_init_io(). So we will make a bi-directional reference
between the owner and the memory region.
O (Owner) -> MR (Memory Region)
MR -> O
Now, let's think the case where there is a container and subregion, and
they are owned by different objects. The container needs the subregion
to satisfy its accesses, and the subregion needs the container for
memory_region_find(). So this relationship is also bi-directional.
CO (Container's Owner) -> C (Container)
C -> CO
SO (Subregions' Owner) -> S (Subregion)
S -> SO
C -> S
S -> C
Let's think someone took a reference to the container e.g., for DMA.
This mean we add the following reference:
DMA Controller -> Container
This will ensure all of Container, Container's Owner, Subregion,
Subregion's Owner are alive during the DMA.
"Weak" references
-----------------
Next, I'll impose the restriction that prevents circular references. To
avoid circular references, we'll stop counting references for one
direction. In other words, we'll make some references "weak". Weak
references are prone to be dangling (in Objective-C or other platforms,
such references will be usually replaced with nil-like values).
Rule 2. Make a reference from a memory region to its owner weak
Rule 3. Make a reference from a subregion to its container weak
CO -> C
C -> CO (weak)
SO -> S
S -> SO (weak)
C -> S
S -> C (weak)
D (DMA Controller) -> C
This will fix circular references, but unfortunately it will not work
because DMA Controller requires weak references to reach to Container's
Owner or Subregion's Owner with "strong" references; it means they can
be dangling at any time.
To solve this problem, we'll introduce another rule:
Rule 4. Replace references to memory regions with references to their
owners *except for the references from the owner to the memory region*.
CO -> C
C -> CO (weak)
SO -> S
S -> SO (weak)
C -> SO
S -> C (weak)
D -> CO
With this change, D can again reach all of CO, C, SO, and S.
Problem
-------
What if Container's Owner and Subregion's Owner are identical?
O -> C
C -> O (weak)
O -> S
S -> O (weak)
C -> O
S -> C (weak)
D -> O
Now there is a circular reference: O -> C -> O
My patch
--------
My patch amends rule 4:
Rule 4'. Replace references to memory regions with references to their
owners *unless the reference originates from the owner*.
O -> C
C -> O (weak)
O -> S
S -> O (weak)
C -> S (this used to be C -> O)
S -> C (weak)
D -> O
Now the circular reference is gone.
Peter Xu's patch
----------------
Peter Xu's patch adds two new rules:
Rule 5. Make references from a container to its subregion weak if they
are owned by the same object.
O -> C
C -> O (weak)
O -> S
S -> O (weak)
C -> O (weak [this used to be strong])
S -> C (weak)
D -> O
Now the circular reference is gone and D can still reach O, C, and S.
However, there is one problem: C requires a weak reference to reach to
S. More concretely the pointer from C to S will be dangling when the
reference O -> S gets dropped during the finalization of O.
Peter Xu's patch adds yet another rule to overcome this problem:
Rule 6. When a subregion gets finalized, drop the weak reference from
its container to itself.
The dangling reference will be gone with this rule.
Comparison between the two patches
----------------------------------
I'll compare them by showing pros and cons of Peter Xu's patch.
Pro:
a. Will not add yet another direct reference to memory regions where
currently there can be only one reference from the owner to each memory
region.
Cons:
b. It requires two new rules.
c. C will see its reference to S to disappear at random timing due to a
weak reference.
So, which is better do you think?
- Re: [PATCH v7 1/2] memory: Update inline documentation, (continued)
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/10
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/10
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/10
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/13
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/14
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/14
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Maydell, 2025/01/14
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/14
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Maydell, 2025/01/16
- Re: [PATCH v7 1/2] memory: Update inline documentation, BALATON Zoltan, 2025/01/16
- Re: [PATCH v7 1/2] memory: Update inline documentation,
Akihiko Odaki <=
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/16
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/14
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/15
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/15
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/15
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/15
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/15
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/16
- Re: [PATCH v7 1/2] memory: Update inline documentation, Peter Xu, 2025/01/16
- Re: [PATCH v7 1/2] memory: Update inline documentation, Akihiko Odaki, 2025/01/17