On Thu, Jan 16, 2025 at 02:37:38PM +0900, Akihiko Odaki wrote:
On 2025/01/16 1:14, Peter Xu wrote:
On Thu, Jan 16, 2025 at 12:52:56AM +0900, Akihiko Odaki wrote:
Functionally, the ordering of container/subregion finalization matters if
some device tries to a container during finalization. In such a case,
|
^ something is missing here, feel free to complete this.
Oops, I meant: functionally, the ordering of container/subregion
finalization matters if some device tries to use a container during
finalization.
This is true, though if we keep the concept of "all the MRs share the same
lifecycle of the owner" idea, another fix of such is simply moving the
container access before any detachment of MRs.
removing subregions from the container at random timing can result in an
unexpected behavior. There is little chance to have such a scenario but we
should stay the safe side if possible.
It sounds like a future feature, and I'm not sure we'll get there, so I
don't worry that much. Keeping refcount core idea simple is still very
attractive to me. I still prefer we have complete MR refcounting iff when
necessary. It's also possible it'll never happen to QEMU.
It's not just about the future but also about compatibility with the current
device implementations. I will not be surprised even if the random ordering
of subregion finalization breaks one of dozens of devices we already have.
We should pay attention the details as we are touching the core
infrastructure.
Yes, if we can find any such example that we must follow the order of MR
destruction, I think that could justify your approach will be required but
not optional. It's just that per my understanding there should be none,
and even if there're very few outliers, it can still be trivially fixed as
mentioned above.
My gut feeling is when we need serious MR refcounting (I'd expect due to
the current heavy code base it might be easier to start a new project if
that's required.. that's why I was thinking maybe it will not happen.. but
if it will..), we'll do more than your change, and that also means
memory_region_ref() must start to refcount MRs, because a serious MR
separate refcounting should mean MR can go on the fly before the owner.