Hi David,
On 17.03.24 09:37, Keqian Zhu via wrote:
When a vCPU is hotplugged, qemu_init_vcpu() is called. In this
function, we set the vCPU state to stopped and then wait for the vCPU
thread to be created.
Because the vCPU state is stopped, the thread informs us that it has been
created and then waits on halt_cond. After we have realized the vCPU
object, we resume the vCPU thread.
However, while we wait for the vCPU thread to be created, the BQL is
unlocked, and another thread is allowed to call resume_all_vcpus(),
which will resume the un-realized vCPU.
This patch fixes the issue by filtering out un-realized vCPUs in
resume_all_vcpus().
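
To make the idea of filtering concrete, here is a minimal, self-contained sketch. It is not the QEMU code: ToyCPU, its fields, and toy_resume_all_vcpus() are hypothetical stand-ins, and the only thing it tries to show is that a resume loop can skip any vCPU whose realization has not finished.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU object; not QEMU's CPUState. */
typedef struct ToyCPU {
    bool realized;  /* set once realization of the vCPU object has finished */
    bool stopped;   /* true while the vCPU thread is parked */
} ToyCPU;

/* Resume every realized vCPU; return how many were resumed. */
static size_t toy_resume_all_vcpus(ToyCPU *cpus, size_t n)
{
    size_t resumed = 0;

    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].realized) {
            continue;  /* still being hotplugged: leave it stopped */
        }
        cpus[i].stopped = false;
        resumed++;
    }
    return resumed;
}
```

In this toy model, a vCPU that is still being hotplugged stays stopped until the code that created it resumes it explicitly.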
Similar question: is there a reproducer?
How could we currently hotplug a VCPU, and while it is being created, see
pause_all_vcpus()/resume_all_vcpus() getting called?
I described the reason for this in patch 1.
If I am not getting this wrong, there seems to be some other mechanism missing
that makes sure that this cannot happen. Dropping the BQL half-way through
creating a VCPU might be the problem.
Adding a retry mechanism to pause_all_vcpus() solves this problem while
keeping the semantics unchanged for callers, which means: with the BQL held,
we can be sure that all vCPUs are paused after pause_all_vcpus() finishes,
and that all vCPUs are resumed after resume_all_vcpus() finishes.
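
As a rough illustration of what I mean by retrying, here is a single-threaded toy model. It is only a sketch under assumptions: ToyVM, the while_unlocked hook, and toy_hotplug_once() are hypothetical and merely stand in for "another thread changes vCPU state while the lock is dropped"; the real change is the one in patch 1.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CPUS 8

/* Hypothetical model of the VM state; not a QEMU structure. */
typedef struct ToyVM {
    bool stopped[TOY_MAX_CPUS];
    size_t ncpus;
    bool hotplug_pending;
    /* Runs at the point where the lock would be dropped while waiting. */
    void (*while_unlocked)(struct ToyVM *vm);
} ToyVM;

static bool toy_all_vcpus_paused(const ToyVM *vm)
{
    for (size_t i = 0; i < vm->ncpus; i++) {
        if (!vm->stopped[i]) {
            return false;
        }
    }
    return true;
}

/* Simulates another thread adding one running vCPU, exactly once. */
static void toy_hotplug_once(ToyVM *vm)
{
    if (vm->hotplug_pending && vm->ncpus < TOY_MAX_CPUS) {
        vm->stopped[vm->ncpus++] = false;
        vm->hotplug_pending = false;
    }
}

/* Pause all vCPUs, retrying until the re-check passes; return pass count. */
static int toy_pause_all_vcpus(ToyVM *vm)
{
    int passes = 0;

    do {
        passes++;
        for (size_t i = 0; i < vm->ncpus; i++) {
            vm->stopped[i] = true;      /* request the pause */
        }
        if (vm->while_unlocked) {
            vm->while_unlocked(vm);     /* lock notionally dropped here */
        }
        /* Lock notionally re-taken: state may have changed, so re-check. */
    } while (!toy_all_vcpus_paused(vm));

    return passes;
}
```

With two vCPUs and one hotplug pending, the first pass sees a running vCPU on the re-check and the second pass pauses it, so the caller still returns with every vCPU paused.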