Re: [Qemu-stable] [PATCH v2 0/5] dataplane snapshot fixes
From: Denis V. Lunev
Subject: Re: [Qemu-stable] [PATCH v2 0/5] dataplane snapshot fixes
Date: Tue, 27 Oct 2015 22:05:55 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
On 10/27/2015 09:41 PM, Paolo Bonzini wrote:
On 27/10/2015 15:09, Denis V. Lunev wrote:
The following test
while /bin/true ; do
virsh snapshot-create rhel7
sleep 10
virsh snapshot-delete rhel7 --current
done
with iothreads enabled on a running VM leads to a lot of trouble: hangs,
assertion failures, errors.
In general, though, the HMP snapshot code is terrible. I think it should be
dropped at once and replaced with the blkdev transactions code, though that
would not fit into QEMU 2.5/stable at all.
Anyway, I think that a construction like
assert(aio_context_is_locked(aio_context));
should be widely used to ensure proper locking.
Changes from v1:
- aio-context locking added
- comment is rewritten
Signed-off-by: Denis V. Lunev <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
CC: Paolo Bonzini <address@hidden>
For patches 4-5:
Reviewed-by: Paolo Bonzini <address@hidden>
For patches 1-3 I'm not sure, because we will remove RFifoLock
relatively soon and regular pthread recursive mutexes do not have an
equivalent of rfifolock_is_locked.
Paolo
This does not break anything in the future.
Yes, RFifoLock will go away, but aio_context_is_locked will
survive, just as such checks remain in the kernel code. We can either use
plain pthread_mutex_trylock/pthread_mutex_unlock at first, or we can
have additional stubs for Linux with checks like this
(gdb) p *(pthread_mutex_t*)0x6015a0
$3 = {
__data = {
__lock = 2,
__count = 0,
__owner = 12276, <== LWP12276 is Thread 3
__nusers = 1,
__kind = 0, <== non-recursive
__spins = 0,
__list = {
__prev = 0x0,
__next = 0x0
}
},
__size = "\002\000\000\000\000\000\000\000\364/\000\000\001", '\000' <repeats 26 times>,
__align = 2
}
in debug mode. Yes, they rely on the internal representation,
but they are useful.
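For illustration, a portable alternative that avoids peeking at glibc internals is to record the owner next to the mutex. This is only a rough sketch: OwnedMutex and the owned_mutex_* functions are hypothetical names, not existing QEMU or pthread API, and the unlocked read in the check is meant for debug assertions only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical wrapper: a plain mutex that remembers which thread holds it. */
typedef struct {
    pthread_mutex_t lock;
    pthread_t owner;   /* meaningful only while held is true */
    bool held;
} OwnedMutex;

static void owned_mutex_init(OwnedMutex *m)
{
    pthread_mutex_init(&m->lock, NULL);
    m->held = false;
}

static void owned_mutex_lock(OwnedMutex *m)
{
    pthread_mutex_lock(&m->lock);
    /* Record ownership only after the mutex has been acquired. */
    m->owner = pthread_self();
    m->held = true;
}

static void owned_mutex_unlock(OwnedMutex *m)
{
    /* Clear ownership while the mutex is still held. */
    m->held = false;
    pthread_mutex_unlock(&m->lock);
}

/*
 * True only if the calling thread currently holds the mutex.
 * The fields are read without taking the lock, so the answer is
 * reliable for the owner itself; a stricter version would use atomics.
 */
static bool owned_mutex_is_locked_by_me(OwnedMutex *m)
{
    return m->held && pthread_equal(m->owner, pthread_self());
}

/* Example of a function that documents and enforces its locking rule. */
static void must_be_called_locked(OwnedMutex *m)
{
    assert(owned_mutex_is_locked_by_me(m));
}
```

A caller would take owned_mutex_lock() before must_be_called_locked(); calling it without the lock trips the assert in debug builds.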
This assert was VERY useful for me. I presume that there are
a LOT of similar places in the code, in different functions,
where the aio_context lock was not acquired and there was no
way to ensure consistency.
Den