From: Oliver Francke
Subject: Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686]
Date: Fri, 09 Aug 2013 11:22:00 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130623 Thunderbird/17.0.7

Hi Josh,

just opened

http://tracker.ceph.com/issues/5919

with all collected information incl. debug-log.

Hope it helps,

Oliver.

On 08/08/2013 07:01 PM, Josh Durgin wrote:
> On 08/08/2013 05:40 AM, Oliver Francke wrote:
>> Hi Josh,
>>
>> I have a session logged with:
>>
>>     debug_ms=1:debug_rbd=20:debug_objectcacher=30
>>
>> as you requested from Mike, even though I think we have a different
>> story here anyway.
>>
>> Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
>> 3.2.0-51-amd...
>>
>> Do you want me to open a ticket for that stuff? I have about 5MB
>> compressed logfile waiting for you ;)
>
> Yes, that'd be great. If you could include the time when you saw the
> guest hang that'd be ideal. I'm not sure if this is one or two bugs,
> but it seems likely it's a bug in rbd and not qemu.
>
> Thanks!
> Josh
>
>> Thnx in advance,
>>
>> Oliver.
>>
>> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote:
>>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote:
>>>> On 02.08.2013 at 23:47, Mike Dawson <address@hidden> wrote:
>>>>> We can "un-wedge" the guest by opening a NoVNC session or running a
>>>>> 'virsh screenshot' command. After that, the guest resumes and runs
>>>>> as expected. At that point we can examine the guest. Each time we'll
>>>>> see:
>>>
>>> If virsh screenshot works then this confirms that QEMU itself is still
>>> responding. Its main loop cannot be blocked since it was able to
>>> process the screendump command.
>>>
>>> This supports Josh's theory that a callback is not being invoked. The
>>> virtio-blk I/O request would be left in a pending state.
>>>
>>> Now here is where the behavior varies between configurations:
>>>
>>> On a Windows guest with 1 vCPU, you may see the symptom that the guest
>>> no longer responds to ping.
>>>
>>> On a Linux guest with multiple vCPUs, you may see the hung task message
>>> from the guest kernel because other vCPUs are still making progress.
>>> Only the vCPU that issued the I/O request, and whose task is in
>>> UNINTERRUPTIBLE state, would really be stuck.
>>>
>>> Basically, the symptoms depend not just on how QEMU is behaving but
>>> also on the guest kernel and how many vCPUs you have configured.
>>>
>>> I think this can explain how both problems you are observing, Oliver
>>> and Mike, are a result of the same bug. At least I hope they are :).
>>>
>>> Stefan
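
To make Stefan's reasoning above concrete, here is a minimal toy sketch in Python. It is not QEMU or librbd code; the class `MainLoop`, its methods, and the `backend_calls_back` flag are hypothetical names invented for illustration. It only shows how an event loop can keep servicing a monitor command such as screendump while one I/O request stays pending forever because its completion callback was never queued.

```python
from collections import deque


class MainLoop:
    """Toy event loop for illustration only; not QEMU's main loop."""

    def __init__(self):
        self.events = deque()   # events waiting to be dispatched
        self.io_state = {}      # request id -> "pending" or "completed"

    def submit_io(self, req_id, backend_calls_back):
        # Record the request. The completion event is queued only if the
        # (hypothetical) storage backend actually invokes its callback.
        self.io_state[req_id] = "pending"
        if backend_calls_back:
            self.events.append(("io_done", req_id))

    def monitor_command(self, name):
        # Stands in for a monitor command such as screendump.
        self.events.append(("monitor", name))

    def run(self):
        # Dispatch everything that is queued and report what was handled.
        handled = []
        while self.events:
            kind, arg = self.events.popleft()
            if kind == "io_done":
                self.io_state[arg] = "completed"
            handled.append((kind, arg))
        return handled


def demo():
    loop = MainLoop()
    loop.submit_io(1, backend_calls_back=True)    # healthy request
    loop.submit_io(2, backend_calls_back=False)   # callback is lost
    loop.monitor_command("screendump")
    handled = loop.run()
    return handled, loop.io_state
```

In this sketch the loop handles the monitor command and never blocks, yet request 2 remains "pending", which is the situation a guest task waiting on that I/O would be stuck in. The sketch does not model why a screenshot un-wedges the guest.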

-- 
Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Managing directors: J.Rehpöhler | C.Kunz

Follow us on Twitter: http://twitter.com/filoogmbh