Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x
From: Sage Weil
Subject: Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686]
Date: Tue, 13 Aug 2013 14:34:45 -0700 (PDT)
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)

Hi Oliver,
(Posted this on the bug too, but:)
Your last log revealed a bug in the librados aio flush. A fix is pushed
to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest
please (with caching off again)?
Thanks!
sage
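
For anyone setting up the retest, here is a minimal sketch of how a QEMU -drive value for an RBD image could be assembled with caching off and the debug settings quoted further down in this thread. It assumes QEMU's "rbd:pool/image[:option=value...]" pseudo-filename syntax; the pool name "rbd", the image name "vm-disk", the if=virtio choice and the helper name rbd_drive_arg are hypothetical and are not taken from Oliver's setup.

```python
def rbd_drive_arg(pool, image, conf_overrides, cache="none"):
    """Build a value for QEMU's -drive option for an RBD-backed disk.

    Ceph options are appended to the rbd: pseudo-filename as
    colon-separated key=value pairs. Values containing ':' or ','
    would need escaping, which this sketch does not handle.
    """
    spec = f"rbd:{pool}/{image}"
    for key, value in conf_overrides.items():
        spec += f":{key}={value}"
    # cache=none is the "caching off" case asked for in the retest
    return f"file={spec},if=virtio,cache={cache}"


# Debug levels as quoted in the thread; pool and image are placeholders
debug = {"debug_ms": 1, "debug_rbd": 20, "debug_objectcacher": 30}
print(rbd_drive_arg("rbd", "vm-disk", debug))
```

The printed string would be passed as the argument to -drive on the qemu command line; a libvirt setup would express the same settings in the domain XML instead.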
On Fri, 9 Aug 2013, Oliver Francke wrote:
> Hi Josh,
>
> just opened
>
> http://tracker.ceph.com/issues/5919
>
> with all collected information incl. debug-log.
>
> Hope it helps,
>
> Oliver.
>
> On 08/08/2013 07:01 PM, Josh Durgin wrote:
> > On 08/08/2013 05:40 AM, Oliver Francke wrote:
> > > Hi Josh,
> > >
> > > I have a session logged with:
> > >
> > > debug_ms=1:debug_rbd=20:debug_objectcacher=30
> > >
> > > > as you requested from Mike, although I think we have a different issue
> > > > here anyway.
> > >
> > > Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
> > > 3.2.0-51-amd...
> > >
> > > Do you want me to open a ticket for that stuff? I have about 5MB
> > > compressed logfile waiting for you ;)
> >
> > Yes, that'd be great. If you could include the time when you saw the guest
> > hang that'd be ideal. I'm not sure if this is one or two bugs,
> > but it seems likely it's a bug in rbd and not qemu.
> >
> > Thanks!
> > Josh
> >
> > > > Thanks in advance,
> > >
> > > Oliver.
> > >
> > > On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote:
> > > > On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote:
> > > > > > On 02.08.2013 at 23:47, Mike Dawson <address@hidden> wrote:
> > > > > > We can "un-wedge" the guest by opening a NoVNC session or running a
> > > > > > 'virsh screenshot' command. After that, the guest resumes and runs
> > > > > > as expected. At that point we can examine the guest. Each time we'll
> > > > > > see:
> > > > If virsh screenshot works then this confirms that QEMU itself is still
> > > > responding. Its main loop cannot be blocked since it was able to
> > > > process the screendump command.
> > > >
> > > > This supports Josh's theory that a callback is not being invoked. The
> > > > virtio-blk I/O request would be left in a pending state.
> > > >
> > > > Now here is where the behavior varies between configurations:
> > > >
> > > > > On a Windows guest with 1 vCPU, you may see the symptom that the guest no
> > > > > longer responds to ping.
> > > >
> > > > On a Linux guest with multiple vCPUs, you may see the hung task message
> > > > from the guest kernel because other vCPUs are still making progress.
> > > > Just the vCPU that issued the I/O request and whose task is in
> > > > UNINTERRUPTIBLE state would really be stuck.
> > > >
> > > > Basically, the symptoms depend not just on how QEMU is behaving but also
> > > > on the guest kernel and how many vCPUs you have configured.
> > > >
> > > > I think this can explain how both problems you are observing, Oliver and
> > > > Mike, are a result of the same bug. At least I hope they are :).
> > > >
> > > > Stefan
> > >
> > >
> >
>
>
> --
>
> Oliver Francke
>
> filoo GmbH
> Moltkestraße 25a
> 33330 Gütersloh
> HRB4355 AG Gütersloh
>
> Managing directors: J.Rehpöhler | C.Kunz
>
> Follow us on Twitter: http://twitter.com/filoogmbh
>
> _______________________________________________
> ceph-users mailing list
> address@hidden
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
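
Stefan's reasoning quoted above (the main loop still answers a screendump while one I/O request stays pending because its completion callback never fires) can be illustrated with a toy model. This is only a sketch under that assumption; it is not QEMU or librbd code, and the names ToyLoop, submit_io and monitor_command are hypothetical.

```python
from collections import deque


class ToyLoop:
    """Toy event loop: processes queued events and tracks in-flight I/O."""

    def __init__(self):
        self.events = deque()
        self.pending = {}  # request id -> state
        self.log = []

    def submit_io(self, req_id, backend_completes):
        # The request is recorded as in flight. A completion event is
        # queued only if the backend actually signals completion.
        self.pending[req_id] = "in flight"
        if backend_completes:
            self.events.append(("io_done", req_id))

    def monitor_command(self, name):
        self.events.append(("monitor", name))

    def run(self):
        # The loop itself never blocks; it just drains what was queued.
        while self.events:
            kind, arg = self.events.popleft()
            if kind == "io_done":
                del self.pending[arg]
                self.log.append(f"completed {arg}")
            else:
                self.log.append(f"handled {arg}")


loop = ToyLoop()
loop.submit_io(1, backend_completes=True)
loop.submit_io(2, backend_completes=False)  # the lost callback
loop.monitor_command("screendump")
loop.run()
print(loop.log)      # the screendump is handled
print(loop.pending)  # request 2 is still in flight
```

In this model the loop stays responsive to the monitor command, yet request 2 never leaves the pending table, which matches a guest task waiting indefinitely on that one I/O.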