From: Nicholas Piggin
Subject: Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
Date: Fri, 29 Mar 2019 19:13:55 +1000
User-agent: astroid/0.14.0 (https://github.com/astroidmail/astroid)

Suraj Jitindar Singh's on March 29, 2019 3:20 pm:
> On Wed, 2019-03-27 at 17:51 +0100, Cédric Le Goater wrote:
>> On 3/27/19 5:37 PM, Cédric Le Goater wrote:
>> > On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
>> > > With qemu-system-ppc64le -machine pseries -smp 4 I get:
>> > > 
>> > > > #  chrt 1 hackbench
>> > > > Running in process mode with 10 groups using 40 file
>> > > > descriptors each (== 400 tasks)
>> > > > Each sender will pass 100 messages of 100 bytes
>> > > > Oops: Exception in kernel mode, sig: 4 [#1]
>> > > > LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
>> > > > Modules linked in:
>> > > > CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
>> > > > NIP:  c000000000046978 LR: c000000000046a38 CTR:
>> > > > c0000000000b0150
>> > > > REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
>> > > > MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR:
>> > > > 42000874  XER: 00000000
>> > > > CFAR: c000000000046a34 IRQMASK: 1
>> > > > GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00
>> > > > 0000000028000000
>> > > 
>> > > …
>> > > > NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
>> > > > LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
>> > > > Call Trace:
>> > > > [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0
>> > > > (unreliable)
>> > > > [c0000001fffebba0] [c0000000000b0170]
>> > > > smp_pseries_cause_ipi+0x20/0x70
>> > > > [c0000001fffebbd0] [c00000000004b02c]
>> > > > arch_send_call_function_single_ipi+0x8c/0xa0
>> > > > [c0000001fffebbf0] [c0000000001de600]
>> > > > irq_work_queue_on+0xe0/0x130
>> > > > [c0000001fffebc30] [c0000000001340c8]
>> > > > rto_push_irq_work_func+0xc8/0x120
>> > > 
>> > > …
>> > > > Instruction dump:
>> > > > 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000
>> > > > 3929ffff
>> > > > 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020
>> > > > 3c4c00a2 38425080
>> > > > ---[ end trace eb842b544538cbdf ]---

This is unusual, and it crashes the powerpc code because the rt
scheduler is telling irq_work_queue_on() to queue work on the current
CPU. Is that allowed? There are no warnings in there, but it must be a
rarely tested path. Would it be better to ban it?

Steven, is this irq_work_queue_on() to self by design?
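
To illustrate the question, here is a minimal user-space sketch of the
decision involved. It is a simplified model, not the kernel's code:
choose_ipi_action(), enum ipi_action and the threads_per_core parameter
are hypothetical names, and CPUs are assumed to be numbered
consecutively within a core. It shows that without an explicit check a
self-targeted request always looks like a "same core" target, even with
one thread per core.

```c
#include <assert.h>

/* Hypothetical outcomes for a cross-CPU work request. */
enum ipi_action {
    IPI_RUN_LOCAL,  /* target is the current CPU: no IPI needed */
    IPI_DOORBELL,   /* target is another thread on the same core */
    IPI_OTHER       /* target is on another core: use another mechanism */
};

/*
 * Simplified model (assumption): CPU n belongs to core
 * n / threads_per_core. The first test is the "self" check that the
 * question above is about; drop it and a self-targeted request falls
 * through to the same-core (doorbell) case.
 */
static enum ipi_action choose_ipi_action(int this_cpu, int target_cpu,
                                         int threads_per_core)
{
    assert(threads_per_core >= 1);

    if (target_cpu == this_cpu)
        return IPI_RUN_LOCAL;

    if (this_cpu / threads_per_core == target_cpu / threads_per_core)
        return IPI_DOORBELL;

    return IPI_OTHER;
}
```

With threads_per_core set to 1, as in the -smp 4 case reported here,
the only way to reach the doorbell branch would be a request from a CPU
to itself, which is exactly what the first check filters out.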

>> > > 
>> > > and I was wondering whether this is a qemu bug or the kernel is
>> > > using an
>> > > opcode it should rather not. If I skip doorbell_try_core_ipi() in
>> > > smp_pseries_cause_ipi() then there is no crash. The comment says
>> > > "POWER9
>> > > should not use this handler" so…
>> > 
>> > I would say Linux is using a msgsndp instruction which is not
>> > implemented
>> > in QEMU TCG. But why have we started using dbells in Linux ? 
> 
> Yeah the kernel must have used msgsndp which isn't implemented for TCG
> yet. We use doorbells in linux but only for threads which are on the
> same core.
> And when I try to construct a situation with more than 1 thread per
> core (e.g. -smp 4,threads=4), I get "TCG cannot support more than 1
> thread/core on a pseries machine".
> 
> So I wonder why the guest thinks it can use msgsndp...

Evidently an IPI to self. Under TCG, QEMU really should either
implement the instruction or remove the DBELL feature.
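
For reference, the faulting word marked in the instruction dump,
<7c00191c>, can be checked by hand. The sketch below pulls the X-form
fields out of a 32-bit instruction word and tests for msgsndp (primary
opcode 31, extended opcode 142). The helper names are made up for this
example and are not taken from the kernel or QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary opcode: the top 6 bits of the instruction word. */
static inline uint32_t ppc_primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* X-form extended opcode: 10 bits just above the lowest bit. */
static inline uint32_t ppc_xo(uint32_t insn)
{
    return (insn >> 1) & 0x3ff;
}

/* X-form RB field: 5-bit register number. */
static inline uint32_t ppc_rb(uint32_t insn)
{
    return (insn >> 11) & 0x1f;
}

/* msgsndp is primary opcode 31 with extended opcode 142. */
static inline bool ppc_is_msgsndp(uint32_t insn)
{
    return ppc_primary_opcode(insn) == 31 && ppc_xo(insn) == 142;
}
```

Applied to 0x7c00191c this gives primary opcode 31, extended opcode 142
and RB = 3, i.e. msgsndp r3, which fits the TRAP: 0700 (program check)
in the oops if TCG does not implement the instruction.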

Thanks,
Nick