qemu-devel

Re: Migration failure when running nested VMs


From: Dr. David Alan Gilbert
Subject: Re: Migration failure when running nested VMs
Date: Mon, 23 Sep 2019 11:42:45 +0100
User-agent: Mutt/1.12.1 (2019-06-15)

* Jintack Lim (address@hidden) wrote:
> Hi,

Copying in Paolo, since he recently did work to fix nested migration.
It was expected to be broken until pretty recently, but QEMU 4.1.0 on a
5.3 kernel is pretty new, so I'd have expected it to work.

> I'm seeing VM live migration failure when a VM is running a nested VM.
> I'm using the latest Linux kernel (v5.3) and QEMU (v4.1.0). I also tried
> v5.2, but the result was the same. Kernel versions in the L1 and L2 VMs
> are v4.18, but I don't think that matters.
> 
> The symptom is that the L2 VM kernel crashes in different places after
> migration, but the call stacks are mostly related to memory management,
> like [1] and [2]. The kernel crash happens almost every time. While the
> L2 VM panics, the L1 VM runs fine after the migration. Both the L1 and
> L2 VMs were idle during migration.
> 
> I found a few clues about this issue.
> 1) It happens with a relatively large memory for L1 (24G), but it does
> not with a smaller size (3G).
> 
> 2) Non-live migration worked: when I first ran the "stop" command in the
> qemu monitor for L1 and then migrated, migration always worked. It
> also worked when I stopped only the L2 VM and kept L1 live during the
> migration.
> 
> With those two clues, I guess maybe some dirty pages made by L2 are
> not transferred to the destination correctly, but I'm not really sure.
> 
> 3) It happens on Intel(R) Xeon(R) Silver 4114 CPU, but it does not on
> Intel(R) Xeon(R) CPU E5-2630 v3 CPU.
> 
> This confuses me, because I thought migrating nested state didn't
> depend on the underlying hardware. Anyway, L1-only migration
> with the large memory size (24G) works on both CPUs without any
> problem.
> 
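
To make the dirty-page guess above concrete, here is a toy model of
pre-copy migration in Python. It is only a sketch and not QEMU or KVM
code: the function name, the page list and the "logged" predicate are all
made up for illustration. It shows how a write that never reaches the
dirty bitmap leaves a stale page on the destination, and why a guest that
is stopped before migrating would not hit the problem.

```python
# Toy model of pre-copy live migration; hypothetical names, not QEMU code.
def precopy_migrate(src, writes_per_round, logged):
    """Copy the pages in src to a new list while the guest keeps writing.

    src: list of page contents on the source (mutated in place).
    writes_per_round: one list of (page, value) guest writes per copy round;
        the writes land after that round's pages have been sent.
    logged: predicate(page) -> bool, True if a write to that page is
        recorded in the dirty bitmap.
    """
    dst = [None] * len(src)
    dirty = set(range(len(src)))       # the first round sends every page
    for writes in writes_per_round:
        to_send, dirty = dirty, set()
        for page in to_send:
            dst[page] = src[page]      # transfer the current contents
        for page, value in writes:     # guest writes after the page was sent
            src[page] = value
            if logged(page):
                dirty.add(page)        # only logged writes get re-sent
    # Final stop-and-copy phase: guest paused, send what the bitmap knows.
    for page in dirty:
        dst[page] = src[page]
    return dst


pages = [0, 0, 0, 0]
writes = [[(1, 7), (3, 9)]]

# Every write is logged: the destination matches the source.
ok = precopy_migrate(list(pages), writes, logged=lambda p: True)

# Writes to page 3 are never logged: page 3 is stale on the destination.
bad = precopy_migrate(list(pages), writes, logged=lambda p: p != 3)

# Guest stopped before migration: no writes, so nothing can be missed.
paused = precopy_migrate(list(pages), [[]], logged=lambda p: True)

print(ok, bad, paused)
```

In this model the "bad" run ends with the old contents of page 3 on the
destination, which is the kind of silent memory corruption that could
show up later as crashes in unrelated places.
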
> I would appreciate any comments/suggestions to fix this problem.

Can you share the qemu command lines you're using for both L1 and L2,
please?
Are there any dmesg entries around the time of the migration on either
the hosts or the L1 VMs?
What guest OS are you running in L1 and L2?
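
If it helps with reproducing the "stop, then migrate" case from clue 2
repeatedly, it could be scripted over QMP instead of typed into the
monitor. A minimal sketch in Python, assuming L1's QEMU was started with
a QMP unix socket; the socket path and destination URI below are
placeholders, and the helper names are made up. The QMP commands used
(qmp_capabilities, stop, migrate, query-migrate) are the standard ones.

```python
import json
import socket


def stop_then_migrate_cmds(dest_uri):
    """Build the QMP commands for a stopped (non-live) migration."""
    return [
        {"execute": "qmp_capabilities"},   # leave capability negotiation mode
        {"execute": "stop"},               # pause the guest's vCPUs first
        {"execute": "migrate", "arguments": {"uri": dest_uri}},
        {"execute": "query-migrate"},      # poll this until status settles
    ]


def send_qmp(sock_path, cmds):
    """Send each command over a QMP unix socket and collect the replies."""
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        f = s.makefile("rw", encoding="utf-8")
        json.loads(f.readline())           # discard the QMP greeting banner
        for cmd in cmds:
            f.write(json.dumps(cmd) + "\n")
            f.flush()
            while True:
                msg = json.loads(f.readline())
                # Skip asynchronous events; keep the command's reply.
                if "return" in msg or "error" in msg:
                    replies.append(msg)
                    break
    return replies


if __name__ == "__main__":
    # Placeholder URI and socket path; adjust to the actual setup.
    cmds = stop_then_migrate_cmds("tcp:dest-host:4444")
    for cmd in cmds:
        print(json.dumps(cmd))
    # replies = send_qmp("/tmp/l1-qmp.sock", cmds)
```

A single query-migrate right after migrate will most likely still report
an in-progress status, so a real script would repeat that last command
until the status is "completed" or "failed".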

Dave

> Thanks,
> Jintack
> 
> 
> [1] https://paste.ubuntu.com/p/XGDKH45yt4/
> [2] https://paste.ubuntu.com/p/CpbVTXJCyc/
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


