qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

About migration/colo issue


From: Zhang, Chen
Subject: About migration/colo issue
Date: Fri, 15 May 2020 03:16:18 +0000

Hi Hailiang/Dave.

 

I found a urgent problem in current upstream code, COLO will stuck on secondary checkpoint and later.

The guest will stuck by this issue.

I have bisect upstream code, this issue caused by Hailiang’s optimize patch:

 

From 0393031a16735835a441b6d6e0495a1bd14adb90 Mon Sep 17 00:00:00 2001
From: zhanghailiang <address@hidden>
Date: Mon, 24 Feb 2020 14:54:10 +0800
Subject: [PATCH] COLO: Optimize memory back-up process

This patch will reduce the downtime of VM for the initial process,
Previously, we copied all these memory in preparing stage of COLO
while we need to stop VM, which is a time-consuming process.
Here we optimize it by a trick, back-up every page while in migration
process while COLO is enabled, though it affects the speed of the
migration, but it obviously reduce the downtime of back-up all SVM'S
memory in COLO preparing stage.

Signed-off-by: zhanghailiang <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>
  minor typo fixes

 

Hailiang, do you have time to look into it?

 

 

The detail log:

Primary node:

13322@1589511271.917346:colo_receive_message Receive 'checkpoint-ready' message
{"timestamp": {"seconds": 1589511271, "microseconds": 917406}, "event": "RESUME"}
13322@1589511271.917842:colo_vm_state_change Change 'stop' => 'run'
13322@1589511291.243346:colo_send_message Send 'checkpoint-request' message
13322@1589511291.243978:colo_receive_message Receive 'checkpoint-reply' message
{"timestamp": {"seconds": 1589511291, "microseconds": 244096}, "event": "STOP"}
13322@1589511291.244444:colo_vm_state_change Change 'run' => 'stop'
13322@1589511291.244561:colo_send_message Send 'vmstate-send' message
13322@1589511291.258594:colo_send_message Send 'vmstate-size' message
13322@1589511305.412479:colo_receive_message Receive 'vmstate-received' message
13322@1589511309.031826:colo_receive_message Receive 'vmstate-loaded' message
{"timestamp": {"seconds": 1589511309, "microseconds": 31862}, "event": "RESUME"}
13322@1589511309.033075:colo_vm_state_change Change 'stop' => 'run'
{"timestamp": {"seconds": 1589511311, "microseconds": 111617}, "event": "VNC_CONNECTED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5907", "host": "0.0.0.0", "websocket": false}, "client": {"family": "ipv4", "service": "51564", "host": "10.239.13.19", "websocket": false}}}
{"timestamp": {"seconds": 1589511311, "microseconds": 116197}, "event": "VNC_INITIALIZED", "data": {"server": {"auth": "none", "family": "ipv4", "service": "5907", "host": "0.0.0.0", "websocket": false}, "client": {"family": "ipv4", "service": "51564", "host": "10.239.13.19", "websocket": false}}}
13322@1589511311.243271:colo_send_message Send 'checkpoint-request' message
13322@1589511311.351361:colo_receive_message Receive 'checkpoint-reply' message
{"timestamp": {"seconds": 1589511311, "microseconds": 351439}, "event": "STOP"}
13322@1589511311.415779:colo_vm_state_change Change 'run' => 'stop'
13322@1589511311.416001:colo_send_message Send 'vmstate-send' message
13322@1589511311.418620:colo_send_message Send 'vmstate-size' message

 

Secondary node:

{"timestamp": {"seconds": 1589510920, "microseconds": 778207}, "event": "RESUME"}
23619@1589510920.778835:colo_vm_state_change Change 'stop' => 'run'
23619@1589510920.778891:colo_send_message Send 'checkpoint-ready' message
23619@1589510940.105539:colo_receive_message Receive 'checkpoint-request' message
{"timestamp": {"seconds": 1589510940, "microseconds": 105712}, "event": "STOP"}
23619@1589510940.105917:colo_vm_state_change Change 'run' => 'stop'
23619@1589510940.105971:colo_send_message Send 'checkpoint-reply' message
23619@1589510940.106767:colo_receive_message Receive 'vmstate-send' message
23619@1589510940.122808:colo_flush_ram_cache_begin dirty_pages 2456
23619@1589510953.618672:colo_flush_ram_cache_end
23619@1589510953.945083:colo_receive_message Receive 'vmstate-size' message
23619@1589510954.274816:colo_send_message Send 'vmstate-received' message
qemu-system-x86_64: warning: TSC frequency mismatch between VM (2792980 kHz) and host (2925999 kHz), and TSC scaling unavailable
{"timestamp": {"seconds": 1589510957, "microseconds": 754184}, "event": "RESUME"}
23619@1589510957.894113:colo_vm_state_change Change 'stop' => 'run'
23619@1589510957.894162:colo_send_message Send 'vmstate-loaded' message
23619@1589510960.105977:colo_receive_message Receive 'checkpoint-request' message
{"timestamp": {"seconds": 1589510960, "microseconds": 106148}, "event": "STOP"}
23619@1589510960.213773:colo_vm_state_change Change 'run' => 'stop'
23619@1589510960.213797:colo_send_message Send 'checkpoint-reply' message
23619@1589510960.278771:colo_receive_message Receive 'vmstate-send' message
23619@1589510960.423268:colo_flush_ram_cache_begin dirty_pages 25

 

 

 

 

 

 

 


reply via email to

[Prev in Thread] Current Thread [Next in Thread]