
[Qemu-discuss] RE: Latest Qemu-COLO Problems


From: wenzt
Subject: [Qemu-discuss] RE: Latest Qemu-COLO Problems
Date: Wed, 13 Mar 2019 13:48:57 +0800

Your answer makes sense to me.

A different network environment may lead to that state.

I think more attention should be paid to the compatibility of the COLO proxy.

 

 

From: Zhang, Chen <address@hidden>
Sent: March 13, 2019 11:49
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

 

From: wenzt [mailto:address@hidden]
Sent: Wednesday, March 6, 2019 6:28 PM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

I tested the proxy with this QMP command:

"{'execute': 'trace-event-set-state', 'arguments': {'name': 'colo*', 'enable': true} }"
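
For readers who want to script this step, here is a minimal sketch of sending that command over QMP from Python. It assumes QEMU exposes QMP on a Unix stream socket and prints each message on its own line; the class and function names (`QmpSession`, `qmp_command`, `enable_colo_trace`) are made up for this illustration and are not part of QEMU.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Build a QMP command as a plain dict (hypothetical helper)."""
    cmd = {"execute": name}
    if arguments is not None:
        cmd["arguments"] = arguments
    return cmd


class QmpSession:
    """Tiny line-oriented QMP client over an already-connected socket."""

    def __init__(self, sock):
        self._file = sock.makefile("rwb")
        # The server speaks first with a {"QMP": {...}} greeting.
        self.greeting = self._read()
        # Commands are only accepted after capabilities negotiation.
        self.execute("qmp_capabilities")

    def _read(self):
        line = self._file.readline()
        if not line:
            raise ConnectionError("QMP peer closed the connection")
        return json.loads(line)

    def execute(self, name, arguments=None):
        payload = json.dumps(qmp_command(name, arguments)) + "\r\n"
        self._file.write(payload.encode())
        self._file.flush()
        while True:
            reply = self._read()
            if "event" in reply:
                # Asynchronous event (STOP, RESUME, ...), not our reply.
                continue
            if "error" in reply:
                raise RuntimeError(reply["error"].get("desc", "QMP error"))
            return reply["return"]


def enable_colo_trace(session):
    """Enable every trace event whose name matches 'colo*'."""
    return session.execute("trace-event-set-state",
                           {"name": "colo*", "enable": True})
```

To use it, connect a `socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)` to the QMP socket of the primary node (the path depends on how QEMU was started), wrap it in `QmpSession`, and call `enable_colo_trace(session)`.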

 

I got nothing except these logs on the PVM side:

address@hidden:colo_compare_main : secondary: unsupported packet in
address@hidden:colo_compare_main : secondary: unsupported packet in
address@hidden:colo_compare_main : secondary: unsupported packet in
address@hidden:colo_compare_main : primary: unsupported packet in
address@hidden:colo_compare_main : secondary: unsupported packet in

 

My guest OS is CentOS 7.5.

I did nothing but boot up the OS.

After that, generating some network I/O still produces the same logs.

I did some debugging. There may be a parse error in parse_packet_early() that makes it get the wrong ETH_P_* protocol value.
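
To illustrate how a protocol value can be misread at the Ethernet level, here is a small sketch that extracts the EtherType from a raw frame and skips any VLAN tags. It is not QEMU's parse_packet_early(); it only shows that a parser reading the type at a fixed offset would see 0x8100 rather than IPv4 when a frame carries an 802.1Q tag. The function name is hypothetical; the constants mirror the usual ETH_P_* values.

```python
import struct

ETH_P_IP = 0x0800
ETH_P_ARP = 0x0806
ETH_P_8021Q = 0x8100    # single VLAN tag
ETH_P_8021AD = 0x88A8   # outer tag of a stacked (QinQ) frame
ETH_P_IPV6 = 0x86DD


def ethertype(frame):
    """Return (ethertype, l3_offset), skipping any 802.1Q/802.1ad tags."""
    offset = 12  # skip destination and source MAC addresses
    while True:
        if len(frame) < offset + 2:
            raise ValueError("truncated Ethernet frame")
        (proto,) = struct.unpack_from("!H", frame, offset)
        offset += 2
        if proto not in (ETH_P_8021Q, ETH_P_8021AD):
            return proto, offset
        offset += 2  # skip the 2-byte tag control information (PCP/DEI/VID)
```

For an untagged IPv4 frame this returns `(0x0800, 14)`; with one VLAN tag the payload starts four bytes later, at offset 18.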

 

Hi Zhengtao,

I think your test environment has a network issue. Can the guest get an IP address? What is the guest’s status without COLO?

Or are you using Jiaoyuwang for the test? Does the network switch do some processing at the Ethernet level (like VLAN)?

On my side, the primary node proxy reports this:

address@hidden:colo_send_message Send 'checkpoint-request' message
address@hidden:colo_receive_message Receive 'checkpoint-reply' message
{"timestamp": {"seconds": 1552455102, "microseconds": 148903}, "event": "STOP"}
address@hidden:colo_vm_state_change Change 'run' => 'stop'
address@hidden:colo_send_message Send 'vmstate-send' message
address@hidden:colo_send_message Send 'vmstate-size' message
address@hidden:colo_receive_message Receive 'vmstate-received' message
address@hidden:colo_receive_message Receive 'vmstate-loaded' message
{"timestamp": {"seconds": 1552455102, "microseconds": 277064}, "event": "RESUME"}
address@hidden:colo_vm_state_change Change 'stop' => 'run'
address@hidden:colo_compare_main : compare udp
address@hidden:colo_compare_ip_info ppkt size = 81, ip_src = 10.239.161.136, ip_dst = 10.248.2.5, spkt size = 81, ip_src = 10.239.161.136, ip_dst = 10.248.2.5
address@hidden:colo_compare_main : packet same and release packet
address@hidden:colo_compare_main : compare udp
address@hidden:colo_compare_ip_info ppkt size = 81, ip_src = 10.239.161.136, ip_dst = 10.239.27.228, spkt size = 81, ip_src = 10.239.161.136, ip_dst = 10.239.27.228
address@hidden:colo_compare_main : packet same and release packet
address@hidden:colo_compare_main : compare udp
address@hidden:colo_compare_ip_info ppkt size = 81, ip_src = 10.239.161.136, ip_dst = 172.17.6.9, spkt size = 81, ip_src = 10.239.161.136, ip_dst = 172.17.6.9
address@hidden:colo_compare_main : packet same and release packet
address@hidden:colo_compare_main : compare udp
address@hidden:colo_compare_ip_info ppkt size = 81, ip_src = 10.239.161.136, ip_dst = 10.248.2.5, spkt size = 81, ip_src = 10.239.161.136, ip_dst = 10.248.2.5
address@hidden:colo_compare_main : packet same and release packet
address@hidden:colo_compare_main : compare icmp
address@hidden:colo_compare_ip_info ppkt size = 157, ip_src = 10.239.161.136, ip_dst = 172.17.6.9, spkt size = 157, ip_src = 10.239.161.136, ip_dst = 172.17.6.9
address@hidden:colo_compare_main : packet same and release packet

 

 

 

 

Thanks

Zhang Chen

 

 

Thanks,

Zhengtao

 

From: Zhang, Chen <address@hidden>
Sent: March 5, 2019 23:32
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

 

From: wenzt [mailto:address@hidden]
Sent: Thursday, February 28, 2019 10:00 AM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

This version: https://github.com/coloft/qemu/tree/colo-v4.1-periodic-mode

 

This is an old version from 3 years ago. Please drop it and use the upstream QEMU code.

 

Another question:

What is the relationship between the proxy and checkpoints?

When the PVM and SVM send different network packets, the proxy signals the COLO framework to do a checkpoint.

 

Do they work together? I guess we should set a longer checkpoint interval, like 20s.

Yes, they work together. In addition, we have a periodic checkpoint mechanism, like a timer. You can set that interval too.
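
The interplay described here, a checkpoint on packet mismatch plus a periodic timer, can be sketched roughly as follows. This is an illustration only, not QEMU code; the function name, its parameters, and the 20 000 ms default (taken from the 20s suggested above) are hypothetical.

```python
def checkpoint_reason(primary_pkt, secondary_pkt, now_ms, last_checkpoint_ms,
                      period_ms=20_000):
    """Return why a checkpoint should start now, or None if none is needed."""
    # Proxy path: the PVM and SVM produced different output packets.
    if primary_pkt != secondary_pkt:
        return "packet-mismatch"
    # Timer path: too much time has passed since the last checkpoint.
    if now_ms - last_checkpoint_ms >= period_ms:
        return "periodic"
    return None
```

With identical packets, a checkpoint is only requested once the period has elapsed; a single differing packet requests one immediately, regardless of the timer.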

 

Does the proxy only work under network workload? In my test, the proxy does not seem to be working.

Yes. As the wiki says, colo-proxy compares the PVM and SVM packets to decide whether to do a checkpoint.

You can enable the COLO debug info to see what the proxy is doing on the primary node, like this:

"{'execute': 'trace-event-set-state', 'arguments': {'name': 'colo*', 'enable': true} }"

 

 

Thanks

Zhang Chen

 

 

From: Zhang, Chen <address@hidden>
Sent: February 28, 2019 9:34
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

Which version?

The COLO project has always stated that the PVM and SVM execute in parallel.

 

Thanks

Zhang Chen

 

From: wenzt [mailto:address@hidden]
Sent: Thursday, February 28, 2019 9:21 AM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

But in an earlier version, I noticed that the SVM always stayed in the in-migration state, even while doing checkpoints.

No operation could be performed on the SVM.

 

Thanks, 

Zhengtao

 

From: Zhang, Chen <address@hidden>
Sent: February 27, 2019 18:45
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

 

From: wenzt [mailto:address@hidden]
Sent: Wednesday, February 27, 2019 6:04 PM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

Thanks for the help!

I don’t understand why we keep switching the SVM between run and stop.

Why don’t we keep the SVM in the in-migration state?

 

Because we need to do checkpoints to sync all state between the PVM and SVM.

We can’t guarantee that their state will still be the same after a while.

 

Thanks

Zhang Chen

 

Thanks, 

Zhengtao

 

From: Zhang, Chen <address@hidden>
Sent: February 26, 2019 18:41
To: wenzt <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

By the way, please read the COLO wiki and use these commands to trigger failover on the secondary node:

{ 'execute': 'nbd-server-stop' }

{ "execute": "x-colo-lost-heartbeat" }
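
If the failover is scripted, the two commands should run in this order and the second should not be sent when the first fails. A minimal sketch of that sequencing, assuming a caller-supplied `send` function that delivers one command dict over QMP and returns the parsed reply dict (both `send` and `trigger_failover` are hypothetical names, not QEMU APIs):

```python
FAILOVER_COMMANDS = [
    {"execute": "nbd-server-stop"},
    {"execute": "x-colo-lost-heartbeat"},
]


def trigger_failover(send):
    """Run the failover commands in order; stop at the first QMP error."""
    done = []
    for cmd in FAILOVER_COMMANDS:
        reply = send(cmd)
        if "error" in reply:
            desc = reply["error"].get("desc", "unknown error")
            raise RuntimeError(f"{cmd['execute']} failed: {desc}")
        done.append(cmd["execute"])
    return done
```

A QMP reply carries either a "return" or an "error" member, so checking for "error" is enough to decide whether to continue.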

 

 

Thanks

Zhang Chen

 

From: Zhang, Chen
Sent: Tuesday, February 26, 2019 2:46 PM
To: 'wenzt' <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: RE: Latest Qemu-COLO Problems

 

Sorry for the slow response.

I have fixed this bug in this series:

https://lists.nongnu.org/archive/html/qemu-devel/2019-02/msg06920.html

Please test it.

 

 

Thanks

Zhang Chen

 

From: wenzt [mailto:address@hidden]
Sent: Friday, February 15, 2019 7:54 PM
To: Zhang, Chen <address@hidden>
Cc: 'qemu-discuss' <address@hidden>
Subject: Latest Qemu-COLO Problems

 

Hi Zhang,

 

I tested COLO with qemu-3.1.0, following https://wiki.qemu.org/Features/COLO

I got these problems on the PVM:

{"timestamp": {"seconds": 1550230616, "microseconds": 644348}, "event": "STOP"}
{"timestamp": {"seconds": 1550230616, "microseconds": 719003}, "event": "RESUME"}
{"timestamp": {"seconds": 1550230616, "microseconds": 743554}, "event": "STOP"}
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Can't receive COLO message: Input/output error
{"timestamp": {"seconds": 1550230618, "microseconds": 257209}, "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "error"}}

 

 

And on the SVM:

{"timestamp": {"seconds": 1550230616, "microseconds": 731544}, "event": "STOP"}
address@hidden:colo_vm_state_change Change 'run' => 'stop'
address@hidden:colo_send_message Send 'checkpoint-reply' message
address@hidden:colo_receive_message Receive 'vmstate-send' message
address@hidden:colo_flush_ram_cache_begin dirty_pages 18446744073708498780
address@hidden:colo_flush_ram_cache_end
address@hidden:colo_receive_message Receive 'vmstate-size' message
address@hidden:colo_send_message Send 'vmstate-received' message
{"timestamp": {"seconds": 1550230616, "microseconds": 837436}, "event": "RESUME"}
qemu-system-x86_64: block.c:5062: bdrv_detach_aio_context: Assertion `!bs->walking_aio_notifiers' failed.
Aborted (core dumped)


