Migration very slow on block copy
From: Adrien G
Subject: Migration very slow on block copy
Date: Tue, 11 Aug 2020 16:43:46 +0200

Hi,
I'm doing a live migration with non-shared storage on QEMU 4.2.0.
Unfortunately the block migration is terribly slow and I can't find out why.
It does not saturate the network link and uses on average only 70Mbps of the
available 1000Mbps.
Both source and destination hosts seem OK: they are not loaded and no CPU is
saturated.
I tried sending the block image with scp to check the disk speed and the
network at the same time.
That works well and saturates the 1000Mbps link, so the network and disks
seem fine.
On the destination, I created an empty QCOW2 image, then started QEMU with
the "-incoming defer" argument.
On the source, I start the migration with { execute: 'migrate', arguments: { uri:
`tcp:<ip>:<port>`, blk: true } }.
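
For reference, a minimal sketch of how the migrate command above could be assembled as valid QMP JSON. The helper name build_migrate_command and the address 192.0.2.10:4444 are placeholders for illustration only; the snippet just builds the message and does not talk to a QEMU monitor.

```python
import json


def build_migrate_command(host: str, port: int, blk: bool = True) -> str:
    """Return the QMP 'migrate' command as a JSON string.

    'blk' requests a full block (disk) copy along with the RAM migration,
    matching the 'blk: true' argument used in the post.
    """
    command = {
        "execute": "migrate",
        "arguments": {"uri": f"tcp:{host}:{port}", "blk": blk},
    }
    return json.dumps(command)


# Placeholder address: not taken from the original setup.
print(build_migrate_command("192.0.2.10", 4444))
```

The resulting string would be sent over the QMP socket after the usual qmp_capabilities handshake.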
I have set these options on both source and destination:
"query-migrate-capabilities":
- auto-converge: true
- zero-blocks: true
- events: true
- postcopy-ram: true
- block: true
- return-path: true
- postcopy-blocktime: true
- validate-uuid: true
- xbzrle: false
- rdma-pin-all: false
- compress: false
- x-colo: false
- release-ram: false
- pause-before-switchover: false
- multifd: false
- dirty-bitmaps: false
- late-block-activate: false
- x-ignore-shared: false
"query-migrate-parameters":
- downtime-limit: 1000
- max-bandwidth: 18446744073709551615 (=> -1)
- cpu-throttle-initial: 1
- cpu-throttle-increment: 20
- max-cpu-throttle: 1
- xbzrle-cache-size: 67108864
- announce-max: 550
- announce-initial: 50
- announce-rounds: 5
- announce-step: 100
- decompress-threads: 2
- compress-threads: 8
- compress-level: 1
- compress-wait-thread: false
- multifd-channels: 2
- block-incremental: false
- tls-authz: ""
- tls-creds: ""
- tls-hostname: ""
- max-postcopy-bandwidth: 0
- x-checkpoint-delay: 60000
"query-migrate":
  "status": "active",
  "setup-time": 2129,
  "total-time": 25257536,
  "expected-downtime": 1000,
  "disk": {
    "total": 864362168320,
    "remaining": 495024340992,
    "transferred": 369337827328
  },
  "ram": {
    "total": 103084531712,
    "postcopy-requests": 0,
    "dirty-sync-count": 1,
    "multifd-bytes": 0,
    "pages-per-second": 0,
    "page-size": 4096,
    "remaining": 103084531712,
    "mbps": 121.521105,    => SLOW, way under the 1000Mbps link
    "transferred": 544729048,
    "duplicate": 0,
    "dirty-pages-rate": 0,
    "skipped": 0,
    "normal-bytes": 0,
    "normal": 0
  }
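
As a sanity check on the numbers above, the average disk transfer rate can be worked out from "total-time" and the "disk" counters. This is only a rough sketch: it assumes "total-time" is in milliseconds and that the disk bytes were transferred evenly over the whole period.

```python
# Figures copied from the "query-migrate" output above.
total_time_ms = 25257536          # "total-time", assumed to be milliseconds
disk_transferred = 369337827328   # bytes already copied
disk_remaining = 495024340992     # bytes still to copy

elapsed_s = total_time_ms / 1000
rate_bytes_s = disk_transferred / elapsed_s      # average bytes per second
rate_mbps = rate_bytes_s * 8 / 1e6               # same rate in Mbit/s
eta_hours = disk_remaining / rate_bytes_s / 3600  # time left at this rate

print(f"elapsed: {elapsed_s / 3600:.1f} h")
print(f"average disk rate: {rate_bytes_s / 1e6:.1f} MB/s ({rate_mbps:.0f} Mbit/s)")
print(f"time left at this rate: {eta_hours:.1f} h")
```

The average comes out at roughly 117 Mbit/s, in the same range as the "mbps" value reported by QEMU and far below the 1000Mbps link.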
strace (~5 seconds) on source:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.37    0.559057        1440       388           ppoll
 41.46    0.442608         937       472           io_submit
  4.17    0.044529          20      2172           clock_gettime
  1.86    0.019809          54       363           read
  0.10    0.001018           2       420           write
  0.02    0.000258           2        94         1 futex
  0.02    0.000237           3        79           munmap
  0.00    0.000002           0         4           sendmsg
------ ----------- ----------- --------- --------- ----------------
100.00    1.067518         267      3992         1 total
The futex errors are "futex(0x1246dc0, FUTEX_WAIT_PRIVATE, 2, NULL) = -1
EAGAIN (Resource temporarily unavailable)"
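
To compare the two hosts more easily, the strace summary can be parsed into rows and turned into per-second figures. The sketch below uses the source-side table; the function name parse_strace_summary and the 5-second window are assumptions (the capture is only described as "~5 seconds"), and the time column means system time unless strace was run with -w.

```python
SOURCE_SUMMARY = """
52.37 0.559057 1440 388 ppoll
41.46 0.442608 937 472 io_submit
4.17 0.044529 20 2172 clock_gettime
1.86 0.019809 54 363 read
0.10 0.001018 2 420 write
0.02 0.000258 2 94 1 futex
0.02 0.000237 3 79 munmap
0.00 0.000002 0 4 sendmsg
"""


def parse_strace_summary(text):
    """Parse 'strace -c' rows into dicts; the errors column is optional."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors).
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        rows.append({
            "seconds": float(parts[1]),
            "calls": int(parts[3]),
            "errors": int(parts[4]) if len(parts) == 6 else 0,
            "syscall": parts[-1],
        })
    return rows


WINDOW_S = 5  # assumed capture length
rows = parse_strace_summary(SOURCE_SUMMARY)
total_seconds = sum(r["seconds"] for r in rows)
total_calls = sum(r["calls"] for r in rows)

for r in rows:
    print(f'{r["syscall"]:<14} {r["calls"] / WINDOW_S:8.1f} calls/s')
print(f"time in syscalls: {total_seconds:.3f} s of ~{WINDOW_S} s")
```

The same function can be fed the destination table to put both sides next to each other.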
strace (~5 seconds) on destination:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.49    0.077527         106       730           ppoll
 21.06    0.029972          21      1407        68 recvmsg
 15.50    0.022050          15      1460           gettimeofday
  4.69    0.006676          10       612           futex
  3.86    0.005486           8       681           read
  0.41    0.000579           4       136           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.142290          28      5026        68 total
The recvmsg errors are "recvmsg(44, {msg_namelen=0}, MSG_CMSG_CLOEXEC) = -1
EAGAIN (Resource temporarily unavailable)".
I have done a lot of tests but I'm now at a dead end and have no more ideas
about why it is so slow.
Does anyone have any idea?
Best,
Adrien