On Thu, Nov 17, 2022 at 07:07:10PM +0200, Avihai Horon wrote:
+ }
+
+    if (mig_state->data_fd != -1) {
+        if (migration->data_fd != -1) {
+            /*
+             * This can happen if the device is asynchronously reset and
+             * terminates a data transfer.
+             */
+            error_report("%s: data_fd out of sync", vbasedev->name);
+            close(mig_state->data_fd);
+
+            return -1;
Should we go to recover_state here? Is migration->device_state
invalid? -EBADF?
Yes, we should.
Although the VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE ioctl above succeeded,
setting the device state didn't *really* succeed, as the data_fd went out
of sync. So we should go to recover_state and return -EBADF.
The state did succeed and it is now "new_state". Getting an
unexpected data_fd means it did something like RUNNING->PRE_COPY_P2P
when the code was expecting PRE_COPY->PRE_COPY_P2P.
It is actually in PRE_COPY_P2P but the in-progress migration must be
stopped and the kernel would have made the migration->data_fd
permanently return some error when it went async to RUNNING.
The recovery is to restart the migration (of this device?) from the
start.
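
To make the point above concrete, here is a rough, standalone sketch of
how the out-of-sync case could be handled. It is not the actual QEMU
code: struct migration_sketch, finish_set_state() and the needs_restart
flag are hypothetical names made up for illustration, and returning
-EBADF is only the value floated earlier in this thread. The idea is
that the new state is recorded because the ioctl did succeed, the
unexpected fd is closed, and the caller is told to restart the migration
rather than roll the device state back.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
enum dev_state { DEV_RUNNING, DEV_PRE_COPY, DEV_PRE_COPY_P2P };

struct migration_sketch {
    enum dev_state device_state;
    int data_fd;        /* -1 when no data transfer is open */
    int needs_restart;  /* assumption: caller restarts migration if set */
};

/*
 * Called after the set-state ioctl has succeeded. new_data_fd is the fd
 * the kernel handed back, or -1 if the transition opened no data stream.
 */
int finish_set_state(struct migration_sketch *m,
                     enum dev_state new_state, int new_data_fd)
{
    assert(m != NULL);

    /* The ioctl succeeded, so the device really is in new_state. */
    m->device_state = new_state;

    if (new_data_fd != -1) {
        if (m->data_fd != -1) {
            /*
             * A second data_fd while one is already open: the device
             * went async (e.g. to RUNNING) behind our back. The old
             * stream is dead, so drop the new fd and ask for a restart.
             */
            fprintf(stderr, "data_fd out of sync\n");
            close(new_data_fd);
            m->needs_restart = 1;
            return -EBADF;
        }
        m->data_fd = new_data_fd;
    }

    return 0;
}
```

The old data_fd is deliberately left for the restart path to close, so
the sketch does not guess at how the in-progress transfer gets torn
down.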