Re: [Qemu-devel] backup bug or question
From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-devel] backup bug or question
Date: Mon, 12 Aug 2019 17:59:40 +0000
12.08.2019 20:46, John Snow wrote:
>
>
> On 8/10/19 7:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 09.08.2019 23:13, John Snow wrote:
>>>
>>>
>>> On 8/9/19 9:18 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hi!
>>>>
>>>> Hmm, hacking around backup I have a question:
>>>>
>>>> What prevents a guest write request after job_start but before the
>>>> write notifier is set?
>>>>
>>>> code path:
>>>>
>>>> qmp_drive_backup or transaction with backup
>>>>
>>>> job_start
>>>> aio_co_enter(job_co_entry) /* may only schedule execution rather
>>>> than run it immediately, right? */
>>>>
>>>> ....
>>>>
>>>> job_co_entry
>>>> job_pause_point() /* this definitely yields; isn't that bad? */
>>>> job->driver->run() /* backup_run */
>>>>
>>>> ----
>>>>
>>>> backup_run()
>>>> bdrv_add_before_write_notifier()
>>>>
>>>> ...
>>>>
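(For reference, the installation point in question sits at the top of
backup_run(). What follows is a paraphrased sketch of block/backup.c
from that era, not an exact quote; field names are from memory.)

    static int coroutine_fn backup_run(Job *job, Error **errp)
    {
        BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
        BlockDriverState *bs = blk_bs(s->common.blk);
        ...
        /* Copy-before-write protection starts only here; a guest write
         * that completes before this point is never backed up. */
        s->before_write.notify = backup_before_write_notify;
        bdrv_add_before_write_notifier(bs, &s->before_write);
        ...
    }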
>>>
>>> I think you're right... :(
>>>
>>>
>>> We create jobs like this:
>>>
>>> job->paused = true;
>>> job->pause_count = 1;
>>>
>>>
>>> And then job_start does this:
>>>
>>> job->co = qemu_coroutine_create(job_co_entry, job);
>>> job->pause_count--;
>>> job->busy = true;
>>> job->paused = false;
>>>
>>>
>>> Which means that job_co_entry is being called before we lift the pause:
>>>
>>> assert(job && job->driver && job->driver->run);
>>> job_pause_point(job);
>>> job->ret = job->driver->run(job, &job->err);
>>>
>>> ...Which means that we are definitely yielding in job_pause_point.
>>>
>>> Yeah, that's a race condition waiting to happen.
>>>
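(Putting the two snippets together, as a paraphrase of job.c from that
era with the trailing aio_co_enter() call added from memory, the window
is easy to see:)

    void job_start(Job *job)
    {
        job->co = qemu_coroutine_create(job_co_entry, job);
        job->pause_count--;
        job->busy = true;
        job->paused = false;
        /* aio_co_enter() may merely schedule job_co_entry in the job's
         * AioContext instead of running it right away, and job_co_entry()
         * then calls job_pause_point(), which can yield.  Any guest write
         * that completes in this window is invisible to the backup,
         * because backup_run() has not installed the notifier yet. */
        aio_co_enter(job->aio_context, job->co);
    }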
>>>> And what guarantees do we give to the user? Is it guaranteed that
>>>> the write notifier is set when the QMP command returns?
>>>>
>>>> And I guess, if we start several backups in a transaction, it should
>>>> be guaranteed that the set of backups is consistent and corresponds
>>>> to one point in time...
>>>>
>>>
>>> I would have hoped that maybe the drain_all coupled with the individual
>>> jobs taking drain_start and drain_end would save us, but I guess we
>>> simply don't have a guarantee that all backup jobs WILL have installed
>>> their handler by the time the transaction ends.
>>>
>>> Or, if there is that guarantee, I don't know what provides it, so I
>>> think we shouldn't count on it accidentally working anymore.
>>>
>>>
>>>
>>> I think we should do two things:
>>>
>>> 1. Move the handler installation to creation time.
>>> 2. Modify backup_before_write_notify to return without invoking
>>> backup_do_cow if the job isn't started yet.
>>>
>>
>> Hmm, I don't see how that helps. A no-op write notifier will not save
>> us from guest writes, will it?
>>
>>
>
> The idea is that by installing the write notifier at creation time, the
> write notifier can be switched on the instant job_start() is called,
> regardless of whether we yield in the co_entry shim or not.
>
> That way, no matter when we yield or when the backup_run coroutine
> actually gets scheduled and executed, the write notifier is active already.
>
> Or put another way: calling job_start() guarantees that the write
> notifier is active.
Oh, got it. Now I feel stupid :)
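
So, if I understand the idea, backup_before_write_notify() just gets an
early-return guard. A minimal sketch, assuming a hypothetical "started"
flag (the real field layout may end up different):

    static int coroutine_fn backup_before_write_notify(
            NotifierWithReturn *notifier, void *opaque)
    {
        BackupBlockJob *job = container_of(notifier, BackupBlockJob,
                                           before_write);
        BdrvTrackedRequest *req = opaque;

        if (!job->common.job.started) {
            /* Installed at creation time, but a no-op until
             * job_start() arms it. */
            return 0;
        }

        return backup_do_cow(job, req->offset, req->bytes, NULL, true);
    }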
>
> I think using filters will save us too, but I don't know how ready those
> are. Do we still want a patch that guarantees this behavior in the meantime?
>
I think we do want that, of course; I will look at it tomorrow.
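
Roughly, the job_start() side would then arm the notifier synchronously,
before any coroutine scheduling can get in between; a sketch using the
same hypothetical flag:

    void job_start(Job *job)
    {
        job->co = qemu_coroutine_create(job_co_entry, job);
        job->pause_count--;
        job->busy = true;
        job->paused = false;
        job->started = true; /* hypothetical: arms the notifier */
        aio_co_enter(job->aio_context, job->co);
    }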
--
Best regards,
Vladimir