Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses

From:	Richard Henderson
Subject:	Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
Date:	Thu, 12 Sep 2019 13:43:38 -0400
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 9/12/19 2:54 AM, Pavel Dovgalyuk wrote:
> Ping.
> 
> 
> Pavel Dovgalyuk
> 
>> -----Original Message-----
>> From: dovgaluk [mailto:address@hidden]
>> Sent: Monday, August 26, 2019 3:19 PM
>> To: Paolo Bonzini; address@hidden
>> Cc: address@hidden; Qemu-devel
>> Subject: Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap
>>
>> This patch breaks execution recording.
>> While the vCPU tries to lock the replay mutex in its main while loop,
>> vga causes a dirty memory sync and a do_run_on_cpu call.
>> That call waits for the vCPU to process its work queue.
>>
>> Pavel Dovgalyuk
>>
>> Paolo Bonzini wrote on 2019-08-20 09:59:
>>> There is a race between TCG and accesses to the dirty log:
>>>
>>>       vCPU thread                  reader thread
>>>       -----------------------      -----------------------
>>>       TLB check -> slow path
>>>         notdirty_mem_write
>>>           write to RAM
>>>           set dirty flag
>>>                                    clear dirty flag
>>>       TLB check -> fast path
>>>                                    read memory
>>>         write to RAM
>>>
>>> Fortunately, in order to fix it, no change is required to the
>>> vCPU thread.  However, the reader thread must delay the read until after
>>> the vCPU thread has finished the write.  This can be approximated
>>> conservatively by run_on_cpu, which waits for the end of the current
>>> translation block.
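
The interleaving in the quoted diagram can be replayed deterministically in a few lines of C. This is only a toy model, not QEMU code: `ToyState`, `vcpu_begin_write`, `vcpu_finish_write`, `reader_wait_for_vcpu` and `reader_misses_write` are all hypothetical names, and the wait step merely stands in for what run_on_cpu is described as providing above (the in-flight write has finished before the reader reads). The explicit TLB reset step is likewise an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: one byte of guest RAM plus the state named in the diagram. */
typedef struct {
    uint8_t ram;            /* the memory location being written */
    bool    dirty;          /* dirty flag for that location */
    bool    tlb_fast;       /* outcome of the next TLB check: fast path? */
    bool    store_pending;  /* fast-path store decided but not yet done */
    uint8_t pending_val;
} ToyState;

/* vCPU: TLB check.  The slow path stores and sets the dirty flag;
 * the fast path only commits to a store that lands later. */
void vcpu_begin_write(ToyState *s, uint8_t val)
{
    if (s->tlb_fast) {
        s->store_pending = true;
        s->pending_val = val;
    } else {
        s->ram = val;        /* write to RAM */
        s->dirty = true;     /* set dirty flag */
        s->tlb_fast = true;  /* later writes may take the fast path */
    }
}

/* vCPU: the in-flight fast-path store finally reaches RAM. */
void vcpu_finish_write(ToyState *s)
{
    if (s->store_pending) {
        s->ram = s->pending_val;
        s->store_pending = false;
    }
}

/* Reader: hypothetical stand-in for the run_on_cpu wait.  When it
 * returns, the vCPU has no store in flight. */
void reader_wait_for_vcpu(ToyState *s)
{
    vcpu_finish_write(s);
    assert(!s->store_pending);
}

/* Replays the diagram.  Returns true if the page ends up clean while
 * holding a value the reader never saw. */
bool reader_misses_write(bool wait_for_vcpu)
{
    ToyState s = {0};
    uint8_t seen;

    vcpu_begin_write(&s, 1);      /* TLB check -> slow path */
    s.dirty = false;              /* reader: clear dirty flag */
    vcpu_begin_write(&s, 2);      /* TLB check -> fast path */
    s.tlb_fast = false;           /* assumed: the clear resets the TLB entry */
    if (wait_for_vcpu) {
        reader_wait_for_vcpu(&s); /* the proposed delay */
    }
    seen = s.ram;                 /* reader: read memory */
    vcpu_finish_write(&s);        /* vCPU: write to RAM */

    return !s.dirty && seen != s.ram;
}
```

In this model `reader_misses_write(false)` reports the lost update from the diagram, while `reader_misses_write(true)` does not, because the pending store lands before the reader looks at memory.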

If we are going to delay any read of the dirty flags until the vCPU has
completed any active TranslationBlock, then we can simplify the TCG operation
so that we do not (ab)use the mmio path, and can promote this into the TLB slow
path as we have recently done with watchpoints.  Cf.

commit 50b107c5d617eaf93301cef20221312e7a986701
Author: Richard Henderson <address@hidden>
Date:   Sat Aug 24 09:51:09 2019 -0700

    cputlb: Handle watchpoints via TLB_WATCHPOINT

That would greatly simplify things from my perspective, for vector and
block-type operations such as we have recently been discussing for S390.  It
would mean that the *only* time we go through TLB_MMIO is for true mmio.
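
To make the shape of that concrete, here is a rough sketch of a store path where not-dirty tracking and watchpoints are plain slow-path flags and only true mmio leaves the RAM path. Everything in it is hypothetical: the `TOY_TLB_*` names only echo the flag names mentioned above, and the values, the `ToyPage` layout and `toy_store` are invented for illustration and are not taken from cputlb.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page flags; names and values are illustrative only. */
enum {
    TOY_TLB_NOTDIRTY   = 1u << 0,  /* page is clean, first write must mark it */
    TOY_TLB_WATCHPOINT = 1u << 1,  /* page contains a watchpoint */
    TOY_TLB_MMIO       = 1u << 2,  /* page is device memory, not RAM */
};

typedef struct {
    uint8_t  ram[16];      /* backing RAM for the toy page */
    unsigned flags;        /* TOY_TLB_* bits; 0 means fast path */
    bool     dirty;        /* dirty flag for the page */
    int      watch_hits;   /* how many stores hit the watchpoint flag */
    int      mmio_writes;  /* how many stores were routed to the device */
    uint8_t  mmio_last;    /* last value the toy device received */
} ToyPage;

void toy_store(ToyPage *p, unsigned off, uint8_t val)
{
    off %= sizeof(p->ram);

    if (p->flags == 0) {
        p->ram[off] = val;              /* fast path: plain RAM store */
        return;
    }

    /* Slow path: each flag is handled in place. */
    if (p->flags & TOY_TLB_WATCHPOINT) {
        p->watch_hits++;
    }
    if (p->flags & TOY_TLB_MMIO) {
        p->mmio_writes++;               /* the only case that leaves RAM */
        p->mmio_last = val;
        return;
    }

    p->ram[off] = val;                  /* write to RAM */
    if (p->flags & TOY_TLB_NOTDIRTY) {
        p->dirty = true;                /* set dirty flag */
        p->flags &= ~TOY_TLB_NOTDIRTY;  /* later stores skip this step */
    }
}
```

A store to a clean RAM page sets the dirty flag without ever touching the device counters, which is the property that matters for vector and block-type helpers: they can test one flag and otherwise operate on host memory directly.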

Have I understood your proposal here properly?


r~
