From: LIU Zhiwei
Subject: Re: [PATCH] target/riscv: reduce overhead of MSTATUS_SUM change
Date: Wed, 22 Mar 2023 11:16:48 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0

On 2023/3/22 10:47, Wu, Fei wrote:
> On 3/22/2023 9:58 AM, LIU Zhiwei wrote:
>> On 2023/3/22 0:10, Richard Henderson wrote:
>>> On 3/20/23 23:37, fei2.wu@intel.com wrote:
>>>> From: Fei Wu <fei2.wu@intel.com>
>>>>
>>>> Kernel needs to access user mode memory e.g. during syscalls, the
>>>> window is usually opened up for a very limited time through
>>>> MSTATUS.SUM, the overhead is too much if tlb_flush() gets called for
>>>> every SUM change. This patch saves addresses accessed when SUM=1, and
>>>> flushes only these pages when SUM changes to 0. If the buffer is not
>>>> large enough to save all the pages during SUM=1, it will fall back to
>>>> tlb_flush when necessary.
>>>>
>>>> The buffer size is set to 4 since in this MSTATUS.SUM open-up window,
>>>> most of the time kernel accesses 1 or 2 pages, it's very rare to see
>>>> more than 4 pages accessed.
>>>>
>>>> It's not necessary to save/restore these new added status, as
>>>> tlb_flush() is always called after restore.
>>>>
>>>> Result of 'pipe 10' from unixbench boosts from 223656 to 1327407.
>>>> Many other syscalls benefit a lot from this one too.
>>>
>>> This is not the correct approach.
>>>
>>> You should be making use of different softmmu indexes, similar to how
>>> ARM uses a separate index for PAN (privileged access never) mode. If
>>> I read the manual properly, PAN == !SUM.
>>>
>>> When you do this, you need no additional flushing.
>>
>> Hi Fei,
>>
>> Let's follow Richard's advice.
>
> Yes, I'm thinking about how to do it, and thank Richard for the advice.
>
> My question is:
> * If we ensure this separate index (S+SUM) has no overlapping tlb
> entries with S-mode (ignore M-mode so far), during SUM=1,

Yes, every mmu index will have its own cache.

> we have to look into both (S+SUM) and S index for kernel address
> translation, that should be not desired.

No, we have to choose one, because each access will be constrained to
one mmu index.

> * If all the tlb operations are against (S+SUM) during SUM=1, then
> (S+SUM) could contain some duplicated tlb entries of kernel address in S
> index, the duplication means extra tlb lookup and fill. Also if we want
> to flush tlb entry of specific addr0, we have to flush both index.

This is not the case.

Zhiwei

> I will take a look at how arm handles this.
>
> Thanks,
> Fei.
>> Zhiwei
>>> r~
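
For readers following the thread, the separate-index idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU code: the index names (MMU_IDX_S_SUM and friends), the Cpu struct, the direct-mapped TLB, and the "bit 63 clear means user page" rule are all hypothetical and chosen for brevity. It shows that each access consults exactly one index, and that toggling SUM only changes which index is selected, so no flush is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical index numbering for this sketch; not QEMU's actual values. */
enum { MMU_IDX_U, MMU_IDX_S, MMU_IDX_S_SUM, MMU_IDX_M, NB_MMU_IDX };
enum { PRV_U = 0, PRV_S = 1, PRV_M = 3 };
typedef enum { TLB_HIT, TLB_FILL, TLB_FAULT } TlbResult;

#define TLB_SIZE   8
#define PAGE_SHIFT 12

typedef struct { bool valid; uint64_t vpage; } TlbEntry;

typedef struct {
    int priv;                            /* current privilege level */
    bool sum;                            /* models the MSTATUS.SUM bit */
    TlbEntry tlb[NB_MMU_IDX][TLB_SIZE];  /* one private cache per index */
} Cpu;

void cpu_init(Cpu *cpu, int priv)
{
    memset(cpu, 0, sizeof(*cpu));
    cpu->priv = priv;
}

/* Every access is translated under exactly one index. */
int mmu_index(const Cpu *cpu)
{
    switch (cpu->priv) {
    case PRV_U: return MMU_IDX_U;
    case PRV_S: return cpu->sum ? MMU_IDX_S_SUM : MMU_IDX_S;
    default:    return MMU_IDX_M;
    }
}

/* Assumption for this sketch: user pages live in the lower half. */
bool is_user_page(uint64_t vaddr)
{
    return (vaddr >> 63) == 0;
}

/* Stand-in for the page-walk permission check, per index. */
bool walk_allows(int idx, uint64_t vaddr)
{
    bool user = is_user_page(vaddr);
    switch (idx) {
    case MMU_IDX_U: return user;   /* U-mode: user pages only */
    case MMU_IDX_S: return !user;  /* S-mode, SUM=0: kernel pages only */
    default:        return true;   /* S+SUM and M: simplified to allow all */
    }
}

TlbResult tlb_access(Cpu *cpu, uint64_t vaddr)
{
    int idx = mmu_index(cpu);
    assert(idx >= 0 && idx < NB_MMU_IDX);

    uint64_t vpage = vaddr >> PAGE_SHIFT;
    TlbEntry *e = &cpu->tlb[idx][vpage % TLB_SIZE];

    if (e->valid && e->vpage == vpage) {
        return TLB_HIT;
    }
    if (!walk_allows(idx, vaddr)) {
        return TLB_FAULT;          /* nothing is cached on a fault */
    }
    e->valid = true;
    e->vpage = vpage;
    return TLB_FILL;
}

/* Changing SUM only changes which index later accesses use: no flush. */
void set_sum(Cpu *cpu, bool sum)
{
    cpu->sum = sum;
}
```

In this model, a kernel page touched with SUM=0 fills the S index; with SUM=1 the same page is filled a second time in the S+SUM index, which is the duplication cost raised in the question above. After SUM returns to 0 the kernel page hits again in the S index, and a user page faults there because the S index never cached it.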