[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-t
From: |
Richard Henderson |
Subject: |
Re: [Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-time sizing |
Date: |
Sun, 7 Oct 2018 18:47:20 -0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0 |
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> -#define CPU_COMMON_TLB \
> +typedef struct CPUTLBDesc {
> + size_t size;
> + size_t mask; /* (.size - 1) << CPU_TLB_ENTRY_BITS for TLB fast path */
> +} CPUTLBDesc;
I think you don't need both size and mask. Size is (or ought to be) used
infrequently enough that I think it can be computed on demand. But on a
related note, if you remember the out-of-line tlb changes, it'll be easier for
x86 if you index an array of scalars.
> - int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> + int index = (addr >> TARGET_PAGE_BITS) & (env->tlb_desc[mmu_idx].size -
> 1);
I just posted a patch to functionalize this. But I imagine
static inline uintptr_t tlb_index(CPUArchState *env, uintptr_t mmu_idx,
target_ulong addr)
{
uintptr_t ofs = (addr >> TARGET_PAGE_BITS) & env->tlb_mask[mmu_idx];
return ofs >> CPU_TLB_BITS;
}
static inline CPUTLBEntry *tlb_entry(CPUArchState *env, uintptr_t mmu_idx,
target_ulong addr)
{
uintptr_t ofs = (addr >> TARGET_PAGE_BITS) & env->tlb_mask[mmu_idx];
return (void *)&env->tlb_table[mmu_idx][0] + ofs;
}
In the few places where we use both of these functions, the compiler ought to
be able to pull out the common sub-expression.
> @@ -139,7 +149,10 @@ static void tlb_flush_nocheck(CPUState *cpu)
> * that do not hold the lock are performed by the same owner thread.
> */
> qemu_spin_lock(&env->tlb_lock);
> - memset(env->tlb_table, -1, sizeof(env->tlb_table));
> + for (i = 0; i < NB_MMU_MODES; i++) {
> + memset(env->tlb_table[i], -1,
> + env->tlb_desc[i].size * sizeof(CPUTLBEntry));
Here we can use something like
static inline uintptr_t sizeof_tlb(CPUArchState *env, uintptr_t mmu_idx)
{
return env->tlb_mask[mmu_idx] + (1 << CPU_TLB_BITS);
}
memset(env->tlb_table[i], -1, sizeof_tlb(env, i));
> - for (i = 0; i < CPU_TLB_SIZE; i++) {
> + for (i = 0; i < env->tlb_desc[mmu_idx].size; i++) {
This seems to be one of the few places where you need the number of entries
rather than the size. Perhaps
for (i = sizeof_tlb(env, mmu_idx) >> CPU_TLB_BITS; i-- > 0; ) {
Or if you can think of a name for the function of the same effect...
r~
[Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb, Emilio G. Cota, 2018/10/06
[Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-time sizing, Emilio G. Cota, 2018/10/06
- Re: [Qemu-devel] [RFC 1/6] (XXX) cputlb: separate MMU allocation + run-time sizing,
Richard Henderson <=