Re: Sudden slowdown of ARM emulation in master
From: Alex Bennée
Subject: Re: Sudden slowdown of ARM emulation in master
Date: Wed, 26 Feb 2020 15:29:16 +0000
User-agent: mu4e 1.3.8; emacs 27.0.60

Igor Mammedov <address@hidden> writes:
> On Wed, 26 Feb 2020 14:13:11 +0000
> Alex Bennée <address@hidden> wrote:
>
>> Peter Maydell <address@hidden> writes:
>>
>> > On Wed, 26 Feb 2020 at 09:19, Igor Mammedov <address@hidden> wrote:
>> >>
>> >> On Wed, 26 Feb 2020 00:07:55 +0100
>> >> Niek Linnenbank <address@hidden> wrote:
>> >>
>> >> > Hello Igor and Paolo,
>> >>
>> >> does the following hack solve the issue?
>> >>
>> >> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
>> >> index a08ab11f65..ab2448c5aa 100644
>> >> --- a/accel/tcg/translate-all.c
>> >> +++ b/accel/tcg/translate-all.c
>> >> @@ -944,7 +944,7 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
>> >> /* ??? If we relax the requirement that CONFIG_USER_ONLY use the
>> >> static buffer, we could size this on RESERVED_VA, on the text
>> >> segment size of the executable, or continue to use the
>> >> default. */
>> >> - tb_size = (unsigned long)(ram_size / 4);
>> >> + tb_size = MAX_CODE_GEN_BUFFER_SIZE;
>> >> #endif
>> >> }
>> >> if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>> >
>> > Cc'ing Richard to ask: does it still make sense for TCG
>> > to pick a codegen buffer size based on the guest RAM size?
>>
>> Arguably you would never get more than ram_size * tcg gen overhead of
>> active TBs at any one point although you can come up with pathological
>> patterns where only a subset of pages are flushed in and out at a time.
>>
>> However the backing for the code is mmap'ed anyway so surely the kernel
>> can work out the kinks here. We will never allocate more than the code
>> generator can generate jumps for anyway.
>>
>> Looking at the SoftMMU version of alloc_code_gen_buffer it looks like
>> everything now falls under the:
>>
>> # if defined(__PIE__) || defined(__PIC__)
>>
>> leg so there is a bunch of code to be deleted there. The remaining
>> question is what to do for linux-user because there is a bit more logic
>> to deal with some corner cases on the static code generation buffer.
>>
>> I'd be tempted to rename DEFAULT_CODE_GEN_BUFFER_SIZE to
>> SMALL_CODE_GEN_BUFFER_SIZE and only bother with a static allocation for
>> 32 bit linux-user hosts. Otherwise why not default to
>> MAX_CODE_GEN_BUFFER_SIZE on 64 bit systems and let the kernel deal with
>> it?
>
> *-user calls
> tcg_exec_init(0);
> which in the end results in
> DEFAULT_CODE_GEN_BUFFER_SIZE -> DEFAULT_CODE_GEN_BUFFER_SIZE_1
>
> so for *-user cases we can just always call
> code_gen_alloc(DEFAULT_CODE_GEN_BUFFER_SIZE)
<snip>
I've gone for a variation of that, coming to a mailing list near you
real soon now ;-)
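
For anyone following along, the two sizing policies under discussion can
be sketched roughly as below. This is only an illustration, not the
forthcoming patch: the sketch_* names and the 1 MiB / 1 GiB limits are
made up for the example (the real MIN/MAX_CODE_GEN_BUFFER_SIZE values
depend on the host), and the upper clamp is assumed to mirror the lower
one visible in the diff context above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, for illustration only. The real values in
 * accel/tcg/translate-all.c depend on the host architecture. */
#define SKETCH_MIN_BUFFER_SIZE ((uint64_t)1 << 20)   /* 1 MiB */
#define SKETCH_MAX_BUFFER_SIZE ((uint64_t)1 << 30)   /* 1 GiB */

/* Clamp a candidate size into [MIN, MAX]. The comparison is done in
 * 64 bits so a large guest RAM size cannot be truncated first. */
static size_t sketch_clamp(uint64_t size)
{
    assert(SKETCH_MIN_BUFFER_SIZE <= SKETCH_MAX_BUFFER_SIZE);
    if (size < SKETCH_MIN_BUFFER_SIZE) {
        size = SKETCH_MIN_BUFFER_SIZE;
    }
    if (size > SKETCH_MAX_BUFFER_SIZE) {
        size = SKETCH_MAX_BUFFER_SIZE;
    }
    return (size_t)size;
}

/* Policy before the hack: with no explicit request (tb_size == 0),
 * derive the buffer size from a quarter of guest RAM. */
static size_t sketch_size_from_ram(uint64_t ram_size, size_t tb_size)
{
    uint64_t size = tb_size ? tb_size : ram_size / 4;
    return sketch_clamp(size);
}

/* Policy in the hack: with no explicit request, ask for the maximum
 * and let the kernel back the mapping lazily. */
static size_t sketch_size_from_max(size_t tb_size)
{
    uint64_t size = tb_size ? tb_size : SKETCH_MAX_BUFFER_SIZE;
    return sketch_clamp(size);
}
```

With these made-up limits a 128 MiB guest gets a 32 MiB buffer under
the RAM-based policy, while the hack hands it the full maximum whatever
the guest size is.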
--
Alex Bennée