Re: TB Cache size grows out of control with qemu 5.0


From: Alex Bennée
Subject: Re: TB Cache size grows out of control with qemu 5.0
Date: Thu, 16 Jul 2020 17:27:14 +0100
User-agent: mu4e 1.5.4; emacs 28.0.50

Christian Ehrhardt <christian.ehrhardt@canonical.com> writes:

> On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan <balaton@eik.bme.hu> wrote:
>
>> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
>> following it.
>>
>
> Thank you Zoltan for pointing out this commit; I agree that it seems to be
> the trigger for the issues I'm seeing. Unfortunately the common CI host size
> is 1-2G - on Ubuntu Autopkgtests, for example, it is 1.5G.
> Those of them that run guests do so with 0.5-1G guest sizes in TCG mode
> (as they often can't rely on having KVM available).
>
> The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
> on memory pressure (which never existed) causes these systems to OOM-kill
> the qemu process.

Oops. I admit the assumption was that most people running system
emulation would be doing it on beefier machines.

> The patches indicated that the number of TB flushes during a full guest boot
> is a good indicator of TB size efficiency. From my old checks I had:
>
> - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4
> TB flush count      14, 14, 16
> - Qemu 5.0 512M guest with 1G default
> TB flush count      1, 1, 1
>
> I agree that ram/4 seems odd; especially on huge guests that is potentially
> a lot of wasted memory. And while most environments have a bit of breathing
> room, 1G is too big for small host systems, and the common CI system falls
> into this category. So I tuned it down to 256M for a test.
>
> - Qemu 4.2 512M guest with tb-size 256M
> TB flush count      5, 5, 5
> - Qemu 5.0 512M guest with tb-size 256M
> TB flush count      5, 5, 5
> - Qemu 5.0 512M guest with 256M default in code
> TB flush count      5, 5, 5
>
> So performance-wise the results are in between, as you'd expect from a
> TB size in between. And the memory consumption, which (for me) is the actual
> issue to fix, is back in line again as expected.

So I'm glad you have the workaround. 
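For reference, the workaround on the command line is the TCG accelerator's
tb-size property, which takes a value in MiB (a minimal sketch; check the
documentation of your QEMU version for the exact syntax):

  $ qemu-system-x86_64 -accel tcg,tb-size=256 ...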

>
> So on one hand I'm suggesting something like:
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
>   * Users running large scale system emulation may want to tweak their
>   * runtime setup via the tb-size control on the command line.
>   */
> -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
> +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)

The problem we have is that any number we pick here is arbitrary. And while
it did regress your use-case, changing it again just pushes a performance
regression onto someone else. Most (*) 64-bit desktop PCs have 16GB
of RAM, and almost all have more than 8GB. And there is a workaround.

* random number from Steam's HW survey.

>  #endif
>  #endif
>
> OTOH I understand someone else might want the speedier 1G, especially for
> large guests. If someone used to run a 4G guest in TCG, the TB size was
> 1G all along.
> How about picking the smaller of (1G, ram-size/4) as the default?
>
> This might then look like:
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -956,7 +956,12 @@ static inline size_t size_code_gen_buffe
>  {
>      /* Size the buffer.  */
>      if (tb_size == 0) {
> -        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        unsigned long max_default = (unsigned long)(ram_size / 4);
> +        if (max_default < DEFAULT_CODE_GEN_BUFFER_SIZE) {
> +            tb_size = max_default;
> +        } else {
> +            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        }
>      }
>      if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>          tb_size = MIN_CODE_GEN_BUFFER_SIZE;
>
> This is a bit trickier than it seems, as ram_size is no longer
> present in that context, but it is enough to discuss the idea.
> Wouldn't that serve all cases - small and large - better than a pure
> static default of 1G or always ram/4?

I'm definitely against re-introducing ram_size into the mix. The
original commit (a1b18df9a4) that broke this introduced an ordering
dependency which we don't want to bring back.

I'd be more amenable to something that takes host memory into account and
limits the default when the host is below some threshold. Is there a way
to probe that which doesn't involve slurping /proc/meminfo?
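A minimal sketch of the sort of probe I have in mind, assuming
sysconf(_SC_PHYS_PAGES) is available - it is common on Linux, the BSDs and
macOS but not strict POSIX, so a real patch would need per-platform
fallbacks:

  #include <stdint.h>
  #include <unistd.h>

  /* Return the host's physical memory in bytes, or 0 if unknown. */
  static uint64_t host_phys_mem(void)
  {
      long pages = sysconf(_SC_PHYS_PAGES);
      long page_size = sysconf(_SC_PAGE_SIZE);

      if (pages < 0 || page_size < 0) {
          return 0;
      }
      return (uint64_t)pages * (uint64_t)page_size;
  }

The default could then be clamped to some fraction of that (say host-mem/8)
whenever it comes out smaller than DEFAULT_CODE_GEN_BUFFER_SIZE - though any
fraction we pick here is, again, arbitrary.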

>
> P.S. I added Alex being the Author of the offending patch and Richard/Paolo
> for being listed in the Maintainers file for TCG.


-- 
Alex Bennée


