[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Compiler memory consumption

From: Ludovic Courtès
Subject: Re: Compiler memory consumption
Date: Mon, 30 Oct 2017 16:52:05 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)


Andy Wingo <address@hidden> skribis:

> On Wed 25 Oct 2017 19:42, address@hidden (Ludovic Courtès) writes:
>> In short, compiling the top-level thunk is what’s killing us, because
>> the space and time complexity is proportional to the number of labels
>> therein.
>> Also, our 16K line python.scm file translates into 428K labels, which
>> during slot allocation translates into a dozen of 428K-element intmaps
>> and intsets.  So the compilation cost is space per source line of code
>> is high.
>> Andy, what are your thoughts?
> I think that probably these 428K labels are partitioned into a number of
> functions.  It is true that the biggest one takes the most time.  Should
> we attempt to speed this up, or should we try harder to simplify this
> graph even at low optimization levels (e.g. "simplify" pass), or should
> we avoid CSE and periodically split liveness ranges, or should we use a
> different register allocation strategy on large functions?

The space taken by the data structures used to represent functions and
live variables is huge: we’re at 1+ GiB heap after the
‘compute-live-variables’ call for the big top-level function (compiling
the 4K subsequent functions, which are small, does not increase heap
size and is very fast).  I think it’s primarily a space complexity

Ideally memory consumption would not be proportional to the number of
definitions in a file (top-level statements).  Periodically splitting
liveness ranges could probably help avoid this pathological behavior: it
would place an upper bound on memory consumption.

Hopefully this wouldn’t have any noticeable impact on the generated
code, definitely not for top-level functions like that one.


I’m not sure what it takes to implement this splitting, though!

> Or would a transversal solution like a JIT actually be the solution?
> Or does the stack marking show the same pessimal behavior as the weak
> table, in that it's a large object behind a mark function?

My impression is that these are secondary issues.

Thanks for your feedback!


reply via email to

[Prev in Thread] Current Thread [Next in Thread]