Compiler memory consumption
From: Ludovic Courtès
Subject: Compiler memory consumption
Date: Tue, 16 May 2017 18:19:37 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
Hello!
Attached is a stripped-down version of Guix’s build-aux/compile-all.scm,
and here is the gcprof output I get while compiling
gnu/packages/python.scm, which defines 841 package objects (structs)
and five times as many thunks of the form (lambda () value):
--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env time guile --no-auto-compile profile-guilec.scm
GUILEC gnu/packages/python.go
% cumulative self
time seconds seconds procedure
35.00 22.55 22.55 vector-copy
6.67 17.18 4.30 language/cps/intset.scm:270:2:adjoin
5.00 6.44 3.22 srfi/srfi-1.scm:590:5:map1
3.33 31.14 2.15 ice-9/psyntax.scm:1521:10:rebuild-macro-output
3.33 2.15 2.15 gensym
3.33 2.15 2.15 language/tree-il/analyze.scm:544:4
3.33 2.15 2.15 make-struct
3.33 2.15 2.15 language/cps/intset.scm:187:0:persistent-intset
3.33 2.15 2.15 read
3.33 2.15 2.15 language/cps/intmap.scm:184:0:intmap-add!
1.67 44.03 1.07 ice-9/psyntax.scm:1604:10:parse
1.67 6.44 1.07 language/cps/intset.scm:269:0:intset-add
1.67 5.37 1.07 ice-9/psyntax.scm:1313:6:syntax-type
1.67 4.30 1.07 language/cps/intset.scm:547:2:union
1.67 4.30 1.07 language/cps/intset.scm:316:2:remove
1.67 2.15 1.07 language/cps/intset.scm:725:2:subtract-nodes
1.67 1.07 1.07 language/cps/intset.scm:178:0:transient-intset
1.67 1.07 1.07 ice-9/boot-9.scm:214:0:make-struct/no-tail
1.67 1.07 1.07 language/tree-il/compile-cps.scm:345:2:convert-args
1.67 1.07 1.07 cons
1.67 1.07 1.07 language/tree-il/compile-cps.scm:240:0:adapt-arity
1.67 1.07 1.07 string-append
1.67 1.07 1.07 append
1.67 1.07 1.07 hash-set!
1.67 1.07 1.07 language/cps/intset.scm:149:0:add-level
1.67 1.07 1.07 bytevector-u64-ref
1.67 1.07 1.07 ice-9/vlist.scm:449:0:vhash-cons
1.67 1.07 1.07 language/cps/renumber.scm:198:6
0.00 264932.27 0.00 language/cps/renumber.scm:78:4:visit
0.00 1101.88 0.00 language/tree-il.scm:425:2:lp
0.00 1041.74 0.00 language/tree-il/compile-cps.scm:323:0:convert
0.00 680.89 0.00 language/tree-il/peval.scm:716:2:loop
0.00 637.93 0.00 ice-9/boot-9.scm:228:5:map1
0.00 599.27 0.00 language/tree-il.scm:418:2:foldts
0.00 263.12 0.00 language/tree-il/debug.scm:39:2:visit
0.00 243.79 0.00 language/cps/intmap.scm:515:5:visit-branch
0.00 68.73 0.00 ice-9/boot-9.scm:2313:0:save-module-excursion
0.00 64.44 0.00 system/base/compile.scm:41:0:call-once
0.00 64.44 0.00 ice-9/boot-9.scm:842:2:with-throw-handler
0.00 64.44 0.00 ice-9/boot-9.scm:148:0:with-fluid*
0.00 64.44 0.00 system/base/compile.scm:153:8
0.00 64.44 0.00 system/base/compile.scm:210:0:read-and-compile
0.00 64.44 0.00 system/base/compile.scm:58:9
0.00 64.44 0.00 system/base/target.scm:51:0:with-target
0.00 64.44 0.00 system/base/compile.scm:135:0:compile-file
0.00 62.29 0.00 system/base/compile.scm:179:0:compile-fold
0.00 59.07 0.00 ice-9/boot-9.scm:2706:4
0.00 59.07 0.00 ice-9/boot-9.scm:2838:0:define-module*
0.00 59.07 0.00 primitive-load-path
0.00 59.07 0.00 ice-9/boot-9.scm:2751:0:resolve-interface
0.00 59.07 0.00 ice-9/boot-9.scm:2967:0:try-module-autoload
0.00 59.07 0.00 ice-9/boot-9.scm:2867:5
0.00 59.07 0.00 ice-9/boot-9.scm:2987:17
0.00 52.62 0.00 system/base/compile.scm:239:0:compile
0.00 50.48 0.00 language/cps/intset.scm:468:5:visit-branch
0.00 47.25 0.00 ice-9/psyntax.scm:2326:21:expand-let
0.00 37.59 0.00 language/cps/intmap.scm:247:2:adjoin
0.00 32.22 0.00 language/cps/compile-bytecode.scm:50:15
0.00 32.22 0.00 language/cps/compile-bytecode.scm:585:0:emit-bytecode
0.00 30.07 0.00 language/cps/slot-allocation.scm:841:0:allocate-slots
0.00 30.07 0.00 language/cps/compile-bytecode.scm:84:0:compile-function
0.00 24.70 0.00 language/cps/intmap.scm:246:0:intmap-add
0.00 22.55 0.00 language/cps/intmap.scm:104:0:clone-branch-and-set
0.00 17.18 0.00 language/cps/utils.scm:408:0:compute-sorted-strongly-connected-components
0.00 17.18 0.00 language/cps/slot-allocation.scm:195:0:compute-reverse-control-flow-order
0.00 13.96 0.00 language/cps/slot-allocation.scm:296:0:compute-live-variables
0.00 11.81 0.00 system/vm/assembler.scm:1024:0:intern-constant
0.00 11.81 0.00 language/cps/utils.scm:390:0:compute-strongly-connected-components
0.00 11.81 0.00 language/tree-il/compile-cps.scm:832:5:lp
0.00 10.74 0.00 language/cps/slot-allocation.scm:386:0:compute-lazy-vars
0.00 10.74 0.00 language/tree-il/compile-cps.scm:1082:0:compile-cps
0.00 9.67 0.00 ice-9/psyntax.scm:1078:6:expand-top-sequence
0.00 9.67 0.00 language/cps/compile-bytecode.scm:607:0:compile-bytecode
0.00 9.67 0.00 language/scheme/compile-tree-il.scm:29:3
0.00 7.52 0.00 ice-9/psyntax.scm:1150:30
0.00 6.44 0.00 language/cps/compile-bytecode.scm:594:0:lower-cps
0.00 6.44 0.00 language/cps/renumber.scm:161:0:renumber
0.00 5.37 0.00 language/cps/renumber.scm:112:0:compute-renaming
0.00 5.37 0.00 language/cps/renumber.scm:151:2:visit-fun
0.00 4.30 0.00 ice-9/psyntax.scm:1107:10:parse
0.00 4.30 0.00 language/tree-il/optimize.scm:31:0:optimize
0.00 3.22 0.00 ice-9/psyntax.scm:2931:8:match*
0.00 3.22 0.00 language/tree-il/compile-cps.scm:965:0:optimize-tree-il
0.00 3.22 0.00 language/cps/closure-conversion.scm:814:0:convert-closures
0.00 3.22 0.00 srfi/srfi-1.scm:458:2:fold
0.00 3.22 0.00 ice-9/psyntax.scm:1519:6:expand-macro
0.00 3.22 0.00 language/tree-il/analyze.scm:533:0:analyze-tree
0.00 3.22 0.00 language/cps/handle-interrupts.scm:56:0:add-handle-interrupts
0.00 2.15 0.00 srfi/srfi-1.scm:634:2:for-each
0.00 2.15 0.00 ice-9/psyntax.scm:2854:8:match-each
0.00 2.15 0.00 anon #x7f082ea18034
0.00 2.15 0.00 language/tree-il.scm:418:2:fold-values
0.00 2.15 0.00 language/cps/utils.scm:125:0:intmap-map
0.00 2.15 0.00 language/cps/closure-conversion.scm:349:2:visit-fun
0.00 2.15 0.00 language/tree-il/compile-cps.scm:934:0:cps-convert/thunk
0.00 2.15 0.00 srfi/srfi-1.scm:606:7:map2
0.00 2.15 0.00 ice-9/boot-9.scm:2067:0:call-with-deferred-observers
0.00 2.15 0.00 ice-9/eval.scm:292:11
0.00 2.15 0.00 language/cps/utils.scm:127:16
0.00 2.15 0.00 language/cps/intset.scm:466:2:intset-fold
0.00 2.15 0.00 language/cps/renumber.scm:53:2:visit
0.00 2.15 0.00 language/cps/intmap.scm:367:2:remove
0.00 2.15 0.00 anon #x7f0827a7c0e8
0.00 2.15 0.00 language/cps/utils.scm:394:4:visit-scc
0.00 2.15 0.00 system/base/compile.scm:203:0:read-and-parse
0.00 2.15 0.00 anon #x7f0827a600e8
0.00 1.07 0.00 language/tree-il/analyze.scm:984:0:validate-arity
0.00 1.07 0.00 language/cps/utils.scm:183:0:compute-defining-expressions
0.00 1.07 0.00 ice-9/psyntax.scm:2240:28:lp
0.00 1.07 0.00 ice-9/psyntax.scm:3084:2
0.00 1.07 0.00 language/cps/closure-conversion.scm:355:13
0.00 1.07 0.00 anon #x7f08265260e8
0.00 1.07 0.00 anon #x7f0826a990e8
0.00 1.07 0.00 language/cps/intset.scm:694:0:intset-subtract
0.00 1.07 0.00 language/cps/closure-conversion.scm:324:5
0.00 1.07 0.00 language/cps/utils.scm:202:0:compute-constant-values
0.00 1.07 0.00 language/cps/utils.scm:509:0:intset-pop
0.00 1.07 0.00 ice-9/psyntax.scm:1483:28
0.00 1.07 0.00 system/vm/debug.scm:162:0:debug-context-from-image
0.00 1.07 0.00 anon #x7f082696c0e8
0.00 1.07 0.00 anon #x7f08269470e8
0.00 1.07 0.00 anon #x7f08273920e8
0.00 1.07 0.00 anon #x7f0826f990e8
0.00 1.07 0.00 anon #x7f0826ee70e8
0.00 1.07 0.00 language/cps/closure-conversion.scm:318:2:add-uses
0.00 1.07 0.00 system/vm/linker.scm:357:0:add-symbols
0.00 1.07 0.00 anon #x7f08274e80e8
0.00 1.07 0.00 language/cps/utils.scm:432:2:component-successors
0.00 1.07 0.00 anon #x7f08268b70e8
0.00 1.07 0.00 system/vm/debug.scm:220:0:find-program-debug-info
0.00 1.07 0.00 ice-9/boot-9.scm:2628:0:module-gensym
0.00 1.07 0.00 anon #x7f08270220e8
0.00 1.07 0.00 anon #x7f0826cc40e8
0.00 1.07 0.00 anon #x7f0826f2f0e8
0.00 1.07 0.00 anon #x7f08269000e8
0.00 1.07 0.00 anon #x7f0826d960e8
0.00 1.07 0.00 system/vm/program.scm:53:0:program-name
0.00 1.07 0.00 anon #x7f0826d280e8
0.00 1.07 0.00 anon #x7f0826efd0e8
0.00 1.07 0.00 anon #x7f0826a060e8
0.00 1.07 0.00 anon #x7f082719e0e8
0.00 1.07 0.00 guix/memoization.scm:58:0
0.00 1.07 0.00 anon #x7f08267e00e8
0.00 1.07 0.00 anon #x7f08275440e8
0.00 1.07 0.00 anon #x7f08270590e8
0.00 1.07 0.00 language/cps/intset.scm:315:0:intset-remove
0.00 1.07 0.00 anon #x7f08272a60e8
0.00 1.07 0.00 anon #x7f082780a0e8
0.00 1.07 0.00 language/cps/intset.scm:470:5:visit-branch
0.00 1.07 0.00 system/vm/elf.scm:828:0:elf-section-by-name
0.00 1.07 0.00 language/cps/closure-conversion.scm:66:0:filter-reachable
0.00 1.07 0.00 language/cps/slot-allocation.scm:536:0:compute-shuffles
0.00 1.07 0.00 anon #x7f08266af0e8
0.00 1.07 0.00 anon #x7f08251ab0e8
0.00 1.07 0.00 anon #x7f082706f0e8
0.00 1.07 0.00 anon #x7f08273e60e8
0.00 1.07 0.00 language/cps/utils.scm:514:0:solve-flow-equations
0.00 1.07 0.00 language/cps/slot-allocation.scm:636:2:compute-shuffles
0.00 1.07 0.00 language/cps/slot-allocation.scm:487:0:solve-parallel-move
0.00 1.07 0.00 system/vm/linker.scm:635:0:allocate-elf
0.00 1.07 0.00 ice-9/psyntax.scm:2968:8:match
0.00 1.07 0.00 anon #x7f08274d40e8
0.00 1.07 0.00 anon #x7f082689f0e8
0.00 1.07 0.00 anon #x7f0826f430e8
0.00 1.07 0.00 anon #x7f08273aa0e8
0.00 1.07 0.00 anon #x7f0826f130e8
0.00 1.07 0.00 anon #x7f08272d00e8
0.00 1.07 0.00 language/cps/slot-allocation.scm:415:17
0.00 1.07 0.00 anon #x7f08268820e8
0.00 1.07 0.00 anon #x7f08268ce0e8
0.00 1.07 0.00 anon #x7f08269e00e8
0.00 1.07 0.00 anon #x7f0826d0d0e8
0.00 1.07 0.00 ice-9/psyntax.scm:1411:6:expand-expr
0.00 1.07 0.00 system/vm/linker.scm:371:0:allocate-segment
0.00 1.07 0.00 anon #x7f0826ffb0e8
0.00 1.07 0.00 language/cps/intset.scm:496:0:intset-union
0.00 1.07 0.00 anon #x7f0826cd70e8
0.00 1.07 0.00 system/vm/elf.scm:675:0:parse-elf64-section-header
0.00 1.07 0.00 system/vm/assembler.scm:1144:0:emit-load-constant
0.00 1.07 0.00 anon #x7f082721c0e8
0.00 1.07 0.00 anon #x7f08273d30e8
0.00 1.07 0.00 anon #x7f08251c70e8
0.00 1.07 0.00 anon #x7f08273c00e8
0.00 1.07 0.00 anon #x7f08272e60e8
0.00 1.07 0.00 anon #x7f0826cab0e8
0.00 1.07 0.00 anon #x7f082733b0e8
0.00 1.07 0.00 ice-9/psyntax.scm:2336:49
0.00 1.07 0.00 language/cps/intset.scm:204:0:intset-add!
0.00 1.07 0.00 anon #x7f0826fcc0e8
0.00 1.07 0.00 system/vm/linker.scm:706:0:link-elf
0.00 1.07 0.00 anon #x7f0826d5b0e8
0.00 1.07 0.00 procedure-name
0.00 1.07 0.00 anon #x7f08272b80e8
0.00 1.07 0.00 language/cps/compile-bytecode.scm:539:4:compile-cont
0.00 1.07 0.00 ice-9/psyntax.scm:272:4
0.00 1.07 0.00 language/cps/intmap.scm:366:0:intmap-remove
0.00 1.07 0.00 language/tree-il/peval.scm:415:9
0.00 1.07 0.00 anon #x7f0826fe70e8
0.00 1.07 0.00 ice-9/vlist.scm:254:0:vlist-fold
0.00 1.07 0.00 anon #x7f0826c520e8
---
Sample count: 60
Total time: 64.4376717 seconds (19.803807452 seconds in GC)
64.64user 0.18system 0:56.76elapsed 114%CPU (0avgtext+0avgdata 3966576maxresident)k
0inputs+7208outputs (0major+78248minor)pagefaults 0swaps
--8<---------------cut here---------------end--------------->8---
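For readers without the attachment, a profile of this kind can be collected with ‘gcprof’ from Guile’s (statprof) module. What follows is only a minimal sketch and not the attached profile-guilec.scm: ‘sample-exp’ and ‘bytecode’ are made-up names, and it compiles a small in-memory expression instead of a Guix module, so the resulting profile will be far shorter (possibly empty if no GC happens).

```scheme
(use-modules (statprof)              ; provides gcprof
             (system base compile))  ; provides compile

;; Hypothetical stand-in for a real module: a loop that builds thunks.
(define sample-exp
  '(let loop ((i 0) (acc '()))
     (if (< i 1000)
         (loop (+ i 1) (cons (lambda () i) acc))
         acc)))

;; Receives the compiler output so it can be inspected afterwards.
(define bytecode #f)

;; gcprof samples the stack each time a garbage collection runs, so the
;; profile points at code that allocates, then prints it when the thunk
;; returns.
(gcprof
 (lambda ()
   (set! bytecode (compile sample-exp #:to 'bytecode))))
```

Since samples are taken at GC time, such a profile shows which code allocates, not which objects remain live on the heap.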
time(1) reports a maximum resident set size of 3.8G (3966576 KiB),
though I see something around 900MiB in ‘top’.
‘vector-copy’ calls probably come from (language cps intmap) or
(language cps types). Unfortunately the profile tells us what part of
the code is allocating, but it doesn’t tell us if our memory is actually
full of vectors.
(I looked at
<https://wingolog.org/archives/2014/07/01/flow-analysis-in-guile> but
I’m not really sure how to check whether intmaps/intsets are to blame.)
I generated a source file with this to mimic the code in python.scm
after macro expansion:
--8<---------------cut here---------------start------------->8---
(define body
  '(let* ((a "a") (b 'b) (c 'c) (d (lambda () 'd)) (e (lambda () 'e))
          (f (lambda () 'f)) (g (lambda () 'g)))
     (let ((s (allocate-struct <foo> 20)))
       (struct-set! s 0 a)
       (struct-set! s 1 b)
       (struct-set! s 2 c)
       (struct-set! s 3 d)
       (struct-set! s 4 e)
       (struct-set! s 5 f)
       (struct-set! s 6 g)
       s)))

(let loop ((i 850))
  (unless (zero? i)
    (display
     (format #f "(define-public p~a ~s)\n"
             i body))
    (loop (1- i))))
--8<---------------cut here---------------end--------------->8---
‘vector-copy’ also comes first in the GC profile of that generated
file. Compiling it takes 105s (only 3s spent in GC) with 1.2G max RSS.
If we simplify the code (omit ‘let*’ or put fewer variables in there,
use ‘make-struct’ instead of ‘allocate-struct’ + ‘struct-set!’, etc.),
memory consumption drops a bit.
When compiling python.scm #:to 'cps, we end up with 1G max RSS in 6s.
Also, for reference, loading python.go peaks at 315M RSS:
--8<---------------cut here---------------start------------->8---
$ \time ./pre-inst-env guile -c '(use-modules (gnu packages python))'
0.18user 0.02system 0:00.18elapsed 112%CPU (0avgtext+0avgdata 315648maxresident)k
0inputs+0outputs (0major+7784minor)pagefaults 0swaps
--8<---------------cut here---------------end--------------->8---
The only conclusion I can draw is that cps-to-bytecode compilation seems
to be responsible for most of the memory consumption.
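One rough way to cross-check this per-stage comparison on a small input is to read the ‘heap-total-allocated’ counter from ‘gc-stats’ before and after compiling to each target. The sketch below is illustrative only: ‘allocated-during’, ‘sample-exp’, ‘to-cps’ and ‘to-bytecode’ are hypothetical names, and it measures bytes allocated, not peak RSS, so it cannot by itself confirm the figures above.

```scheme
(use-modules (system base compile))

;; Bytes allocated, according to gc-stats, while THUNK runs.
(define (allocated-during thunk)
  (let ((before (assq-ref (gc-stats) 'heap-total-allocated)))
    (thunk)
    (- (assq-ref (gc-stats) 'heap-total-allocated) before)))

;; Hypothetical miniature of the expanded package definitions.
(define sample-exp
  '(let* ((a "a") (b 'b)
          (d (lambda () 'd)) (e (lambda () 'e)))
     (list a b d e)))

;; Warm-up: load the compiler modules so that their own allocation
;; is not counted in the measurements below.
(compile sample-exp #:to 'bytecode)

;; Stop at CPS, then go all the way to bytecode.
(define to-cps
  (allocated-during (lambda () (compile sample-exp #:to 'cps))))
(define to-bytecode
  (allocated-during (lambda () (compile sample-exp #:to 'bytecode))))

(format #t "allocated up to cps:      ~a bytes\n" to-cps)
(format #t "allocated up to bytecode: ~a bytes\n" to-bytecode)
```

The difference between the two figures approximates what the CPS-to-bytecode stages allocate for this expression; on an input this small the numbers are only indicative.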
Thoughts on all this?
Thanks,
Ludo’.
Attachment: profile-guilec.scm
Description: the compilation script