guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: guile performance - Ackermann function: way slower than emacs, slow


From: Andy Wingo
Subject: Re: guile performance - Ackermann function: way slower than emacs, slower still if compiled
Date: Wed, 05 Aug 2009 11:06:01 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux)

Hi Ken,

On Tue 04 Aug 2009 11:28, Ken Raeburn <address@hidden> writes:

> I implemented Ackermann's function A(m,n), a recursive function with
> scary performance characteristics, based on the definition given at
> wikipedia involving lots of +1 and -1 operations...

Guile's VM does not yet have opcodes for +1 and and -1. It probably
should, though. Still, this is a fairly artificial test.

> I wrote both elisp and scheme versions; I changed the recursion to an
> explicitly-managed stack of values, to avoid some of the dynamic-vs- 
> lexical argument binding issues, turn the native stack usage into large
> numbers of cell allocations and discards, and avoid exceeding  the
> maximum Lisp evaluation depth Emacs permits.

OK now we know we are entering the realm of the artificial benchmark :-)

> Not surprisingly,  it's
> not too fast, and Emacs spends a lot of time in garbage  collection
> passes even with the byte-compiled version.  My time-test  function here
> counts seconds of real time (on an 8-core Mac without a  lot of other
> stuff going on, so CPU usage is probably very close):
>
> ELISP> (time-test '(ackermann 3 1))
> 1.6e-05
> ELISP> (time-test '(ackermann 3 6))
> 0.036018
> ELISP> (time-test '(ackermann 3 7))
> 0.16574899999999998
> ELISP> (time-test '(ackermann 3 8))
> 0.6457170000000001
> ELISP> (time-test '(ackermann 3 9))
> 2.482042
> ELISP> (time-test '(ackermann 4 0))
> 1.4e-05
> ELISP> (time-test '(ackermann 4 1))
> 642.746514

OK

> With the installed guile-1.8.7 binary and the interpreted Scheme version
> of the code, performance was worse:
>
> % time guile -l ack.scm -c '(ackermann 3 1)'
> 0.009u 0.003s 0:00.01 0.0%    0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 6)'
> 0.409u 0.078s 0:00.52 90.3%   0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 9)'
> 25.300u 4.540s 0:29.85 99.9%  0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 4 1)'
> 6016.150u 1122.109s 1:58:59.32 99.9%  0+0k 8+4io 149pf+0w

So, you're also timing Guile startup here, which takes about 20 ms.

> Yes, that's just shy of two hours for the last one, where Emacs took
> less than 11 minutes.

Of course, that's comparing compiled code to interpreted code, but that
is a pretty bad result, yes.

> I built and installed 1.9.1, so I could test with more recent code:
>
> % time guile -l ack.scm -c '(ackermann 3 1)'
> 0.026u 0.006s 0:00.03 66.6%   0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 6)'
> 0.761u 0.141s 0:00.90 100.0%  0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 9)'
> 47.242u 8.644s 0:55.89 99.9%  0+0k 0+0io 0pf+0w
> %
>
> Since this behaved worse than 1.8.6, I skipped the 4,1 test.

Yes, Guile 1.9's interpreter will be slower than 1.8's.

> I tested
> the compiled version:
>
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 1))'
> 0.012u 0.004s 0:00.01 100.0%  0+0k 0+0io 0pf+0w
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 6))'
> 0.772u 0.333s 0:01.10 100.0%  0+0k 0+0io 0pf+0w
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 9))'
> 48.728u 22.016s 1:10.75 99.9% 0+0k 0+0io 0pf+0w
> %
>
> Not much better than loading the .scm file, and only better for small
> values of n... in fact, with m=3 and n>=6 the performance seems worse
> in the precompiled case:
>
> % time guile -l ack.scm -c '(ackermann 3 10)'
> 181.997u 34.928s 3:36.94 99.9%        0+0k 0+0io 0pf+0w
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 10))'
> 196.678u 85.618s 4:42.34 99.9%        0+0k 0+0io 0pf+0w
> %
>
> Poking at it a couple of times with Apple's process sampling tool, it
> does seem to be spending a bit of time in garbage collection -- but
> scm_leave_guile (called from scm_async_click via
> scm_pthread_mutex_lock) and getrusage (called from
> scm_c_get_internal_run_time via mytime) also seem to show up on the
> stack a significant fraction of the time, and each of those appears to
> be a comparable if not bigger time sink.

This could be simply that Mac OS has different performance
characteristics than Linux. If locking uncontended mutexen or getting
the resource usage of a process are expensive, that doesn't sound very
good...

> According to the execution timing with the compiled code, GC actually
> doesn't seem to be a huge fraction of the run time, maybe 8-9%:
>
> scheme@(guile-user)> ,t (ackermann 3 8)
> 2045
> clock utime stime cutime cstime gctime
> 18.43 12.96  5.47   0.00   0.00   1.66
> scheme@(guile-user)> ,t (ackermann 3 9)
> 4093
> clock utime stime cutime cstime gctime
> 73.47 51.61 21.85   0.00   0.00   5.95
> scheme@(guile-user)>
>
> The latter is the case that took under three seconds in Emacs,
> remember.

Perhaps the deal is that 1+ is not being inlined. It seems to be calling
out to the 1+ procedure, which is a primitive, actually making it a
bit slower. I'll see what I can do.

> In retrospect, this is much heavier on the interpretation or execution
> of some of the simple operations than I intended, and Emacs benefits a
> lot from opcodes like "add1", and I'm more interested specifically in
> exercising allocation and GC at the moment, so it's not that useful a
> test for me after all.  But I thought I'd send this along anyways in
> case someone was interested.  In particular the fact that the
> performance of the precompiled version gets worse as n grows might be
> something to investigate, unless the cause is already known....

Thank you for the report, I'll investigate.

Andy
-- 
http://wingolog.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]