chicken-users

Re: [Chicken-users] Re: [gambit-list] Help With Memory


From: Marc Feeley
Subject: Re: [Chicken-users] Re: [gambit-list] Help With Memory
Date: Sat, 27 Sep 2008 10:20:18 -0400

Hi Felix. I did not mean to drag you into this discussion. I know performance benchmarking is one of your buttons that is best left untouched!

All of this started with this message on the Gambit mailing list about the performance claim that call/cc in Chicken was "free" because of Cheney on the MTA and that Gambit used the same approach:

On 24-Sep-08, at 11:14 AM, Per Eckerdal wrote:

Chicken.  Cheney on the MTA gives you call/cc essentially
for free - it's just as fast as any other function call.

I was under the impression that Gambit also did this.. Am I wrong?

/Per

My response was that Gambit's continuations are based on a completely different approach which gives just as good performance, using the ctak and fibc benchmarks as simple evidence. A complete analysis of the two approaches would take a lot of effort, which is why I used these benchmarks as a quick-and-dirty way to evaluate the performance (it turns out that ctak is much better than fibc as a benchmark for call/cc because fibc does many other things than just call/cc, i.e. it measures other optimizations of the compiler).
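For readers who have not seen ctak: it is the classic tak function with every call routed through a freshly captured continuation, so almost all of the run time goes into creating and invoking continuations. The sketch below is only an illustration and is not the benchmark source used for the measurements discussed here. Python has no call/cc, so continuations are written as explicit closures (continuation-passing style) and a small trampoline keeps the stack flat. The names trampoline, ctak_aux and ctak are chosen for this sketch.

```python
# Illustrative sketch only: continuations are explicit closures because
# Python has no call/cc. This is not the Scheme benchmark source.

def trampoline(thunk):
    """Keep calling zero-argument thunks until a non-callable value appears."""
    result = thunk
    while callable(result):
        result = result()
    return result


def ctak_aux(k, x, y, z):
    """tak in continuation-passing style: hand the result to k, never return it."""
    if not y < x:
        # Base case: invoke the continuation with z (deferred as a thunk).
        return lambda: k(z)
    # Each nested call receives a new continuation closure, mirroring the
    # continuation that call/cc would capture at that point in ctak.
    return lambda: ctak_aux(
        lambda a: ctak_aux(
            lambda b: ctak_aux(
                lambda c: ctak_aux(k, a, b, c),
                z - 1, x, y),
            y - 1, z, x),
        x - 1, y, z)


def ctak(x, y, z):
    """Entry point: start with the identity continuation."""
    return trampoline(lambda: ctak_aux(lambda v: v, x, y, z))


if __name__ == "__main__":
    print(ctak(18, 12, 6))  # the customary benchmark arguments
```

Because every step does little more than build a closure and call one, a program of this shape says a lot about the cost of continuations and very little about anything else, which is the property that makes ctak a more direct probe than fibc.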

Let me reiterate that I'm not trying to compare Gambit and Chicken as systems. If that was the case I would have much more to say and obviously would conclude that Gambit is better ;-)

Marc

On 27-Sep-08, at 9:03 AM, felix winkelmann wrote:

On Fri, Sep 26, 2008 at 5:32 PM, Marc Feeley <address@hidden> wrote:

The conclusion from my benchmarks is quite different. Chicken does not outperform Gambit on these benchmarks. There is so little other stuff happening than call/cc in these benchmarks that it would appear that the performance of call/cc in Chicken and Gambit is essentially the same (to within a few percent).

Why not simply say: chicken and gambit are roughly in the same ballpark?

In the end, I have learned that nearly every performance assumption I made was wrong, and I'm a pretty experienced Scheme coder. Consider performing benchmarks like this and trying to extract any kind of practical relevance from the fact that program X on implementation Y with optimization settings Z takes 2% longer than on implementation Q. Are you sure you have built both implementations with maximal performance settings? Have you measured how much of the runtime performance is caused by the memory patterns in this particular benchmark? How do you know how your system configuration and hardware setup influence the outcome? Have you used optimal optimization settings for all implementations for this benchmark? Have you analyzed the compiler output to look for opportunities to tweak those settings for this particular benchmark? Do you know enough about chicken's internals and compiler options to choose the optimal combination? (You couldn't, just as I couldn't for Gambit.) It's all just assumptions.

The very reason Scheme and Lisp have so little acceptance and are not more widespread is that their implementors are so obsessed with performance (for hystorical raisins, of course), instead of making their implementations easier to work with, more practical and more useful.

Nevertheless I understand this obsession; it's lots of fun, after all. :-)

So: CheneyOnTheMTA is an elegant concept that unifies fast first-class continuations, fast allocation, generational GC and not-too-difficult FFI in a relatively simple framework. Chicken's compiler is sufficient, but there are many opportunities to improve performance, some of which will be addressed, but which aren't really that important. A real module system (soon to come!) and 400+ libraries are what will make users happy, not 5% better performance.

I believe that CheneyOnTheMTA is more memory-efficient than other Lisp-implementation techniques. I also believe that the CPS-output of this scheme is more C-compiler friendly and easier to compile on stock machines. I believe that COTMTA (that's a nice abbreviation - I think I'll use that from now on) makes cross-module calls more efficient than trampoline-style, which is important for large code-bases that use separate compilation and dynamically loaded plugins. These are all assumptions that may possibly be completely wrong.

Keep up the good work, Marc! Gambit is cool. But chicken is better. ;-)


cheers,
felix




