guile-devel

Re: guile and emacs: unexec


From: Ken Raeburn
Subject: Re: guile and emacs: unexec
Date: Sun, 14 Jun 2009 01:21:37 -0400

On Jun 13, 2009, at 09:06, Andy Wingo wrote:
Hi Ken,

On Fri 12 Jun 2009 07:02, Ken Raeburn <address@hidden> writes:

I'm glad to see the emacs-lisp work is progressing.  As it happens, a
month or so ago I blew some of the dust off my old guile-emacs project
and started working on it again too.  This flavor of emacs+guile work
aimed to replace Lisp objects in Emacs with Guile objects at the lowest
level (numbers, cons cells; symbols and such become smobs) and then
work upwards from there.

Very interesting! To be clear -- the goal would be to represent as much
of Emacs using cheap Guile structures as possible: numbers and cons
cells and such, and represent specific Emacs objects as smobs? That's
probably a good idea.

Yes -- for now, that includes anything I haven't converted, including strings, symbols, vectors of objects, hash tables, etc. Many of what are currently smobs should eventually be converted to using Guile's versions, either directly or with some simple wrapping. They need to stay identifiable in Lisp as the correct object types, so I can't implement a Lisp string with, say, a Guile list containing a string plus text-property data. In the long term perhaps some of them could be implemented partly or fully in Scheme, but I don't want to diverge radically while I still need to track the main Emacs code base. (Please let me keep the illusion that replacing the fundamental object representation, allocator and garbage collector, and compensating for initialization problems throughout the code, isn't all that radical a change. :-)

But I'm not worrying about that much right now -- if the representation abstraction is complete and correct, the existing Emacs code should be able to pull out all the right data from the smobs, and results should be indistinguishable. Well, except that integer and floating ranges may be different, hash table ordering changes -- simple, reasonable, and well-understood differences. It's not quite there yet.

I figure, once I've got this set of changes working correctly (i.e., nearly indistinguishable, no random unexplained errors or differences in behavior), then I can tackle the next steps with more confidence that differences observed there are due to the new changes in progress, not semantic differences previously introduced with further-reaching effects than I expected.

It's also kind of appealing to have something at intermediate stages that I might be able to show off, and say "hey, this works well enough that you can try it out; want to help me on the next steps?" (And since I'm getting into all this now, I *would* like some help. I was just intending to fix a few more problems before making the plea. :-)

I'm specifically *not* trying to do some of the other things that have been discussed but aren't about running Emacs -- make buffers independent objects that can be used outside of Emacs, stuff like that. That can come later (or not), and I'd be glad to see it happen, but getting Emacs running at all is a big enough project for me on my own.

Symbols however should probably be represented as Guile symbols, not
smobs. I think that you will find that with a more compilation-centric
approach, we will be able to keep more simple datatypes, as we compile
the procedures that operate on those data types to appropriate code.

Eventually, yes, I think so. They should probably be one of the next things to change, though some like vectors and strings might be simpler. I'm also concerned about the performance impact of making such a switch; another reason for getting something working soon is so it's practical to look at performance questions.

I've updated to recent Emacs sources and Guile 1.8.6. I've gotten it
to a point where it seems to start up fine in tty mode, reads in (and
does color highlighting of) C files and directories, does some other
basic stuff. I'm tweaking it now to see if I can get more stuff
working (like Cocoa support and "make bootstrap") and do more
extensive testing.

Very neat! That's fantastic that you were able to get it this far, I
didn't know that was possible.

I actually had it pretty far along once or twice before (I seem to keep reviving this every few years, and spend a lot of time updating to newer code bases), but I think I've managed to push it a bit further than I had it earlier. With just me working on it, depending on the demands of my job, there tend to be large periods when no progress gets made, and it doesn't keep up with the upstream sources; the prospect of having to do a bunch of catch-up work just makes it that much less appealing to get back into it. It's been moving forward in spurts for over a decade now, very slowly. :-(

If this is an effort that you want to pay off in the future, though, I
would strongly suggest updating to the 1.9/2.0 series of Guile. The
expressive range of Guile's multilingual facilities is much higher
there, and significantly different from 1.8.

I was looking at updating, but ran into the -I ordering problem I reported. Since that's fixed, I'll try again sometime.

The multilingual facilities aren't very important to me right now -- like I said above, I'm mostly just switching some object representations now, and I'm still using the Emacs code for any multilingual stuff. Eventually that should change, but what I want of Guile right now is a nice, simple byte array I can stick string data into. :-) Emacs 23 is going to go out with the Emacs version of the support, and yanking out anything made available to Lisp programmers isn't going to go over very well. Of course, it wouldn't be very good to wind up with duplicated work, or redundant or conflicting interfaces, either.

OTOH, the emacs lisp support is not yet up to the level that it is at in
1.8, so perhaps now is not yet the time.

And, I haven't started using any of that code yet, either... that's another big change to try at some point when everything else is looking solid. And, I assume it expects the use of Guile symbols and Guile strings at least? In order to make this switch, too, the semantics really have to match Emacs Lisp -- stuff like indirect symbols, buffer- or frame-local bindings, etc. And all the Emacs C code needs to know how to look up values (or function values, or property lists, or whatever) when given Guile symbols. And then there's the lexical binding branch work, which I haven't even looked at yet.

One really big hiccup I've run into, which I've sort of sidestepped for
the moment: Guile is not unexec-friendly.

There is a way to build Emacs so it doesn't use unexec, but it then has
to load a lot of Lisp code at run time, really killing the startup
performance, and I don't think it's tested all that much (e.g., "make
bootstrap" doesn't work even without the Guile hacks). To really make
this project work, I need to be able to link against Guile (static is
fine, and probably necessary), do a bunch of Lisp/Scheme processing,
write out a memory image into a new executable, and later be able to
run that executable.

It's true that Guile doesn't do unexec currently. It might in the
future -- obviously it will if you implement it of course ;)

But I would ask that you reconsider your approach to making Guile-Emacs load quickly. There is no a priori reason that loading Lisp code should
be slow. With Guile-compiled elisp, loading a file is just mapping it
into memory -- the same as you have with an image. The loaded code needs to be run to establish definitions, but that is a very quick operation.
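
As a rough illustration of the "loading is just mapping" point above, here is a minimal Python sketch. It is not Guile's loader: the file contents, the `MAGIC` header, and the helper name `map_compiled_file` are all made up for the example. It only shows that mapping a file makes its bytes available without a parsing pass.

```python
import mmap
import os
import tempfile

# Made-up header standing in for whatever a real compiled file starts with.
MAGIC = b"HYPOTHETICAL-COMPILED-ELISP"


def map_compiled_file(path):
    """Map a file read-only; return the open file and the mapping."""
    f = open(path, "rb")
    try:
        view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    except Exception:
        f.close()
        raise
    return f, view


# Write a stand-in "compiled" file: header plus 4096 zero bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(MAGIC + bytes(4096))
    path = tmp.name

# Mapping does not read or parse the body; pages are faulted in on access.
f, view = map_compiled_file(path)
header = bytes(view[:len(MAGIC)])
mapped_size = len(view)

view.close()
f.close()
os.unlink(path)
```

The definitions in a real compiled file would still have to be run after mapping, as the paragraph above notes; the sketch covers only the mapping step.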

I don't think the current Lisp reader is all that slow, but it has to load and run quite a bit of stuff, especially with the internationalization support. Especially during a "bootstrap" operation, when most of the stuff it loads is uncompiled Lisp source code.

It seems to me that switching to Guile-compiled elisp for startup would require, well, basically most of the remaining work of my project, including switching to the Guile-based Lisp reader and evaluator, wouldn't it? So we're looking at some non-trivial changes here. They're desirable changes, in the long run, but taking this route would mean no efficient startup of guile-emacs any time soon, which in turn slows down the development cycle. The unexec support may be useless once we get there, but right now it's a much shorter path to something useable I can show off.

(Fixing up the "interactive scheme mode" that talks to Guile directly would be nice to show off, too. My current one is kind of a lame hack.)

I agree that heap saving could be slightly faster. But I think that
Emacs should be able to load from bytecode within 100 ms or so /with the current Guile-VM code/ -- and even faster if we do native ahead-of-time
compilation at some point.

I'd certainly like to get there eventually.

Really, it comes down to wanting something I can make work now, instead of a project with minimal, uninteresting intermediate results that may or may not pay off in another decade or so, and doesn't get anyone else interested in helping out. With the current state of Emacs, that means unexec is kind of needed. It can sort of work without it, but not well -- and that's true of the upstream Emacs code base too, but no one on that side cares very much because unexec works for them everywhere.

I've got some political concerns here too. There has also been some resistance, when this project has come up on the Emacs lists, to switching away from the current Lisp evaluator for any reason, even if Guile support is added (it's not broken, major changes involve significant risk, don't see the benefit, etc); there's also been support, but it's contentious. So my rather vague plan has involved putting off even addressing that possible switch until I can show clear advantages and no blatant drawbacks (like performance, or correctness, or handling of out-of-memory conditions) to using Guile. I'd rather not discuss it from a position of weakness and uncertainty; better to have working code we can experiment with and numbers we can point to. (But first, let's experiment and generate numbers ourselves, and see if we need to fix bugs.) Then we can discuss our options.

I don't know how much chance there would be for getting it ready in time for Emacs 24, but with enough help, I think Emacs 25 should be doable; possibly even 24, who knows?

Any record of current threads needs to go away, and be replaced with
info on the new one-and-only thread in the new process; I'm building
without thread support for now to get around it.  Any record of stack
regions to be scanned for SCM objects likewise needs resetting.
Allocated objects must *not* go away, and must continue to be processed
by the garbage collector, so I can't just reinitialize everything.
Assigned smob types must remain in effect, and for now I'm ignoring the
possibility that some smobs may need some kind of reinitialization.
Mutexes... well, I don't know if they need reinitializing; POSIX is
kind of unclear on interactions with unexec. :-)  I expect
reinitializing them is probably safe, even if not required in some
implementations.
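
The checklist above can be modelled in a few lines. This is a toy Python sketch, not Guile's actual data structures: `RuntimeState`, its field names, and `reinit_after_dump` are hypothetical names invented for the illustration. It separates what must be reset in the restarted process (thread records, stack regions, mutexes) from what must survive (allocated objects, assigned smob types).

```python
import threading
from dataclasses import dataclass, field


@dataclass
class RuntimeState:
    """Hypothetical model of the runtime state that gets dumped."""
    threads: list = field(default_factory=list)        # per-thread records
    stack_regions: list = field(default_factory=list)  # ranges scanned for roots
    heap: list = field(default_factory=list)           # allocated objects
    smob_types: dict = field(default_factory=dict)     # assigned type tags
    lock: threading.Lock = field(default_factory=threading.Lock)


def reinit_after_dump(state, main_thread, main_stack):
    """Reset process-specific records; leave heap and type table alone."""
    state.threads = [main_thread]        # only the new one-and-only thread
    state.stack_regions = [main_stack]   # old stack addresses are meaningless
    state.lock = threading.Lock()        # fresh mutex, assumed safe to recreate
    return state


# State as it might look in the dumping process.
dumped = RuntimeState(
    threads=["dump-main", "dump-helper"],
    stack_regions=[(0x7000, 0x8000), (0x9000, 0xA000)],
    heap=[("cons", 1, 2), ("string", "hello")],
    smob_types={"lisp-symbol": 1, "lisp-string": 2},
)
heap_before = list(dumped.heap)
old_lock = dumped.lock

restored = reinit_after_dump(dumped, "main", (0x1000, 0x2000))
```

The model deliberately ignores the open question raised above, namely whether some smobs need their own reinitialization step.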

This could be complicated if we merge in the BDW-GC branch, to use
libgc. Note that SCM does have unexec, IIRC; we could steal parts of
their implementation.

That might work, yes... or if not, it sounds like I'd be stuck with using an old Guile, or getting the CANNOT_DUMP option working and suffering with the slow startup.

(And, this reminds me -- there are still some likely GC-related bugs with scm_leave_guile/scm_enter_guile that should be fixed up. I got them removed from the API years ago, but they're still used internally in threads.c, down below the comment with my old email explaining the doom they may bring upon us. Does BDW-GC scrap that code finally? Please?)

Is this something that could be useful to anyone outside of Emacs?

Unexec certainly could, to deliver self-contained binaries. But TBH I
think the booting-from-compiled-files option is more maintainable. In
any case this would be a neat hack. Have fun! :)

I agree, compiled files would work better, but I doubt we can push the Emacs folks to move in that direction first. They're happy with unexec for now.

P.S.  If anyone wants to take a look at my current work,
http://www.mit.edu/~raeburn/guilemacs/guile-emacs.tar.bz2
has a snapshot from tonight.

Cool! Have you considered using git, and branching from Emacs' git
mirror? That way it is trivial to set up something other people can
comment on, in easily-digestible patch chunks.

Yep, but I need to get proficient with it first, and haven't put in the time yet; until then I'm using subversion in a rather clumsy fashion (often just checkpointing untested merges, and my Emacs sources have the CVS admin files checked in so I can update easily). If it's something other people want to actually work on, on the other hand, we could set up something via sourceforge or savannah or whatever. But only if there's actually going to be additional help coming....

Ken