[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gcl-devel] README.macosx
From: |
Aurelien Chanudet |
Subject: |
[Gcl-devel] README.macosx |
Date: |
Sun, 18 Jul 2004 18:37:19 +0200 |
Hi all,
Here is a preliminary README.macosx file. I'm planning to complete this
file during the summer. Comments much appreciated !
Aurelein
Mac OS X implementation notes
aurelien.chanudet <at> m4x.org
July 18, 2004
This file briefly discusses Mac OS X implementation notes.
* Third party malloc(3) calls
In GCL, for the sake of efficient memory management, there should be no
calls to standard memory
allocation routines such as malloc(3) or free(3). Instead, GCL's own
memory allocation routines
should be used. In particular, this means that third party calls to
malloc(3) (remember that GCL
uses, say, gmp which is likely to call malloc(3)) should be intercepted
and re-routed to GCL's
own functions. On Linux, this is done at build time, during the linking
stage. This is possible
because symbols in Linux live in a flat namespace. By contrast, symbols
in Mac OS X live in a
two-level namespace. This means that it is not possible, on Mac OS X,
to override the default
implementation of malloc(3) as provided by libc. The trick on Mac OS X
is to use Darwin's zone
mechanism. Darwin has a poorly documented API allowing advanced memory
management (see
<objc/malloc.h>). Most applications have only one zone, also called the
default zone, which is
automatically created the first time the program calls malloc(3). In
GCL, an extra zone is
created at initialization time and is then made to be the default zone.
* Broken sbrk(2) replacement strategy
sbrk(2) simply does not work on Mac OS X. Unfortunately, GCL heavily
relies on it. Indeed,
GCL has its own page level memory management scheme. Regular Mach-O
applications have at least
three segments : the text segment (__TEXT), the data segment (__DATA)
and the link-edit segment
(__LINKEDIT). The first two segments have equivalent ELF segments (the
third segment contains
link-edit information such as references to imported symbols). When the
process is bootstrapped
in memory by the dynamic loader, these segments are mapped in memory
and any subsequent memory
allocation takes place after the end of the link-edit segment. This
layout turns out to be
problematic because GCL assumes that memory allocation takes place
immediately after the end of the
data segment. That is, I suspect that, on Linux, calling sbrk(2)
results in extending the size
of this data segment ; however, on Mac OS X, the size of the data
segment cannot vary at runtime.
For this reason, an extra data segment is created at initialization
time and is inserted between
the first data segment and the link-edit segment resulting in the
following memory layout :
+---------------------------------------------------------------------+
| __TEXT segment as created by gcc |
+---------------------------------------------------------------------+
<- DBEGIN
| __DATA segment as created by gcc (size is fixed) |
+---------------------------------------------------------------------+
<- mach_mapstart
| |
<- heap_end
| |
<- core_end
| |
<- mach_brkpt (= my_sbrk(0))
+---------------------------------------------------------------------+
<- mach_maplimit (= DBEGIN + MAXPAGE)
| __LINKEDIT segment created by gcc but moved toward higher addresses |
+---------------------------------------------------------------------+
The heap ranges from DBEGIN to DBEGIN+MAXPAGE. The area of memory
between mach_mapstart
and mach_maplimit is our extra data segment. To bridge the gap with our
first section (third
party malloc(3) calls), we can say that all memory allocation happens
between mach_mapstart
and mach_brkpt. In particular, this means that the area ranging from
mach_brkpt to mach_maplimit
is mere wiggle room (memory is set to zero). You might wonder how an
extra data segment can
be programmatically inserted once the process is mapped in memory : the
trick is to write
a modified executable file (containing sufficient information for the
dynamic loader to know
how to set up this memory layout) and then use execv(2).
* Unexec()'ing
Unexec()'ing is the process of capturing the memory footprint of a
running process and storing
it to an executable file for later re-execution. Fortunately, not the
whole address space
has to be saved to disk. Indeed, because the virtual address space is
sparse, only non zero
ranges have to be saved. In particular, this means that the region
ranging from mach_brkpt
to mach_maplimit isn't saved to the file (thus resulting in a segment
whose filesize is
less than its virtual memory size, see Mach-O Runtime Architecture for
details). The bulk of
the work comes from Andrew Choi's work for Emacs.
* BFD Mach-O port
GCL has the ability to compile Lisp code to native object code, load
the compiled code into the
running image, link the code and execute it. Most of the time, this
kind of functionality is
achieved by compiling shared object libraries (.so files on Linux,
.dylib files or .bundle files on
Mac OS X) and then loading the shared object library using the
dlopen(3) interface (note, however,
that Mac OS X has its own replacement solution for dlopen(3), which is
the dyld(3) interface, but
the concept remains the same). Handy as the dlopen strategy is, it is a
fairly time consuming one
because the external linker has to be called in order to turn raw
object files (.o) as output by
the compiler into shared object files. To speed things up at the
expense of more development, GCL
implements its own toy linker on top of the BFD interface. BFD is the
Binary File Descriptor library.
The official BFD distribution supports ELF well, but does not really
support Mach-O. For this reason,
I had to extend the official Mach-O back-end in order to support
linking (see files "mach-o.c" and
"mach-o-reloc.c" in the local BFD tree). There's nothing special to say
about this port, excepted
that the job of adding relocation support was tough. To summarize at
this stage, there are two
dynamic loading mechanism available for Mac OS X. The old one, slow and
no longer supported relies
on the dyld(3). And the new one, built on top of my work to extend the
Mach-O back-end. At this
stage, there are no known bugs in the relocation code (that is, Maxima
and ACL2 compile fine),
but there's a high probability that as yet hidden bugs remain.
* Stratified Garbage Collection
To be completed.
* Further references :
- Mach-O Runtime Architecture
- Linkers & Loaders by John R. Levine
- [Gcl-devel] README.macosx,
Aurelien Chanudet <=