guile-devel

Re: ffi docs


From: Neil Jerram
Subject: Re: ffi docs
Date: Thu, 15 Apr 2010 23:36:27 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

Andy Wingo <address@hidden> writes:

> Hi,

Hi Andy,

I agree with Ludo that this work is really great.  I had some thoughts
while reading through; they follow below and cover the draft as far as
the start of the Foreign Structs section.  I'll try to comment on the
rest tomorrow.

         Neil


>    But yet we as programmers live in both worlds, and Guile itself is
> half implemented in C. So it is that Guile's living half pays respect
> to its dead counterpart, via a spectrum of interfaces to C ranging from
> dynamic loading of Scheme primitives to dynamic binding of stock C
> library prodedures.

s/prodedures/procedures/

>    We titled this section "foreign libraries" because although the name
> "foreign" doesn't leak into the API, the world of C really is foreign
> to Scheme - and that estrangement extends to components of foreign
> libraries as well, as we see in future sections.

I'm not sure what the message is here.

>  -- Scheme Procedure: dynamic-link [library]
>  -- C Function: scm_dynamic_link (library)

The code further down implies that LIBRARY can be omitted, and that
'(dynamic-link)' then returns an object representing libguile itself.
Should that be mentioned in the following doc?

>      Find the shared library denoted by LIBRARY (a string) and link it
>      into the running Guile application.  When everything works out,
>      return a Scheme object suitable for representing the linked object
>      file.  Otherwise an error is thrown.  How object files are
>      searched is system dependent.
>
>      Normally, LIBRARY is just the name of some shared library file
>      that will be searched for in the places where shared libraries
>      usually reside, such as in `/usr/lib' and `/usr/local/lib'.
>
>      When LIBRARY is omitted, a "global symbol handle" is returned.
>      This handle provides access to the symbols available to the
>      program at run-time, including those exported by the program
>      itself and the shared libraries already loaded.
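
For comparison, here is a minimal C sketch of what a "global symbol
handle" looks like at the POSIX level.  It assumes a platform with
dlopen/dlsym and is not a description of how libguile implements
dynamic-link; lookup_global_symbol is a hypothetical helper name.

```c
#include <assert.h>
#include <dlfcn.h>   /* POSIX dynamic-linking interface */
#include <stddef.h>

/* Hypothetical helper (not part of libguile): look NAME up among the
   symbols already visible to the running program.  Returns NULL when
   the symbol cannot be found. */
void *lookup_global_symbol(const char *name)
{
    assert(name != NULL);

    /* A NULL file name asks for a handle on the main program together
       with the shared libraries already loaded alongside it. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return NULL;

    /* The linker hands back a bare address; no type comes with it. */
    void *address = dlsym(handle, name);
    dlclose(handle);
    return address;
}
```

Looking up "strlen" this way returns an address that the caller must
cast to the right function type before use.  On some systems (older
glibc, for instance) the program has to be linked with -ldl.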

>    Given some set of C extensions to Guile, the next logical step is to
> integrate these glue libraries into the module system of Guile so that
> you can load new primitives into a running system just as you can load
> new Scheme code.
>
>  -- Scheme Procedure: load-extension lib init
>  -- C Function: scm_load_extension (lib, init)
>      Load and initialize the extension designated by LIB and INIT.
>      When there is no pre-registered function for LIB/INIT, this is
>      equivalent to
>
>           (dynamic-call INIT (dynamic-link LIB))
>
>      When there is a pre-registered function, that function is called
>      instead.
>
>      Normally, there is no pre-registered function.  This option exists
>      only for situations where dynamic linking is unavailable or
>      unwanted.  In that case, you would statically link your program
>      with the desired library, and register its init function right
>      after Guile has been initialized.

Should there be a reference from here to wherever the registration API
is covered?

>      LIB should be a string denoting a shared library without any file
>      type suffix such as ".so".  The suffix is provided automatically.
>      It should also not contain any directory components.  Libraries
>      that implement Guile Extensions should be put into the normal
>      locations for shared libraries.  We recommend to use the naming
>      convention libguile-bla-blum for a extension related to a module
>      `(bla blum)'.

I believe this will shortly be out of date, won't it, given our desire
to support parallel installs?

>    A compiled module should have a specially named "module init
> function".  Guile knows about this special name and will call that
> function automatically after having linked in the shared library.  For
> our example, we replace `init_math_bessel' with the following code in
> `bessel.c':
>
>      void
>      init_math_bessel (void *unused)
>      {
>        scm_c_define_gsubr ("j0", 1, 0, 0, j0_wrapper);
>        scm_c_export ("j0", NULL);
>      }
>
>      void
>      scm_init_math_bessel_module ()
>      {
>        scm_c_define_module ("math bessel", init_math_bessel, NULL);
>      }
>
>    The general pattern for the name of a module init function is:
> `scm_init_', followed by the name of the module where the individual
> hierarchical components are concatenated with underscores, followed by
> `_module'.

Is this still correct?  IIUC it only makes sense as part of the ability
we once had for a (use-modules (...)) call to find a .so and bootstrap
it automatically.  (Unless that has been reinstated...)
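
For reference, the naming pattern quoted above can be sketched in C as
follows.  module_init_name is a hypothetical helper, not part of
libguile, and it assumes the module name is given as components
separated by single spaces, each of which is already a valid C
identifier fragment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libguile): write the module init
   function name for MODULE, e.g. "math bessel", into OUT.
   Returns 0 on success, -1 if OUT is too small. */
int module_init_name(const char *module, char *out, size_t out_size)
{
    assert(module != NULL && out != NULL);

    /* `scm_init_' + module name + `_module' */
    int n = snprintf(out, out_size, "scm_init_%s_module", module);
    if (n < 0 || (size_t) n >= out_size)
        return -1;

    /* Join the hierarchical components with underscores. */
    for (char *p = out; *p != '\0'; p++)
        if (*p == ' ')
            *p = '_';
    return 0;
}
```

For the module "math bessel" this produces
scm_init_math_bessel_module, matching the function in the quoted
bessel.c.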

>    Presently there's no convention for having a Guile version number in
> module C code filenames or directories.  This is primarily because
> there's no established principles for two versions of Guile to be
> installed under the same prefix (eg. two both under `/usr').  Assuming
> upward compatibility is maintained then this should be unnecessary, and
> if compatibility is not maintained then it's highly likely a package
> will need to be revisited anyway.
>
>    The present suggestion is that modules should assume when they're
> installed under a particular `prefix' that there's a single version of
> Guile there, and the `guile-config' at build time has the necessary
> information about it.  C code or Scheme code might adapt itself
> accordingly (allowing for features not available in an older version
> for instance).

I guess this also needs updating, for the new parallel install vision.

> 0.1.5 Foreign Pointers
> ----------------------
>
> The previous sections have shown how Guile can be extended at runtime by
> loading compiled C extensions. This approach is all well and good, but
> wouldn't it be nice if we didn't have to write any C at all? This
> section takes up the problem of accessing C values from Scheme, and the
> next discusses C functions.
>
> 0.1.5.1 Foreign Types
> .....................
>
> The first impedance mismatch that one sees between C and Scheme is that
> in C, the storage locations (variables) are typed, but in Scheme types
> are associated with values, not variables. *Note Values and Variables::.

Fine, but...

>    So when accessing a C value through a Scheme pointer, we must give
> the type of the pointed-to value explicitly, as a parameter to any
> Scheme procedure that accesses the value.

This confused me at first.  I think I understand the point now, but I
have two reservations:

- isn't it actually much more to do with the ELF binary format, rather
  than with C?  If libguile could read and parse C, it would be able to
  infer the type of any variable that the Scheme layer might request.
  The problem is precisely that what we are linking with is *not* C
  anymore...  It's just untyped pointers.

- I think "give the type ... as a parameter to any Scheme procedure that
  accesses the value" is misleading, because we don't do that!  Rather,
  we construct a box that includes both the pointer and the type, and
  then pass the box around.
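
To illustrate the second point, here is a minimal C sketch of such a
box.  All names (typed_box, box_ref, box_set, the BOX_* tags) are
hypothetical and chosen for illustration; this is not how libguile
represents foreign objects, and only two types are covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type tags; the real set lives in (system foreign). */
enum box_type { BOX_INT32, BOX_DOUBLE };

/* The box: an address plus the type we claim is stored there. */
struct typed_box {
    void *address;
    enum box_type type;
};

/* Read *address according to the recorded type, widened to double. */
double box_ref(struct typed_box b)
{
    switch (b.type) {
    case BOX_INT32:  return (double) *(int32_t *) b.address;
    case BOX_DOUBLE: return *(double *) b.address;
    }
    assert(!"unknown box type");
    return 0.0;
}

/* Store through the pointer (*p = value); the box itself is unchanged. */
void box_set(struct typed_box b, double value)
{
    switch (b.type) {
    case BOX_INT32:  *(int32_t *) b.address = (int32_t) value; return;
    case BOX_DOUBLE: *(double *) b.address = value; return;
    }
    assert(!"unknown box type");
}
```

The type is supplied once, when the box is built, and every later
access reads it from the box rather than taking it as a parameter.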

> 0.1.5.2 Foreign Variables
> .........................
>
> Given the types defined in the previous section, C pointers may be
> looked up dynamically using `dynamic-pointer'.
>
>  -- Scheme Procedure: dynamic-pointer name type dobj [len]
>  -- C Function: scm_dynamic_pointer (name, type, dobj, len)
>      Return a "handle" for the pointer NAME in the shared object
>      referred to by DOBJ. The handle aliases a C value, and is declared
>      to be of type TYPE. Valid types are defined in the `(system
>      foreign)' module.
>
>      This facility works by asking the dynamic linker for the address
>      of a symbol, then assuming that it aliases a value of a given
>      type. Obviously, the user must be very careful to ensure that the
>      value actually is of the declared type, or bad things will happen.
>
>      Regardless whether your C compiler prepends an underscore `_' to
>      the global names in a program, you should *not* include this
>      underscore in NAME since it will be added automatically when
>      necessary.
>
>    For example, currently Guile has a variable, `scm_numptob', as part
> of its API. It is declared as a C `long'. So, to create a handle
> pointing to that foreign value, we do:
>
>      (use-modules (system foreign))
>      (define numptob (dynamic-pointer "scm_numptob" long (dynamic-link)))
>      numptob
>      => #<foreign int32 8>
>
>    A value returned by `dynamic-pointer' is a Scheme wrapper for a C
> pointer, with additional type information. A foreign pointer prints
> according to its type. This example showed that a `long' on this
> platform is an `int32', and that the value pointed to by `numptob' is 8.

I think the terminology is confusing here in two ways.

1. The API and the doc call these objects pointers, but because of the
automatic dereference they don't behave like pointers at all.  (Their
print function prints *p, not p, and foreign-set! does *p = val, not p =
val.)

I think that "reference" might be a less surprising name - as in C++
references, and "call by reference".

2. An object created by '(dynamic-pointer ...)' prints as '#<foreign
...>'.  If you think that foreign is the best word for this whole
area (and I think it's fine), I think you should bite the bullet and
make all the APIs say 'foreign' instead of 'dynamic'.  (And obviously
keep the 'dynamic' names of 1.8.x APIs as aliases.)

> 0.1.5.3 Void Pointers and Byte Access
> .....................................
>
> As a special case, a dynamic pointer may be declared to point to type
> `void', in which case it is treated as a void pointer. A void pointer
> prints its value as a pointer, without dereferencing the pointer.
>
>    It's important at this point to conceptually separate foreign values
> from foreign pointers. `dynamic-pointer' gives you a foreign pointer. A
> foreign value is the semantic meaning of the bytes pointed to by a
> pointer. Only foreign pointers may be wrapped in Scheme. One may make a
> pointer to a foreign value, and wrap that as a Scheme object, but a
> bare foreign value may not be wrapped.

I'm not getting the distinction here at all.  Is it important for what
follows?

>    When you call `dynamic-pointer', the TYPE argument indicates the
> type to which the given symbol points, but sometimes you don't know
> that type. Sometimes you have a pointer, and you don't know what kind of
> object it references. It's simply a pointer out into the ether, into the
> `void'.
>
>    Guile can wrap such a pointer, by declaring that it points to `void'.
>
>  -- Scheme Variable: void
>      A foreign type value representing nothing.
>
>      `void' has two uses: for a foreign pointer, declaring it to be of
>      type `void' is like having a `void*' in C. For a function, a
>      return type of `void' indicates that the function returns no
>      values. A function argument type of `void' is invalid.

This is fine.

>    As an example, `(dynamic-pointer "foo" void bar-lib)' links in the
> FOO symbol in the BAR-LIB library as a pointer to `void': a `void*'.
>
>    Void pointers may be accessed as bytevectors.
>
>  -- Scheme Procedure: foreign->bytevector foreign [uvec_type [offset
>           [len]]]
>  -- C Function: scm_foreign_to_bytevector foreign uvec_type offset len
>      Return a bytevector aliasing the memory pointed to by FOREIGN.
>
>      FOREIGN must be a void pointer, a foreign whose type is VOID. By
>      default, the resulting bytevector will alias all of the memory
>      pointed to by FOREIGN, from beginning to end, treated as a `vu8'
>      array.

It feels like we're missing a unification trick here.

Thought #1: if we have, e.g., an int8 pointer ip, why not just use
(foreign-ref ip n) to interpret the pointer as pointing to an array, and
get its nth element?

Thought #2: but if we do that we'll be duplicating the bytevector API.
So instead, shouldn't the fundamental operation be (foreign->bytevector
NAME TYPE LIBRARY [LEN]), and get/set then done using the bytevector
API?

I'm not sure either of those thoughts is right, but the current API
doesn't feel as elegant as I think it could be.
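
As a rough C analogy for the two thoughts, and nothing more than that:
the first function indexes a typed pointer directly, while the rest
exposes only a byte view and layers typed access on top of it.  The
names int8_ref, byte_view, view_of and view_s8_ref are hypothetical
and do not correspond to any Guile API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Thought #1: treat a typed pointer as an array and index it directly. */
int8_t int8_ref(const int8_t *p, size_t n)
{
    return p[n];
}

/* Thought #2: expose raw bytes only; typed access is layered on top. */
struct byte_view {
    unsigned char *bytes;   /* aliases the foreign memory, no copy */
    size_t length;
};

struct byte_view view_of(void *address, size_t length)
{
    struct byte_view v = { address, length };
    return v;
}

/* In the spirit of a bytevector accessor: read an int8 at a byte offset. */
int8_t view_s8_ref(struct byte_view v, size_t offset)
{
    assert(offset < v.length);
    int8_t value;
    memcpy(&value, v.bytes + offset, sizeof value);
    return value;
}
```

Both routes read the same memory; the second keeps a single set of
accessors at the cost of an explicit length.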



