guile-devel

Re: strings, symbols, vectors, etc.


From: Michael Livshin
Subject: Re: strings, symbols, vectors, etc.
Date: 28 Sep 2000 15:38:40 +0200
User-agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (20 Minutes to Nikko)

Dirk Herrmann <address@hidden> writes:

> I'm trying to separate handling of the different types in guile that
> currently share the SCM_LENGTH macro.  There are a couple of questions
> with regards to this:
> 
> * The different types that use SCM_LENGTH up to now are strings, symbols,
>   normal scheme vectors, weak scheme vectors (including the weak hash
>   tables), continuations, all unified vectors (including bitvectors), and
>   maybe others.
> 
>   Which of these types should be handled separately by giving them their
>   own means to determine their length?  In other words, for which of these
>   types is it likely that we might want to change the type layout in a
>   way that SCM_LENGTH would have to be changed?
> 
>   My suggestion is, at least to define a SCM_<type>_LENGTH macro for
>   symbols, strings, continuations and everything else that is not a
>   vector.  Vectors and weak vectors could use SCM_VECTOR_LENGTH.  All the
>   uniform vector types could use SCM_UVECTOR_LENGTH.  However, I think
>   that it might make sense to also treat bitvectors differently, since
>   their handling varies from the other uniform vector types.

most likely, the layout will change for strings/symbols (but you know
more than me here) and for uniform vectors.

continuations, compiled closures etc. will probably also be represented
differently when GOOPS is merged (hmmm, we seem to have an emerging
candidate for the title of the son of Godot here.  oh well, at least
now it's partially my own fault), so I'm not sure whether looking at
them as sequences will be useful at all.
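
a rough C sketch of what per-type length accessors could look like.
the structs and all toy_* names are made up for illustration and are
not guile's actual object layout; the only point is that each type
gets its own accessor, so one layout can change without touching the
callers of the others:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts, NOT Guile's real representation. */
struct toy_string {
    size_t len;            /* length stored directly, in characters */
    const char *chars;
};

struct toy_uvector {
    size_t byte_len;       /* total size of the data in bytes */
    size_t elem_size;      /* size of one element in bytes */
    const void *data;
};

/* Each type answers "how long are you?" in its own way. */
static inline size_t toy_string_length(const struct toy_string *s)
{
    assert(s != NULL);
    return s->len;
}

static inline size_t toy_uvector_length(const struct toy_uvector *v)
{
    assert(v != NULL && v->elem_size > 0);
    return v->byte_len / v->elem_size;
}
```

if the string layout later changes, only toy_string_length has to
follow; code that asks a uniform vector for its length is unaffected.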

> * The function scm_vector_set_length_x is called from a couple of places
>   for different types, for example strings and vectors.  I'd like to have
>   this function really work on vectors only.  A corresponding
>   scm_string_set_length_x might be provided, but it does not seem to be
>   necessary, since it seems to me that in most places where
>   scm_vector_set_length_x is called for strings, creating a fresh
>   sub/superstring with copying should work fine.

wasn't scm_vector_set_length_x disabled or something?  it's certainly
not available from Scheme.  I think it's good, and we shouldn't
introduce more things like that unless absolutely necessary.
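
for the string case, the "fresh sub/superstring with copying" idea
might be sketched like this in plain C.  copy_resized is a
hypothetical helper on raw char buffers, not an existing guile
function, and the fill character for the grown part is an assumption
of the sketch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated, NUL-terminated buffer holding the first
   new_len bytes of src.  If new_len exceeds old_len, the extra bytes
   are set to fill.  The original buffer is never modified.
   Returns NULL if allocation fails; the caller frees the result. */
static inline char *copy_resized(const char *src, size_t old_len,
                                 size_t new_len, char fill)
{
    assert(src != NULL);
    char *dst = malloc(new_len + 1);
    if (dst == NULL)
        return NULL;
    size_t keep = old_len < new_len ? old_len : new_len;
    memcpy(dst, src, keep);                     /* shared prefix */
    memset(dst + keep, fill, new_len - keep);   /* padding, if growing */
    dst[new_len] = '\0';
    return dst;
}
```

the caller gets a new object of the wanted length and the old one
stays intact, so nothing needs to resize a string in place.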

> * There are a lot of functions where there is no clean definition about
>   which input type (string/substring/symbol) should be accepted.  Among
>   these are:  scm_dirname, scm_basename, gh_scm2newstr, gh_get_substr,
>   scm_putenv, scm_regexp_exec, setzone, scm_string_to_symbol (yes!).
> 
>   The unclarity comes in these cases from the fact that the attribute
>   macro SCM_ROSTRINGP is used for type checks.  However, symbols are also
>   accepted from SCM_ROSTRINGP.
> 
>   The corresponding comments typically speak of input _strings_.  My
>   assumption is, that these functions are intended to work on strings, and
>   that it is an unknown/unwanted implementation detail of SCM_ROSTRINGP to
>   also accept symbols.  Whatever the reason may be, I suggest to make
>   these functions accept strings only.

agreed!
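
to make the difference concrete, here is a toy C model of the two
kinds of check.  the enum and both predicates are hypothetical
stand-ins, not the real SCM_ROSTRINGP or any other guile macro, and
counting substrings as strings in the strict check is an assumption:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type tags for the sketch. */
enum str_kind { KIND_STRING, KIND_SUBSTRING, KIND_SYMBOL, KIND_OTHER };

/* Permissive check: models a predicate that also lets symbols through. */
static inline bool permissive_stringp(enum str_kind k)
{
    return k == KIND_STRING || k == KIND_SUBSTRING || k == KIND_SYMBOL;
}

/* Strict check: strings (and, by assumption, substrings) only. */
static inline bool strict_stringp(enum str_kind k)
{
    return k == KIND_STRING || k == KIND_SUBSTRING;
}

/* Argument check a string-only function could perform on entry. */
static inline void require_string(enum str_kind k)
{
    assert(strict_stringp(k));
}
```

a function documented as taking a string would then use the strict
check, and a symbol argument is rejected instead of silently accepted.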

> * The code in ramap.[ch] and unif.[ch] merges a couple of concepts that
>   might better be handled separately.  For example, the attempt to treat
>   bitvectors in common with the other uniform vector types complicates the
>   code a lot.  Further, strings, vectors and weak vectors are also handled
>   in many operations together with the uniform vector types.
> 
>   It would mean a great simplification to factor out the handling of
>   bitvectors from unif.c, and also not to handle strings as vectors any
>   more.  Maybe for those functions that explicitly use uniform vectors it
>   would also make sense to drop the support for vectors and weak
>   vectors.

this is an interesting issue.

* regular vectors.

  I think they are properly uniform vectors -- they are
  randomly-addressable sequences.  ditto for weak vectors.

* bit vectors.

  bit vectors certainly are randomly-addressable sequences, and in
  light of this it seems (to me at least) natural to consider them a
  kind of uniform vectors.

  the current switch-heavy implementation of generic uniform vector
  operations does look ugly, though.  I think we will be able to do
  better when the son of Godot comes.

* strings (and symbols).

  strings are not necessarily randomly-addressable sequences.  if/when
  the plans for internationalization of Guile string handling are
  carried through, Guile strings will use a multibyte encoding, so
  vector-like set/ref operations won't be very practical on them.  so
  I think they should *not* be handled the way uniform vectors are.
  there is also no need for that, as a separate byte vector type
  exists (does it?).
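
to illustrate the multibyte point: with a variable-width encoding,
finding the n-th character means scanning from the start.  the sketch
below assumes UTF-8 purely as an example (nothing above says which
encoding would be chosen) and assumes the input is valid;
utf8_char_offset is a made-up name:

```c
#include <assert.h>
#include <stddef.h>

/* Return the byte offset of the character with index n (0-based) in
   the NUL-terminated, valid UTF-8 string s, or (size_t)-1 if the
   string has n or fewer characters.  Cost is linear in the offset,
   unlike indexing into a vector. */
static inline size_t utf8_char_offset(const char *s, size_t n)
{
    assert(s != NULL);
    size_t seen = 0;   /* characters whose first byte we have passed */
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Bytes of the form 10xxxxxx continue the previous character;
           every other byte starts a new one. */
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            if (seen == n)
                return i;
            seen++;
        }
    }
    return (size_t)-1;
}
```

for the four-byte string "a\xC3\xA9z" (a, e-acute, z) the third
character starts at byte offset 3, not 2, and a set operation that
replaces a one-byte character with a two-byte one would have to move
everything after it.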

-- 
All ITS machines now have hardware for a new machine instruction --
SPO
Skip if Power Off.
Please update your programs.



