strings, symbols, vectors, etc.
From: Dirk Herrmann
Subject: strings, symbols, vectors, etc.
Date: Thu, 28 Sep 2000 15:07:35 +0200 (MEST)
Hello!
I'm trying to separate the handling of the different types in guile that
currently share the SCM_LENGTH macro. This raises several questions:
* The different types that use SCM_LENGTH up to now are strings, symbols,
normal scheme vectors, weak scheme vectors (including the weak hash
tables), continuations, all unified vectors (including bitvectors), and
maybe others.
Which of these types should be handled separately by giving them their
own means to determine their length? In other words, for which of these
types is it likely that we might want to change the type layout in a
way that SCM_LENGTH would have to be changed?
My suggestion is to define, at the very least, a SCM_<type>_LENGTH macro
for symbols, strings, continuations and everything else that is not a
vector. Vectors and weak vectors could use SCM_VECTOR_LENGTH. All the
uniform vector types could use SCM_UVECTOR_LENGTH. However, I think
that it might make sense to also treat bitvectors separately, since
their handling differs from that of the other uniform vector types.
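
To illustrate the suggestion, here is a minimal sketch of per-type length
accessors. It uses a toy object model: the TOY_ names, the tag enum and the
struct layout are all made up for this example and are not guile's actual
cell representation. The point is only that each type family gets its own
accessor, so one layout can change without touching the others.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object model for illustration only; NOT guile's real layout. */
enum toy_tag {
    TOY_STRING, TOY_SYMBOL,
    TOY_VECTOR, TOY_WEAK_VECTOR,
    TOY_UVECTOR, TOY_BITVECTOR
};

struct toy_obj {
    enum toy_tag tag;
    size_t length;      /* element count; bit count for bitvectors */
};

/* One accessor per type family. Each checks the tag (in debug builds)
   before reading the length, so misuse is caught early. */
#define TOY_STRING_LENGTH(x) \
    (assert((x)->tag == TOY_STRING), (x)->length)
#define TOY_SYMBOL_LENGTH(x) \
    (assert((x)->tag == TOY_SYMBOL), (x)->length)
/* Vectors and weak vectors share one accessor, as proposed above. */
#define TOY_VECTOR_LENGTH(x) \
    (assert((x)->tag == TOY_VECTOR || (x)->tag == TOY_WEAK_VECTOR), \
     (x)->length)
#define TOY_UVECTOR_LENGTH(x) \
    (assert((x)->tag == TOY_UVECTOR), (x)->length)
/* Bitvectors get their own accessor so their layout can diverge. */
#define TOY_BITVECTOR_LENGTH(x) \
    (assert((x)->tag == TOY_BITVECTOR), (x)->length)
```

With this split, changing how symbols store their length would only affect
TOY_SYMBOL_LENGTH and its callers.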
* The function scm_vector_set_length_x is called from a couple of places
for different types, for example strings and vectors. I'd like to have
this function really work on vectors only. A corresponding
scm_string_set_length_x might be provided, but it does not seem
necessary: in most places where scm_vector_set_length_x is called for
strings, creating a fresh sub/superstring with copying should work fine.
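
As a rough sketch of the "fresh sub/superstring with copying" idea, the
helper below works on plain C buffers rather than on guile objects.
toy_string_resized_copy is a hypothetical name, and the choice to pad a
longer copy with zero bytes is an assumption made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a guile function: return a freshly allocated
   copy of src that is exactly new_len bytes long (plus a terminating
   zero byte). A shorter copy is truncated; a longer one is zero-padded.
   The caller owns the result and must free() it. Returns NULL if the
   allocation fails. */
char *toy_string_resized_copy(const char *src, size_t src_len,
                              size_t new_len)
{
    size_t n = src_len < new_len ? src_len : new_len;
    char *dst;

    assert(src != NULL);
    dst = calloc(new_len + 1, 1);   /* zero-filled: padding + terminator */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, n);
    return dst;
}
```

The original buffer is never modified, so no caller can observe a string
whose length changes underneath it.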
* There are a lot of functions with no clear definition of which input
type (string/substring/symbol) should be accepted. Among
these are: scm_dirname, scm_basename, gh_scm2newstr, gh_get_substr,
scm_putenv, scm_regexp_exec, setzone, scm_string_to_symbol (yes!).
The ambiguity in these cases arises because the macro SCM_ROSTRINGP is
used for the type checks, and SCM_ROSTRINGP also accepts symbols.
The corresponding comments typically speak of input _strings_. My
assumption is that these functions are intended to work on strings, and
that it is an unknown or unwanted implementation detail of SCM_ROSTRINGP
that it also accepts symbols. Whatever the reason may be, I suggest
making these functions accept strings only.
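
The difference between the two kinds of check can be sketched with toy
predicates. The tag enum and both function names are invented for this
example and are not guile's definitions; in particular, treating
substrings as acceptable to the strict check is an assumption, since
that question is left open above.

```c
#include <stdbool.h>

/* Toy type tags for illustration only; NOT guile's type codes. */
enum toy_tag { TOY_STRING, TOY_SUBSTRING, TOY_SYMBOL, TOY_VECTOR };

/* Permissive check in the spirit of SCM_ROSTRINGP: anything that can be
   read as a character sequence passes, including symbols. */
bool toy_rostringp(enum toy_tag tag)
{
    return tag == TOY_STRING || tag == TOY_SUBSTRING || tag == TOY_SYMBOL;
}

/* Strict check: symbols are rejected. Accepting substrings here is an
   assumption of this sketch. */
bool toy_stringp(enum toy_tag tag)
{
    return tag == TOY_STRING || tag == TOY_SUBSTRING;
}
```

A function documented as taking a string would use the strict predicate,
so that passing a symbol becomes a type error instead of being silently
accepted.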
* The code in ramap.[ch] and unif.[ch] merges a couple of concepts that
might better be handled separately. For example, the attempt to treat
bitvectors in common with the other uniform vector types complicates the
code a lot. Further, strings, vectors and weak vectors are also handled
in many operations together with the uniform vector types.
It would be a great simplification to factor the handling of bitvectors
out of unif.c, and also to stop handling strings as vectors. For those
functions that explicitly use uniform vectors, it might also make sense
to drop the support for vectors and weak vectors.
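
One reason bitvectors fit poorly with the other uniform vector types can
be shown with two toy element accessors. Both functions are hypothetical
and operate on plain C arrays; the bit packing used here (least
significant bit first within each unsigned long) is an assumption of the
sketch, not a description of unif.c.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Bitvector access: find the word holding bit i, then shift and mask.
   Returns 0 or 1. */
int toy_bitvector_ref(const unsigned long *words, size_t i)
{
    unsigned long word = words[i / TOY_BITS_PER_WORD];
    return (int)((word >> (i % TOY_BITS_PER_WORD)) & 1UL);
}

/* Access for a uniform vector of doubles: a plain array index. */
double toy_dvector_ref(const double *elts, size_t i)
{
    return elts[i];
}
```

Every other uniform type follows the second pattern with a different
element type, while the bit case needs its own index arithmetic, which
is why keeping it in a separate code path tends to remove special cases
from the shared code.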
Best regards
Dirk