guile-devel

Re: about strings, symbols and chars.


From: Gary Houston
Subject: Re: about strings, symbols and chars.
Date: 29 Nov 2000 18:10:35 -0000

> From: Dirk Herrmann <address@hidden>
> Date: Wed, 29 Nov 2000 12:27:23 +0100 (MET)
> 
> On Tue, 28 Nov 2000, Dirk Herrmann wrote:
> 
> > I'd like to get rid of the SCM_STRING_UCHARS macro and clean up the
> > handling of characters and strings with respect to signedness.  In other
> > words, it should be clearly defined what kind of characters are to be
> > found in a scheme string object.
> 
> Surprisingly, changing SCM_STRING_CHARS to always return an unsigned char*
> doesn't seem to have any effect at all.  Not a single additional compiler
> warning message.  Hmmm.  Why is that?  As far as I know, it is not
> specified whether a char is a signed or an unsigned value.  Thus, a char*
> could potentially be a pointer to a signed char or an unsigned
> char.  Assigning these pointer types to each other should at least cause a
> compiler warning, shouldn't it?

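In C++ the compiler typically refuses to convert between char * and
unsigned char * without a cast.  In C the two pointer types are also
formally incompatible, but many compilers stay quiet about it unless
asked to be pedantic, so in practice it rarely matters which kind of
char * you use.  I think the only time it makes a real difference is
when converting a char to an int.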

If you have:

char a = -1;
unsigned char b = a;

a and b have the same bit pattern (on the usual two's complement
machines), so they still represent the same character.
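
To make that concrete, here is a small C sketch.  The helper names are
made up for this example and are not part of Guile.  It assumes a two's
complement machine; the stored byte is the same either way, but the
value you get when widening to int depends on whether you go through
plain char or unsigned char.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: 1 if the two objects hold the same byte in memory. */
int same_bit_pattern(char a, unsigned char b)
{
    return memcmp(&a, &b, 1) == 0;
}

/* Widening through unsigned char always yields 0..UCHAR_MAX. */
int widen_unsigned(char c)
{
    return (int)(unsigned char)c;
}

/* Widening plain char may yield a negative int where char is signed. */
int widen_plain(char c)
{
    return (int)c;
}
```

With char a = -1, widen_unsigned(a) is UCHAR_MAX on every platform,
while widen_plain(a) is -1 where plain char is signed and UCHAR_MAX
where it is unsigned.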

Two approaches that Guile could take are:

1) For the macros that give pointers to chars in an SCM object: provide
char * and unsigned char * versions.  This is probably best for people
who have to deal with C++ and need the right pointer for whatever they
are going to do with the characters.  I suppose for completeness there
should be signed char * versions too, but I guess they wouldn't be
used very often.

For functions, do what the C standard library does and use char *.
People linking Guile into other programs are not likely to find this
surprising.

Care is needed when converting char to int.
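
A typical place where this bites is the <ctype.h> family, whose
argument must be EOF or a value representable as unsigned char.  The
following is only a sketch (count_alpha is a hypothetical helper, not
something in Guile) of the usual defensive cast:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the alphabetic bytes in a C string. */
size_t count_alpha(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Cast to unsigned char first: passing a negative char value
           straight to isalpha() is undefined behaviour. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

Without the cast, a byte such as 0xff in a string held as plain char
would reach isalpha() as a negative int on platforms where char is
signed.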

2) Use unsigned chars throughout.  This may reduce char->int problems
but may be painful in C++ since you must cast pointers before you can
use strlen etc.  In this case there probably should be a
SCM_STRING_UCHARS macro and no SCM_STRING_CHARS, to avoid confusion.
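
As a rough illustration of the trade-off, here is a toy version in C.
The struct and the TOY_ macros are invented for this sketch and do not
reflect how Guile actually lays out strings or defines its macros.

```c
#include <string.h>

/* Toy stand-in for a string object; NOT Guile's real representation. */
struct toy_string {
    size_t len;
    char *chars;
};

/* Approach 1: offer both pointer flavours and let the caller pick. */
#define TOY_STRING_CHARS(s)  ((s)->chars)
#define TOY_STRING_UCHARS(s) ((unsigned char *) (s)->chars)

/* Approach 2 would expose only the unsigned flavour, so each call into
   the standard library needs a cast back to char *. */
size_t toy_strlen_via_uchars(const struct toy_string *s)
{
    const unsigned char *u = TOY_STRING_UCHARS(s);

    return strlen((const char *) u);
}
```

In C++ the cast inside toy_strlen_via_uchars is mandatory; in C it
mostly serves to keep pedantic compilers quiet.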

It seems to me that 1) is more or less what we are doing now, and it isn't
clear whether switching to 2) would be a good thing.


