
Re: [Chicken-users] UTF-8 support


From: John Cowan
Subject: Re: [Chicken-users] UTF-8 support
Date: Thu, 13 Dec 2007 12:32:42 -0500
User-agent: Mutt/1.5.13 (2006-08-11)

Tobia Conforto scripsit:

> As far as I could grasp they modify the behaviour of scheme primitives,
> string functions and other eggs (such as regex), so that they operate on
> UTF-8 strings internally and are fully Unicode-aware.

This is by no means true.  If you load the utf8 egg into the interpreter,
it rebinds various R5RS and Chicken-specific procedures to versions which
treat all strings as UTF-8 encoded; if the strings are in fact not UTF-8,
the results will be wrong.  The same rule applies to compiled code that
references the utf8 egg.
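
To make the "results will be wrong" point concrete, here is a small sketch
in Python rather than Chicken. It is only an analogy: the byte string and
variable names are made up, and Python's decoder with errors="replace"
stands in for any routine that assumes UTF-8 input. It does not reproduce
what the utf8 egg itself returns.

```python
# Python analogy, not Chicken code: bytes that are Latin-1, read as UTF-8.
latin1_bytes = "caf\u00e9".encode("latin-1")   # b"caf\xe9", 4 bytes

# Correct interpretation: decode with the encoding the data really uses.
as_latin1 = latin1_bytes.decode("latin-1")

# Wrong interpretation: 0xE9 starts a multi-byte UTF-8 sequence that never
# completes, so the decoder substitutes U+FFFD for it.
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(repr(as_latin1), repr(as_utf8))
```

The last character survives only when the bytes are read with the encoding
they were actually written in.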

However, compiled code that did not use the utf8 egg at compile time will
not be affected.  So if you load first the utf8 egg and then another egg
into the interpreter, and a procedure in the second egg is called with
a UTF-8 string, a call to string-length in that procedure will return a
byte length, not a character length.  The same is true if the same egg
is invoked from compiled code.  Only a few eggs have either optional or
mandatory support for UTF-8.
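
The byte-length versus character-length difference can be sketched as
follows. This is Python, used purely as an analogy: len() on a bytes
object plays the role of a string-length that was compiled without the
utf8 egg, and len() on a str plays the role of the rebound, UTF-8-aware
version. The sample text is an arbitrary choice.

```python
# Python analogy, not Chicken code: the same UTF-8 data measured two ways.
text = "na\u00efve \u03bb"           # 7 characters, two of them non-ASCII
encoded = text.encode("utf-8")        # the bytes a UTF-8 string actually holds

byte_length = len(encoded)   # what a byte-oriented string-length would report
char_length = len(text)      # what a UTF-8-aware string-length would report

print(byte_length, char_length)       # prints: 9 7
```

Each of the two non-ASCII characters occupies two bytes in UTF-8, which
accounts for the difference of two between the counts.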

Character objects for the full Unicode range are always available, as they
are part of the Chicken core; so are \u escapes in the Chicken reader.

> Also, does anybody know whether and how is the 4th parameter of (regexp)
> to be used?

Since the regex egg is compiled and does not know whether you have the
utf8 egg loaded or not, the parameter tells it whether or not to use its
own UTF-8 support, which is inherited from the underlying PCRE library.
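
What such a switch changes can be sketched with Python's re module, again
only as an analogy. A bytes pattern behaves like a regex engine with UTF-8
support turned off, and a str pattern like one with it turned on. Nothing
here is the regex egg's or PCRE's actual API, and the sample word is
arbitrary.

```python
import re

# Python analogy, not the regex egg: "." with and without character awareness.
data = "\u00e9t\u00e9".encode("utf-8")   # three characters, five bytes in UTF-8

# Byte-oriented matching: "." consumes a single byte, so it stops in the
# middle of the first two-byte character.
byte_match = re.match(rb".", data).group(0)

# Character-oriented matching: "." consumes one whole character.
char_match = re.match(r".", data.decode("utf-8")).group(0)

print(repr(byte_match), repr(char_match))
```

With byte-oriented matching the first "." yields half of a character; with
character-oriented matching it yields the character itself.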

-- 
John Cowan    address@hidden    http://ccil.org/~cowan
The present impossibility of giving a scientific explanation is no proof
that there is no scientific explanation. The unexplained is not to be
identified with the unexplainable, and the strange and extraordinary
nature of a fact is not a justification for attributing it to powers
above nature.  --The Catholic Encyclopedia, s.v. "telepathy" (1913)