gcl-devel
Re: [Gcl-devel] utf8 and emacs text/string multibyte representation


From: Bruce-Robert Fenn Pocock
Subject: Re: [Gcl-devel] utf8 and emacs text/string multibyte representation
Date: Sun, 2 Nov 2014 10:31:00 -0500

I'd just like to +1 the "incompatible" point. I'd hate to see #'code-char merely coerce a number to a byte rather than return an actual character.

On Nov 1, 2014 4:43 PM, "Raymond Toy" <address@hidden> wrote:
>>>>> "Camm" == Camm Maguire <address@hidden> writes:

    Camm> Greetings, and thanks so much!  I think we are converging...

    Camm> 1) The proposal under consideration is due to Carl, that gcl's lisp
    Camm> character still be governed by char-code-limit==256, i.e. equivalent to
    Camm> a uint8_t.  aref/aset work the same for all types of arrays.  This lisp
    Camm> character has no correspondence to a unicode character other than the
    Camm> overlap in the ascii range.  In some fashion, gcl would then provide on
    Camm> top of these primitives (unichar s i), etc. to get unicodes from utf8
    Camm> encoded strings.  These are not random access, but can be cached. So
    Camm> (code-char #xa0) != no-break-space.

Have you considered the cost of making gcl really rather incompatible
with other CLs?

Having (code-char #xa0) not be no-break-space is going to have to be
explained to users.  I suspect malformed strings will be somewhat
common when someone accidentally stores a code unit >= 128 into a
string.
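
To make the malformed-string concern concrete, here is a minimal sketch in Python (not GCL code) that models a string under the proposal as a sequence of 8-bit codes. The helper name `is_well_formed_utf8` is hypothetical. It assumes, as the proposal describes, that strings are meant to hold UTF-8 and that storing a character with code #xa0 writes the single octet 0xA0.

```python
# A "string" under the proposal is modeled as a sequence of 8-bit codes.
NBSP = "\u00a0"                      # NO-BREAK SPACE, code point U+00A0
utf8_nbsp = NBSP.encode("utf-8")     # two octets in UTF-8: C2 A0


def is_well_formed_utf8(octets: bytes) -> bool:
    """Return True if the octets decode as UTF-8 without error."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Storing the single code 0xA0 into the middle of an ASCII string,
# which is roughly what (setf (aref s 1) (code-char #xa0)) would do
# if a character is just a uint8_t.
s = bytearray(b"a b")
s[1] = 0xA0

print(is_well_formed_utf8(bytes(s)))                  # lone 0xA0 is a stray continuation octet
print(is_well_formed_utf8(b"a" + utf8_nbsp + b"b"))   # properly encoded no-break space
```

The lone octet 0xA0 is a continuation octet with no lead octet before it, so the result is not valid UTF-8, while the two-octet sequence C2 A0 is.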

And why complicate things with a cache? What was fairly simple now
depends on having a fast bug-free cache implementation.
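
For context on why a cache comes up at all: without one, fetching the i-th code point from UTF-8 octets takes a linear scan from the start. The sketch below is in Python and is only an illustration under stated assumptions. The name `unichar` is borrowed from the quoted proposal, but its signature and behavior here are assumptions, as is the helper `seq_len`; neither reflects any actual GCL implementation.

```python
def seq_len(lead: int) -> int:
    """Length in octets of a UTF-8 sequence, judged from its lead octet."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError(f"invalid UTF-8 lead octet: {lead:#x}")


def unichar(octets: bytes, i: int) -> int:
    """Return the i-th code point (0-based) by scanning from the start.

    Cost grows with i, because each earlier sequence must be skipped.
    """
    if i < 0:
        raise IndexError("negative index")
    pos = 0
    for _ in range(i):
        if pos >= len(octets):
            raise IndexError("code point index out of range")
        pos += seq_len(octets[pos])
    if pos >= len(octets):
        raise IndexError("code point index out of range")
    end = pos + seq_len(octets[pos])
    # decode() validates the continuation octets of this one sequence
    return ord(octets[pos:end].decode("utf-8"))


s = "a\u00a0\u20acb".encode("utf-8")   # 1 + 2 + 3 + 1 = 7 octets, 4 code points
print(len(s), [hex(unichar(s, k)) for k in range(4)])
```

Octet index and code point index diverge after the first non-ASCII character, so any random-access interface on top of this needs either the scan above or some index structure such as the cache being questioned here.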

--
Ray


_______________________________________________
Gcl-devel mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/gcl-devel
