Re: Meta-Characters, Special Characters
From: Gernot Hassenpflug
Subject: Re: Meta-Characters, Special Characters
Date: Sun, 03 Jun 2007 00:39:28 +0900
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.0.95 (gnu/linux)
David Kastrup <dak@gnu.org> writes:
> Gernot Hassenpflug <gernot@yahoo.com> writes:
>
>> Miles Bader <miles@gnu.org> writes:
>>
>>> Gernot Hassenpflug <gernot@nict.go.jp> writes:
>>>> I am happy to note that Windows too stores its information in UTF-8
>>>> internally, no matter what the user's settings for a particular
>>>> program may be.
>>>
>>> I thought Windows used something a bit more annoying and ad-hoc, UCS-16
>>> or something like that.
>>
>> Oh, you may be right there, I should have qualified my statement: as
>> opposed to a Windows-specific charset I think Windows uses a
>> universal charset. I am not sure why UCS-16 is more ad-hoc than
>> UTF-8, but I would be more than happy if Linux instead of UTF-8
>> moved to UTF-16 or UTF-32, in view of the many charsets I need in my
>> work. I am not nearly educated enough on this topic to hold a
>> coherent conversation however, still reading.
>
> As soon as you leave the UTF-16 base plane, you need to deal with
> surrogate character pairs. The issues are pretty much the same as
> when dealing with UTF-8, and you get the additional complications of
> wide characters, far more conspicuous byte order marks, endianness
> portability problems and so on.
>
> In short: this buys you positively nothing unless you restrict
> yourself to the base 16-bit subset (which makes this infeasible for a
> number of tasks). And even then, the disadvantages are not really in
> a good balance with the advantages.
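
To make the points quoted above concrete, here is a minimal Python sketch
of how a character outside the 16-bit base plane turns into a surrogate
pair in UTF-16, and how byte order changes the bytes on disk. The helper
name utf16_code_units and the two sample characters are my own choices for
illustration and do not come from the thread.

```python
import codecs

def utf16_code_units(ch, byteorder="big"):
    """Return the 16-bit code units that UTF-16 uses for the string ch."""
    encoding = "utf-16-be" if byteorder == "big" else "utf-16-le"
    raw = ch.encode(encoding)
    # Read the encoded bytes back two at a time as unsigned 16-bit integers.
    return [int.from_bytes(raw[i:i + 2], byteorder) for i in range(0, len(raw), 2)]

bmp_char = "\u00e9"          # LATIN SMALL LETTER E WITH ACUTE, inside the base plane
astral_char = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the base plane

# Inside the base plane: one 16-bit unit in UTF-16, two bytes in UTF-8.
bmp_units = utf16_code_units(bmp_char)
bmp_utf8_len = len(bmp_char.encode("utf-8"))

# Outside the base plane: a surrogate pair in UTF-16, four bytes in UTF-8.
# Both encodings are variable-length here.
astral_units = utf16_code_units(astral_char)
astral_utf8_len = len(astral_char.encode("utf-8"))

# Byte order matters for UTF-16: the same text gives different byte sequences.
a_big = "A".encode("utf-16-be")
a_little = "A".encode("utf-16-le")

# The generic "utf-16" codec prepends a byte order mark in native byte order.
with_bom = "A".encode("utf-16")
has_bom = with_bom.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))

print([hex(u) for u in bmp_units], bmp_utf8_len)
print([hex(u) for u in astral_units], astral_utf8_len)
print(a_big, a_little, has_bom)
```

The G clef comes out as the two units 0xD834 0xDD1E, so code that assumes
one 16-bit unit per character breaks in the same way as code that assumes
one byte per character in UTF-8. The UTF-8 bytes, by contrast, are the same
on every machine, with no byte order to negotiate.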
Thanks for the explanation. In view of this, I assume at least some
experts are exploring the possibility of introducing 16-bit
bytes. Problems with legacy systems are probably insurmountable at
present though...
--
Grrr!! ...Pick a reason...