Re: coding systems and input methods are non-intuitive stuff
From: David Kastrup
Subject: Re: coding systems and input methods are non-intuitive stuff
Date: Tue, 30 Jan 2007 09:35:17 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)
Kevin Rodgers <address@hidden> writes:
> Juanma Barranquero wrote:
>> C-x b *scratch* RET
>> C-x RET f latin-1 RET ; buffer coding system = latin-1
>> C-u C-\ romanian-prefix RET ; input method: romanian-prefix
>> ,s ; character: ş (2362, #o4472, #x93a,
>> ; U+015F)
>> <left> M-x quail-show-key RET ; To input `ş', type ",s"
>> <right>
>> C-x RET f utf-8 RET ; buffer coding system = utf-8
>> ; input method: the same as before
>> ,s ; character: ş (331903, #o1210177,
>> ; #x5107f, U+015F)
>> <left> M-x quail-show-key RET ; ş can't be input by the current
>> ; input method
>>
>> Now, I understand that the buffer code for these characters is not the
>> same... but it is quite weird nonetheless to input a character with
>> the current input method, and afterwards be told that it "can't be
>> input by the current input method".
>
> The thing that confuses me is that the ISO 8859-1 character set (which
> is what the latin-1 coding system encodes, right?) only contains U+0000
> - U+00FF. So how does U+015F get inserted into a latin-1 buffer?
There is no such thing as a "latin-1 buffer" in Emacs. Buffers are
always encoded in emacs-mule (well, we still have something called
"unibyte" buffers, but they are really only for binary data). The
buffer contains characters. Code points of a particular coding system
come into play only when saving, loading, or communicating with
processes, X selections, networks, keyboards, and terminals. All those
operations have their own coding systems. The buffer itself has none.
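
As a rough analogy outside Emacs (a sketch in Python, not Emacs Lisp, and
not how Emacs implements this): a Python string also holds characters
rather than bytes, and a coding system matters only at the moment the text
is encoded for output. The helper name encode_or_none is made up for this
illustration.

```python
# Analogy only: a Python str, like an Emacs buffer, holds characters.
# A coding system is applied only when the text leaves the program.
text = "\u015f"  # LATIN SMALL LETTER S WITH CEDILLA (U+015F)


def encode_or_none(s, codec):
    """Hypothetical helper: return the encoded bytes, or None if the
    codec has no representation for some character in s."""
    try:
        return s.encode(codec)
    except UnicodeEncodeError:
        return None


# UTF-8 can represent U+015F as a two-byte sequence.
utf8_bytes = encode_or_none(text, "utf-8")

# ISO 8859-1 covers only U+0000 to U+00FF, so encoding fails here,
# even though holding the character in memory was never a problem.
latin1_bytes = encode_or_none(text, "latin-1")

print(len(text), utf8_bytes, latin1_bytes)
```

The character can sit in the string regardless of which encoding is chosen
later; the limitation of latin-1 shows up only at encoding time, which is
the step that corresponds to saving the buffer.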
--
David Kastrup