Re: find-coding-systems inconsistencies
From: Kenichi Handa
Subject: Re: find-coding-systems inconsistencies
Date: Thu, 4 Dec 2003 08:44:30 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
In article <address@hidden>, Jesper Harder <address@hidden> writes:
> Consider the following three similar ways of asking the same question:
> 1. (find-coding-systems-for-charsets (list (char-charset ?)))
> => (raw-text emacs-mule)
This result is wrong. find-coding-systems-for-charsets was written
before the utf-X coding systems were supported, and it should be fixed now.
> 2. (find-coding-systems-string "")
> => (undecided)
> 3. (with-temp-buffer
> (insert "")
> (find-coding-systems-region (point-min) (point-max)))
2 and 3 are different operations.
2a. (find-coding-systems-string (string-make-multibyte "\235"))
3a. (with-temp-buffer
      (insert "\235")
      (set-buffer-multibyte nil)
      (find-coding-systems-region (point-min) (point-max)))
2 and 3a are the same operation, and so are 3 and 2a.
The result of 3 (and 2a) depends on the language
environment. For instance, in the Bulgarian language environment, the
result includes windows-1251, among others. That is because
(string-make-multibyte "\235") returns a string that
contains a Cyrillic character in the charset
mule-unicode-0100-24ff.
> And is \235 really encodable by utf-8?
Yes, because Unicode explicitly contains C1-control
characters.
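
As a rough illustration of that point, here is a minimal sketch. It assumes a modern Emacs (23 or later) with the Unicode-based internal representation, not the Emacs 21.3 discussed in this thread, and the helper name my-utf8-bytes is hypothetical, introduced only for this example. U+009D is a C1 control character, and its UTF-8 encoding is the two bytes C2 9D.

```elisp
;; Hypothetical helper (not part of Emacs): return the UTF-8 encoding
;; of the character CHAR as a list of byte values.
(defun my-utf8-bytes (char)
  "Return the bytes of CHAR encoded in utf-8, as a list of integers."
  ;; `string' builds a one-character string; `encode-coding-string'
  ;; returns a unibyte string whose elements are the encoded bytes.
  (append (encode-coding-string (string char) 'utf-8) nil))

;; U+009D (octal 235) is a C1 control character.
;; Under the assumptions above this should evaluate to (194 157),
;; that is, the bytes #xC2 #x9D.
(my-utf8-bytes #x9d)
```

Note that this only shows that the code point U+009D has a UTF-8 encoding. How the raw byte \235 is mapped to a character in the first place is the language-environment-dependent step described above.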
---
Ken'ichi HANDA
address@hidden