Re: Turning HTML character references into something readable?
From: Reiner Steib
Subject: Re: Turning HTML character references into something readable?
Date: Mon, 28 Apr 2003 15:02:48 +0200
User-agent: Gnus/5.09002 (Oort Gnus v0.20) Emacs/21.3 (gnu/linux)
On Sun, Apr 27 2003, Karl Eichwalder wrote:
> Benjamin Riefenstahl <Benjamin.Riefenstahl@epost.de> writes:
>
>> Actually that literal seems to be in some JIS encoding on my side,
Same here:
,----[ `C-u C-x =' ]
| character: [ removed "mirrored `R'" ] (0151701, 54209, 0xd3c1)
| charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87)
| code point: 39 65
| syntax: word
| category: Y:Cyrillic characters of 2-byte character sets j:Japanese
| |:While filling, we can break a line at this character.
| buffer code: 0x92 0xA7 0xC1
| file code: ESC 24 42 27 41 (encoded by coding system iso-2022-jp-2)
| font: -Misc-Fixed-Medium-R-Normal--14-130-75-75-C-140-JISX0208.1983-0
`----
What does `C-u C-x =' say on that character before sending?
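To cross-check the box above outside of Emacs, here is a minimal Python sketch (not what Emacs or Gnus does internally). It assumes that the "file code" line gives hex bytes and that Python's `iso2022_jp` codec is an acceptable stand-in for `iso-2022-jp-2`; the constant and function names are made up for illustration.

```python
# Bytes taken from the `C-u C-x =' output: "file code: ESC 24 42 27 41".
ESCAPE_TO_JISX0208 = bytes.fromhex("1b2442")  # ESC $ B selects JIS X 0208
CODE_POINT = bytes([39, 65])                  # "code point: 39 65", i.e. 0x27 0x41
ESCAPE_TO_ASCII = bytes.fromhex("1b2842")     # ESC ( B switches back to ASCII


def decode_jis_sample() -> str:
    """Decode the two-byte JIS X 0208 code point as ISO-2022-JP text."""
    raw = ESCAPE_TO_JISX0208 + CODE_POINT + ESCAPE_TO_ASCII
    return raw.decode("iso2022_jp")


if __name__ == "__main__":
    ch = decode_jis_sample()
    # Print the character together with its Unicode code point.
    print(ch, ord(ch), hex(ord(ch)))
```

If the decoded character has Unicode code point 1071, then the JIS character shown above and the one produced by (decode-char 'ucs 1071) are the same letter in two different charsets.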
>> while &#1071; indicates Unicode.
>
> Gnus decided to turn it into JIS; initially it was Unicode/UTF-8.
I don't think Gnus is able to convert UTF-8 to JIS. Running
`find-coding-systems-region' on your message shows that Emacs 21.3
doesn't list any UTF coding system for it. That check is basically
what Gnus does in the function `mm-find-mime-charset-region' in
`mm-util.el'.
>> (char-to-string (decode-char 'ucs 1071))
When I insert this char into the buffer...
(insert (char-to-string (decode-char 'ucs 1071))); Я
... and use...
(setq mm-coding-system-priorities nil) ;; default
... I get iso-8859-5.
With my setting of...
(setq mm-coding-system-priorities '(iso-latin-1 iso-latin-9 mule-utf-8))
... I get utf-8.
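The effect of the priority list can be pictured with a small Python sketch. This is only an illustration of "take the first charset in the list that can encode the text", not the Gnus implementation; `first_encodable` and both candidate lists are hypothetical, and the "default" order in particular is an assumption chosen so that iso-8859-5 comes before utf-8.

```python
from typing import Optional, Sequence


def first_encodable(text: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the first charset in `candidates` that can encode `text`."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset lacks a character; try the next one
        return name
    return None  # no candidate can represent the text


# Hypothetical stand-in for the default order (priorities set to nil).
HYPOTHETICAL_DEFAULT_ORDER = ["iso-8859-1", "iso-8859-5", "utf-8"]
# Mirrors '(iso-latin-1 iso-latin-9 mule-utf-8) using Python codec names.
PREFERRED_ORDER = ["iso-8859-1", "iso-8859-15", "utf-8"]

if __name__ == "__main__":
    ya = chr(1071)  # same code point as (decode-char 'ucs 1071)
    print(first_encodable(ya, HYPOTHETICAL_DEFAULT_ORDER))
    print(first_encodable(ya, PREFERRED_ORDER))
```

With the second list, neither Latin charset contains the Cyrillic letter, so the search falls through to utf-8, which matches the result reported above.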
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- PGP key available via WWW http://rsteib.home.pages.de/