
Re: [Groff] Re: groff: radical re-implementation


From: Tomohiro KUBOTA
Subject: Re: [Groff] Re: groff: radical re-implementation
Date: Fri, 20 Oct 2000 01:37:43 +0900
User-agent: Wanderlust/1.0.3 (Notorious) SEMI/1.12.1 ([JR] Nonoichi) FLIM/1.12.7 (Yūzaki) Emacs/20.7 (i386-debian-linux-gnu) MULE/4.1 (AOI)

Hi,

At Thu, 19 Oct 2000 10:40:35 +0200 (CEST),
Werner LEMBERG <address@hidden> wrote:

> Note that such an encoding request has to determine the encoding *and*
> character set of a document (similar to Emacs).
(snip)
> Examples:
>   .\" -*- charset: JIS-X-0208; encoding: EUC -*-
>   .\" -*- charset: JIS-X-0208; encoding: ISO-2022 -*-

No.  Specifying only 'encoding' is sufficient, because the encoding
already determines which charsets are used.

For example, there is no encoding named 'EUC'.  'EUC' is a generic
name for the EUC-based encodings (EUC-JP, EUC-KR, EUC-CN, and EUC-TW).
'EUC' also refers to a method of building an encoding out of at most
four ISO-2022-compliant charsets.

Yes, ISO-2022 is the name of an encoding.  It can carry many charsets,
such as ISO-8859-*, ISO-646-*, JIS-X-0208, KS-X-1001, GB-2312, and
so on.  There are many subsets of ISO-2022, such as ISO-2022-JP and
ISO-2022-KR.  The EUC encodings are also subsets of ISO-2022.
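
To make the relation between the two families concrete, here is a
minimal sketch using Python's standard codecs (the choice of Python and
the variable names are mine, purely for illustration).  It encodes the
same two JIS X 0208 characters as ISO-2022-JP and as EUC-JP, and checks
that the EUC-JP bytes are the ISO-2022-JP payload bytes with the high
bit set.

```python
# Two kanji that are both in JIS X 0208 (written as escapes to keep the file ASCII).
text = "\u65e5\u672c"

iso = text.encode("iso2022_jp")   # 7-bit, charsets switched by escape sequences
euc = text.encode("euc_jp")       # 8-bit, no escape sequences

TO_JISX0208 = b"\x1b$B"   # ESC $ B : designate JIS X 0208-1983
TO_ASCII = b"\x1b(B"      # ESC ( B : return to US-ASCII

# Strip the escape sequences to get the raw two-byte JIS X 0208 codes.
jis_payload = iso[len(TO_JISX0208):-len(TO_ASCII)]

# EUC-JP carries the same codes with the high bit of each byte set.
euc_from_jis = bytes(b | 0x80 for b in jis_payload)

print(iso, euc, euc_from_jis == euc)
```

The charset (JIS X 0208) is the same in both cases; only the way it is
invoked differs, which is why naming the encoding is enough.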

Thus, specifying the encoding ISO-2022-JP automatically implies
that the charsets are US-ASCII, JIS X 0201 (left half),
JIS X 0208-1978, and JIS X 0208-1983.  Specifying the encoding
EUC-KR automatically implies that the charsets are US-ASCII and
KS X 1001.
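
As a rough sketch of how a preprocessor might use this, the Python
below reads an Emacs-style request line carrying only 'encoding' and
looks up the implied charsets.  The table holds just the two encodings
listed above, and the names CHARSETS_BY_ENCODING, parse_modeline, and
charsets_for are hypothetical; none of this is existing groff code.

```python
import re

# Charsets implied by each encoding, as listed in this message (not exhaustive).
CHARSETS_BY_ENCODING = {
    "ISO-2022-JP": ["US-ASCII", "JIS X 0201 (left half)",
                    "JIS X 0208-1978", "JIS X 0208-1983"],
    "EUC-KR": ["US-ASCII", "KS X 1001"],
}

# Matches the "-*- key: value; key: value -*-" part of a request line.
MODELINE = re.compile(r"-\*-\s*(.*?)\s*-\*-")


def parse_modeline(line):
    """Return the key/value fields of an Emacs-style request line."""
    match = MODELINE.search(line)
    if match is None:
        return {}
    fields = {}
    for part in match.group(1).split(";"):
        if ":" in part:
            key, value = part.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def charsets_for(line):
    """Return the charsets implied by the line's encoding, or None if unknown."""
    encoding = parse_modeline(line).get("encoding")
    if encoding is None:
        return None
    return CHARSETS_BY_ENCODING.get(encoding.upper())


print(charsets_for('.\\" -*- encoding: EUC-KR -*-'))
```

A bare 'EUC' finds no entry in this table, since it names a family of
encodings rather than a single one.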


> troff shouldn't notice encoding issues at all and just accept UTF-8.

Yes.

---
Tomohiro KUBOTA <address@hidden>
http://surfchem0.riken.go.jp/~kubota/
