
[Groff] Unicode, EBCDIC, Latin-2, JIS for groff


From: Werner LEMBERG
Subject: [Groff] Unicode, EBCDIC, Latin-2, JIS for groff
Date: Fri, 10 Mar 2000 18:39:00 GMT

It's amazing to see that people are interested in having Unicode
or EBCDIC input in gtroff.

Other people want Latin-2, others again want Japanese...

How best to handle this?

My suggestion is to extend gtroff so that it can handle arbitrary
31-bit characters (this covers ISO 10646).  Values with the 32nd
bit set (i.e. negative numbers) can then be used for special gtroff
`characters' like `ESCAPE_c'.
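
The split between ordinary and special values can be sketched as
follows.  This is only an illustration of the idea, not gtroff code:
the numeric value given to `ESCAPE_c' and the helper names are
assumptions, since the proposal assigns no numbers.

```python
# Hypothetical value: the proposal names ESCAPE_c but assigns it no number.
ESCAPE_c = -1

MAX_CHAR = 0x7FFFFFFF  # largest value that fits in 31 bits


def to_signed32(word: int) -> int:
    """Interpret a 32-bit word as a signed two's-complement integer."""
    word &= 0xFFFFFFFF
    return word - 0x100000000 if word & 0x80000000 else word


def is_special(code: int) -> bool:
    """Negative values (32nd bit set) are reserved for special 'characters'."""
    return code < 0


def is_character(code: int) -> bool:
    """Non-negative 31-bit values are ordinary ISO 10646 code points."""
    return 0 <= code <= MAX_CHAR
```

With this layout a single sign test separates real characters from
internal markers, so no code point has to be reserved for gtroff's
own use.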

It should use Unicode (or ISO 10646) as its internal encoding and
nothing else.

Question: How far along is the Unicode input project?

Additionally, I suggest using UTF-8 exclusively as the external
encoding if, say, the command line option `-u' is given.
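
For reference, here is a minimal sketch of how a full 31-bit value
maps to UTF-8 under the original definition (up to six bytes, as in
RFC 2279).  Later revisions of UTF-8 stop at U+10FFFF and four
bytes, so the five- and six-byte forms matter only if the whole
31-bit range is to be supported.  The function name is made up for
illustration.

```python
def encode_utf8_31(code: int) -> bytes:
    """Encode a 31-bit value using the original (up to 6-byte) UTF-8 scheme."""
    if not 0 <= code <= 0x7FFFFFFF:
        raise ValueError("not a 31-bit character value")
    if code < 0x80:
        return bytes([code])  # plain ASCII, one byte

    # (exclusive upper limit, total number of bytes, lead-byte prefix)
    for limit, length, prefix in (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    ):
        if code < limit:
            break

    # Continuation bytes carry six payload bits each, lowest bits last.
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (code & 0x3F))
        code >>= 6
    return bytes([prefix | code] + tail[::-1])
```

For values up to U+10FFFF (surrogates aside) the result agrees with
what current UTF-8 encoders produce.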

Groff should then come with a character set conversion tool (as a
preprocessor; maybe with heuristics to recognize the proper encoding?)
that maps everything (e.g. Latin-2, JIS, and the EBCDIC charsets) to
Unicode in UTF-8 representation.
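
The core of such a conversion step might look like the sketch
below.  It is not an existing groff tool; the function name and the
charset labels are invented here, and the codec chosen for each
label is an assumption ("JIS" could equally mean Shift_JIS or
EUC-JP, and there are many EBCDIC code pages besides cp037).
Encoding detection by heuristics is left out.

```python
# Hypothetical charset labels mapped to Python codec names.
SOURCE_CODECS = {
    "latin2": "iso8859_2",   # ISO 8859-2
    "jis": "iso2022_jp",     # assumption: ISO-2022-JP
    "ebcdic": "cp037",       # assumption: EBCDIC code page 037
}


def to_utf8(data: bytes, charset: str) -> bytes:
    """Decode bytes in the named source charset and re-encode them as UTF-8."""
    codec = SOURCE_CODECS[charset]
    return data.decode(codec).encode("utf-8")
```

Used as a filter, this would read the whole input, call
to_utf8(data, "latin2") or similar, and write the result to the
formatter's standard input.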

On the output side, I think that no essential changes are necessary
(except better support for very large fonts, since gtroff's font handling
mechanism isn't very efficient here).  Of course, grops, for example,
should be extended to support CID-keyed PS fonts.

Comments please.


    Werner

