Re: [Groff] new backend for UTF-8 output
From: Markus Kuhn
Subject: Re: [Groff] new backend for UTF-8 output
Date: Tue, 11 Jan 2000 12:22:51 +0000
Werner LEMBERG wrote on 2000-01-10 17:54 UTC:
> > Here is a patch which enables groff to produce UTF-8 encoded Unicode
> > output, for use on ttys like the UTF-8 enabled xterm. It introduces
> > a new option "-Tutf8", similar to "-Tlatin1".
>
> Thanks a lot! Markus Kuhn will also be happy to know that groff, one
> of the most ancient programs of the UNIX world still in use, has
> entered the Unicode arena.
Excellent!
Quick questions:
- Does this mean that groff will automatically make use of the correct
U+2018/U+2019 single left/right quotation marks in its UTF-8 output and
that all the characters shown on "man 7 groff_char" are now mapped
correctly?
Ideally, the characters that it would use in PostScript output
should be mapped to Unicode precisely according to
http://partners.adobe.com/asn/developer/typeforum/unicodegn.html
- Does man now produce automatically UTF-8 output when environment
variable LC_CTYPE (or, if that does not exist, then LANG) contains
the substring "UTF-8"? The ultimate goal should be that if we
have LC_CTYPE=...UTF-8... that then the entire processing pipeline
involved in showing man pages should switch to UTF-8, including
xterm, less, man, groff, grotty, etc.
- I understand that
An extension to the troff character set for Europe, E.G.
Keizer, K.J. Simonsen, J. Akkerhuis, EUUG Newsletter,
Volume 9, No. 2, Summer 1989
have extended the troff character set to cover Latin-1. Is there
also a groff input syntax that allows me to enter any
arbitrary Unicode character by hex code?
Eventually, we should get the following demos running correctly right
out-of-the-box:
LC_CTYPE=en_GB.ISO8859-1 xterm -e man groff_char
LC_CTYPE=en_GB.UTF-8 xterm -e man groff_char
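The two invocations differ only in LC_CTYPE, so the check that man (or a wrapper around groff) would need is small. Here is a minimal Python sketch of the rule described above, not actual man or groff code. The function name pick_groff_device is hypothetical, the device names are taken from the -Tutf8 and -Tlatin1 options, and falling back to latin1 when neither variable is set is an assumption. The sketch consults only LC_CTYPE and LANG, as in the rule above; a complete implementation would presumably also have to honour LC_ALL.

```python
import os


def pick_groff_device(env):
    """Return a groff -T device name suggested by the locale variables.

    Hypothetical helper: 'env' is any mapping of environment variables.
    """
    # LC_CTYPE takes priority; LANG is consulted only when LC_CTYPE
    # is unset or empty.
    locale_name = env.get("LC_CTYPE") or env.get("LANG") or ""
    # Assumption: anything that does not announce UTF-8 gets latin1.
    return "utf8" if "UTF-8" in locale_name else "latin1"


if __name__ == "__main__":
    print("-T" + pick_groff_device(os.environ))
```

The same test would have to be applied consistently by every program in the pipeline (xterm, less, man, groff, grotty) for the switch to work end to end.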
Could you send me an example output of the new "groff -Tutf8" applied
to the groff_char(7) manpage?
> > The behaviour of existing backends is not changed.
There is one simple detail that definitely should be changed for the old
-Tascii and -Tlatin1 outputs:
The characters ' and ` (apostrophe and grave accent = 0x27 and 0x60) should
both be represented by ' (0x27) in the "ascii" and "latin1" output. Newer
fonts (e.g., all Microsoft/Apple/Adobe TrueType fonts, but also the new
X11 fixed fonts) do NOT show `quote' as symmetric directional quotes
any more, in accordance with ISO/Unicode. Directional quotation characters
are only available in UTF-8 and PostScript output. For more information:
http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
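To make the proposed mapping concrete, here is a small Python sketch. It is an illustration only, not groff code: the helper name render_single_quotes is hypothetical, and treating every device other than "utf8" like ascii/latin1 is an assumption. For the utf8 device it uses U+2018/U+2019, which is the behaviour asked about earlier in this message rather than something the patch is known to do.

```python
LEFT_QUOTE = "\u2018"   # LEFT SINGLE QUOTATION MARK
RIGHT_QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def render_single_quotes(text, device):
    """Map ` and ' in 'text' for the given output device (hypothetical helper)."""
    if device == "utf8":
        # Directional quotes exist in Unicode, so use them.
        table = {"`": LEFT_QUOTE, "'": RIGHT_QUOTE}
    else:
        # ascii / latin1: both characters become the apostrophe 0x27.
        table = {"`": "'", "'": "'"}
    return text.translate(str.maketrans(table))


if __name__ == "__main__":
    print(render_single_quotes("`quote'", "latin1"))
    print(render_single_quotes("`quote'", "utf8").encode("utf-8"))
```

With this mapping, the ascii and latin1 output no longer depends on a font drawing 0x60 as a mirror image of 0x27.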
If there is a new groff beta release with the patch available, please
announce it also on address@hidden Thanks!
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>