Re: [Groff] new backend for UTF-8 output


From: Markus Kuhn
Subject: Re: [Groff] new backend for UTF-8 output
Date: Tue, 11 Jan 2000 12:22:51 +0000

Werner LEMBERG wrote on 2000-01-10 17:54 UTC:
> > Here is a patch which enables groff to produce UTF-8 encoded Unicode
> > output, for use on ttys like the UTF-8 enabled xterm.  It introduces
> > a new option "-Tutf8", similar to "-Tlatin1".
> 
> Thanks a lot!  Markus Kuhn will also be happy to know that groff, one
> of the most ancient programs of the UNIX world still in use, has
> entered the Unicode arena.

Excellent!

Quick questions:

Does this mean that

  - groff will automatically make use of the correct U+2018/U+2019 single
    left/right quotation marks in its UTF-8 output and that all the
    characters shown on "man 7 groff_char" are now mapped correctly?
    Ideally, the characters that it would use in PostScript output
    should be mapped to Unicode precisely according to

       http://partners.adobe.com/asn/developer/typeforum/unicodegn.html

  - Does man now automatically produce UTF-8 output when the environment
    variable LC_CTYPE (or, if that does not exist, then LANG) contains
    the substring "UTF-8"? The ultimate goal should be that if we
    have LC_CTYPE=...UTF-8... that then the entire processing pipeline
    involved in showing man pages should switch to UTF-8, including
    xterm, less, man, groff, grotty, etc.

  - I understand that

       An  extension  to the troff character set for Europe, E.G.
       Keizer, K.J. Simonsen, J. Akkerhuis, EUUG Newsletter,
       Volume 9, No. 2, Summer 1989

    have extended the troff character set to cover Latin-1. Is there
    also a groff input syntax that allows me to enter an arbitrary
    Unicode character by hex code?
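
The locale check asked about in the second question could look roughly
like the sketch below. It is only an illustration, not what man or groff
actually do: the function name pick_groff_device is hypothetical,
checking LC_ALL first is an added assumption based on the usual POSIX
locale precedence, and treating every non-UTF-8 locale as Latin-1 is a
deliberate simplification.

```python
import os


def pick_groff_device(env=None):
    """Return a groff -T device name chosen from the locale environment.

    Hypothetical helper: looks at LC_CTYPE, then LANG, as described in the
    message; LC_ALL is consulted first as an assumption (POSIX precedence).
    """
    if env is None:
        env = os.environ
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            # The first variable that is set and non-empty decides.
            # Simplification: any non-UTF-8 locale is treated as Latin-1.
            return "utf8" if "UTF-8" in value else "latin1"
    # Assumed fallback when no locale variable is set at all.
    return "ascii"


if __name__ == "__main__":
    print(pick_groff_device({"LC_CTYPE": "en_GB.UTF-8"}))
    print(pick_groff_device({"LC_CTYPE": "en_GB.ISO8859-1"}))
```

A wrapper around man could then pass the result on as "-T" plus the
returned name, so that the two demo invocations below would select
-Tutf8 and -Tlatin1 respectively.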

Eventually, we should get the following demos running correctly right
out-of-the-box:

  LC_CTYPE=en_GB.ISO8859-1 xterm -e man groff_char
  LC_CTYPE=en_GB.UTF-8     xterm -e man groff_char

Could you send me an example output of the new "groff -Tutf8" applied
to the groff_char(7) manpage?

> > The behaviour of existing backends is not changed.

There is one simple detail that definitely should be changed for the old
-Tascii and -Tlatin1 outputs:

  The characters ' and ` (apostrophe and grave accent = 0x27 and 0x60) should
  both be represented by ' (0x27) in the "ascii" and "latin1" output. Newer
  fonts (e.g., all Microsoft/Apple/Adobe TrueType fonts, but also the new
  X11 fixed fonts) do NOT show `quote' as symmetric directional quotes
  any more, in accordance with ISO/Unicode. Directional quotation characters
  are only available in UTF-8 and PostScript output. For more information:

    http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
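
To make the proposed rule concrete, here is a minimal sketch of the
per-device quote mapping. It is not grotty's actual code: the function
render_quotes and its device names are hypothetical, and mapping every
' to U+2019 on the UTF-8 device is an assumption that ignores context.

```python
LEFT_QUOTE = "\u2018"   # U+2018 LEFT SINGLE QUOTATION MARK
RIGHT_QUOTE = "\u2019"  # U+2019 RIGHT SINGLE QUOTATION MARK


def render_quotes(text, device):
    """Map ` and ' to the output characters suggested for each device."""
    if device == "utf8":
        # Directional quotes exist in Unicode, so use them.
        table = {ord("`"): LEFT_QUOTE, ord("'"): RIGHT_QUOTE}
    elif device in ("ascii", "latin1"):
        # No directional quotes available: both become ' (0x27).
        table = {ord("`"): "'"}
    else:
        raise ValueError("unknown device: " + device)
    return text.translate(table)


if __name__ == "__main__":
    print(render_quotes("`quote'", "latin1"))
    print(render_quotes("`quote'", "utf8").encode("utf-8"))
```

With this mapping, `quote' comes out as 'quote' on the ascii and latin1
devices, while on the utf8 device the two quotes are encoded as the byte
sequences E2 80 98 and E2 80 99.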

If there is a new groff beta release with the patch available, please
announce it also on address@hidden. Thanks!

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


