Re: [Groff] Cyrillic (cp1251) support for groff -Tps


From: Werner LEMBERG
Subject: Re: [Groff] Cyrillic (cp1251) support for groff -Tps
Date: Tue, 24 Jul 2001 18:01:34 +0200 (CEST)

>     Use the \N'nnn' mechanism in groff (you have to in this case
>     since the mapfile was empty so there are no groff names for the
>     characters as yet). For this sort of thing I prefer to set up
>     macros, maybe job-specific, which invoke an IPA context and then
>     define characters like
> 
>     .char \[ng] \N'78'
> 
>     Or (say using  .tr IPA IPA_T ) simply
> 
>     .char \[ng] \f[IPA]\N'78'\fP
> 
>     which would enable you to drop in an IPA character on the fly
>     without having to switch to an IPA context. Also, you can easily
>     name them as you please at any time, without pre-empting names
>     you might need for something else.

While this may be sufficient for IPA, it is generally a bad idea to use
\N'...' since it mixes up glyph encoding with input encoding.  Below I
quote another mail exchange with Ruslan Ermilov (address@hidden) which
shows possible solutions to support koi8-r with a preprocessor -- as
you may already know or have discovered, groff can't support koi8-r
directly as an input encoding (only 8859-5) because it contains
`illegal' characters that groff needs for internal use.

I'm currently not sure whether this affects cp1251 as well.


    Werner


PS: devkoi8-r is not part of default groff but an extension from
    Ruslan.  You won't need it to print Russian to a PS printer.

======================================================================

> As a consequence, direct koi8-r input is not possible currently.  My
> idea was to use \N'...' (assuming TTY output where groff's output
> encoding is a TTY's input encoding, so to say), but \[...] is better
> of course since it yields a cleaner interface, separating input from
> output encoding.  I will eventually remove the hardcoded `charXXX'
> character names since they intermix input and output encoding.  What
> about a converter like this (choose better character names, please):
> 
>   koi8-r      glyph name
> 
>    0x80   ->   \[bdlh]  # box drawings light horizontal
>    0x81   ->   \[bdlv]
>    0x82   ->   \[bdldr]
>    0x83   ->   \[bdldl]
>    .
>    .
>    0xFF   ->   \[cyrVe] # cyrillic capital letter ve
> 
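
A minimal sketch of the converter proposed in the quoted table, written
in Python purely for illustration.  The glyph names are the placeholder
names from the mail, not names groff defines; only the four box-drawing
rows spelled out there are listed, and the elided rows are left out
rather than guessed.  The names KOI8R_TO_GLYPH and to_glyph_names are
hypothetical.

```python
# Placeholder glyph names taken from the quoted table; they are NOT
# official groff names.  The rows elided in the mail are not filled in.
KOI8R_TO_GLYPH = {
    0x80: "bdlh",   # box drawings light horizontal
    0x81: "bdlv",
    0x82: "bdldr",
    0x83: "bdldl",
}


def to_glyph_names(data: bytes) -> bytes:
    """Replace each mapped koi8-r byte by a \\[name] escape.

    Bytes without an entry in the table are passed through unchanged.
    """
    out = bytearray()
    for byte in data:
        name = KOI8R_TO_GLYPH.get(byte)
        if name is None:
            out.append(byte)
        else:
            out += b"\\[" + name.encode("ascii") + b"]"
    return bytes(out)
```

With a complete table, the input encoding would be handled entirely in
the converter, and the document would reach groff as glyph names only.
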
This is probably the only right, but long-term, solution.  How about
this in the meantime?

Implement a small preprocessor which merely converts all non-ASCII
input characters to \[charXXX] sequences.  groff(1) would then invoke
it automatically once a ``prepro gro8to7'' line is put in the
{device}/DESC file.

To speed things up, the preprocessor should probably convert only the
characters that groff uses internally (those for which
illegal_input_char() returns true).
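
The idea can be sketched roughly as follows; this is an illustration in
Python, not the actual gro8to7 program.  It assumes that XXX in
\[charXXX] is the decimal byte value, and that the internally used
bytes in the upper half are 0x80-0x9F.  The second point is an
assumption made for this sketch; the authoritative test is
illegal_input_char() in the groff source.  The function signature and
the only_reserved parameter are hypothetical.

```python
# Assumption for this sketch: the upper-range bytes groff reserves for
# internal use are 0x80-0x9F.  Check illegal_input_char() in the groff
# source for the real set.
RESERVED_HIGH = range(0x80, 0xA0)


def gro8to7(data: bytes, only_reserved: bool = False) -> bytes:
    """Rewrite 8-bit input bytes as \\[charN] escapes (N in decimal).

    By default every byte >= 0x80 is converted.  With only_reserved=True
    only the bytes assumed to be reserved by groff are converted and all
    other bytes are passed through unchanged.
    """
    out = bytearray()
    for byte in data:
        if only_reserved:
            convert = byte in RESERVED_HIGH
        else:
            convert = byte >= 0x80
        if convert:
            out += b"\\[char%d]" % byte
        else:
            out.append(byte)
    return bytes(out)
```

A real filter would read the raw bytes from standard input and write the
result to standard output, so that groff can run it in front of troff.
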

I have tried this technique with devkoi8-r, and I get correct,
warning-free output from raw koi8-r input.  (I have tested all
characters in the upper range, 0x80-0xff.)

I think this is the best thing we could do before Groff 2.0.
