Re: [Groff] Installing Russian Type-1 Fonts


From: Anton Shepelev
Subject: Re: [Groff] Installing Russian Type-1 Fonts
Date: Sat, 20 Aug 2011 17:57:12 +0400

Thank  you  for  the  kind  and patient explanation,
Werner.

> However, please avoid the term  'AGL  compatible'.
> We  are not talking about glyphs but about charac-
> ters!

Maybe this confusion is not only my fault but also the
manual's:

    The  distinction  between input, characters,
    and output, glyphs, is not clearly separated
    in  the  terminology  of groff; for example,
    the char  request  should  be  called  glyph
    since it defines an output entity.

                       Groff User Manual, Chapter 5.
                    A footnote to the description of
              the name field in the font file format

As I now understand it, there is no internal
representation of characters in groff. There are only
input characters and output entities. The former are
found on the input stream, sometimes together with
escape sequences that specify output entities directly,
such as the \[uXXXX] escapes produced when preconv
processes Russian UTF-8 input. The latter are stored in
groff's intermediate output and read by postprocessors.
If the postprocessor is not targeting a character-cell
device, these output entities are also called glyphs,
but they are not to be confused with, say, the glyphs
of a PostScript font, about which groff itself knows
nothing. It is the grops postprocessor that, using its
font-definition files, converts groff's glyphs into
PostScript glyphs.
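
As a rough illustration of the input side of this
pipeline, here is a minimal Python sketch of the kind of
conversion preconv performs on UTF-8 text: ASCII passes
through unchanged and every other character becomes a
\[uXXXX] escape. The function name to_groff_escapes is
made up for this example, and the sketch ignores
preconv's special cases (encoding detection, the .lf
line it emits, the soft hyphen), so treat it as an
approximation rather than a description of the real
program.

```python
def to_groff_escapes(text: str) -> str:
    """Approximate preconv: keep ASCII, escape everything else as \\[uXXXX]."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            # Plain ASCII is passed through untouched.
            out.append(ch)
        else:
            # Non-ASCII characters become Unicode escapes with at
            # least four uppercase hexadecimal digits.
            out.append("\\[u%04X]" % code)
    return "".join(out)


if __name__ == "__main__":
    print(to_groff_escapes("да"))  # prints \[u0434]\[u0430]
```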

The Groff Glyph List (GGL) is just a  fixed  set  of
glyph   identifiers  without  a  predefined  mapping
either from input characters, which  is  defined  by
character  translation  requests like .trin in groff
source files, or to the  symbols  in  the  resulting
document,  because  it  is  up  to the postprocessor
whether (and how) to interpret them.

It seems to me that the GGL was created to provide
default support for 8-bit encodings that would work
out of the box, and to have meaningful identifiers for
the symbols of the non-ASCII part of the Latin-1
encoding, thereby standardizing the names of these
8-bit symbols across all postprocessors. It probably
came into existence when the hard-coded dependency on
Latin-1 was removed, because the font files then had
to substitute something for the glyph names
\[char128]-\[char255] on which they had relied.

Am I correct in suggesting that the Adobe Glyph List
algorithm is used in afmtodit?

> Contrary to TeX, groff handles hyphenation  before
> the  conversion from characters to glyphs has hap-
> pened (more or less).

More or less, because the  input  file  may  already
contain   escapes  for  addressing  output  entities
directly, in which case groff has to convert them to
'phantom'  input  characters which were never on the
input yet must be used for hyphenation.

> > But generally,  this  map  cannot  be  inversely
> > applied  because several input characters may be
> > mapped into one internal entity. What does groff
> > do in this case?
>
> Please  give  me an example where this is relevant
> to hyphenation.

At the stage of converting output entities  back  to
8-bit  input  characters. Although this situation is
unlikely, it is possible and groff does not seem  to
complain  about  it. In other words, for hyphenation
to work,  the  character  translation  map  must  be
(1:1), while generally it is (n:1).

An error in the mapping file, like this:

    .trin a\[u0430]
    .trin b\[u0430]

makes  it  impossible  for  groff  to  calculate the
hyphenation code for \[u0430], yet otherwise such  a
setup using UTF-8 input remains fully functional.
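
To make the (n:1) problem concrete, the following
Python sketch models a translation map as a plain
dictionary and tries to invert it. It is a toy model of
the argument above, not of groff's actual data
structures; the names forward and invert are invented
for the example.

```python
from collections import defaultdict


def invert(forward):
    """Group input characters by the output entity they are translated to."""
    backward = defaultdict(list)
    for src, dst in forward.items():
        backward[dst].append(src)
    return dict(backward)


# The two requests `.trin a\[u0430]` and `.trin b\[u0430]` as a mapping.
forward = {"a": "u0430", "b": "u0430"}
backward = invert(forward)

# An entity with more than one source has no unique input character,
# so there is no single hyphenation code to take for it.
ambiguous = {dst: srcs for dst, srcs in backward.items() if len(srcs) > 1}
print(ambiguous)  # prints {'u0430': ['a', 'b']}
```

With a (1:1) map the ambiguous dictionary would be
empty and every output entity could be traced back to
exactly one input character.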

The following two seemingly harmless requests, which
should have no net effect, disable hyphenation at
character 'a':

    .trin ta\" affect character translation
    .trin tt\" restore character translation

It may be caused by the backward associative array
(the one from output entities to input characters) not
being updated properly, so that groff is no longer able
to compute the hyphenation code for 'a'.
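
The suspicion above can be illustrated with a small
Python model in which each translation request updates
a forward table and a backward table, but the backward
table is never cleaned up. This is purely a
hypothetical model of the suspected bug: the class
TranslationTable and its methods are invented here and
say nothing certain about how groff is implemented.

```python
class TranslationTable:
    """Toy model: forward maps input -> output, backward maps output -> input."""

    def __init__(self):
        self.forward = {}
        self.backward = {}

    def trin(self, src, dst):
        # Record the new translation in both directions. The stale
        # backward entry for the previous target of `src` is NOT
        # removed, which is the suspected flaw being modelled.
        self.forward[src] = dst
        self.backward[dst] = src

    def translate(self, ch):
        # Untranslated characters map to themselves.
        return self.forward.get(ch, ch)

    def source_of(self, entity):
        # Entities with no backward entry map to themselves.
        return self.backward.get(entity, entity)


table = TranslationTable()
table.trin("t", "a")  # models: .trin ta
table.trin("t", "t")  # models: .trin tt

# Forward translation is back to the identity...
print(table.translate("t"), table.translate("a"))  # prints: t a
# ...but the backward table still claims that 'a' comes from 't'.
print(table.source_of("a"))  # prints: t
```

In this model the output entity 'a' no longer maps back
to the input character 'a', which is the kind of
inconsistency that could leave its hyphenation code
unavailable.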

Here's a full example for -Tascii:

    .pl 100v
    .ll 50n
    Here, 'character' will be hyphenated at 'a':
    .br
    .ll 5n
    character
    .sp 1v
    .ll
    And here, it will not be:
    .trin ta\" affect character translation
    .trin tt\" restore character translation
    .br
    .ll 5n
    character
    .pl 0

Anton


