RE: [Groff] Bugs in mm, accents, multi-line macros and font glyp


From: Ted Harding
Subject: RE: [Groff] Bugs in mm, accents, multi-line macros and font glyp
Date: Tue, 08 Jan 2002 16:49:40 -0000 (GMT)

Hi Alejandro,

On 08-Jan-02 P. Alejandro Lopez-Valencia wrote:
> When using multi-line macros such as .WA/.WE and .IA/.IE,
> the accent macros do not work, while using high bit latin1
> codes does. I am sure this happens in other macros as well.
> See the attached example files.

I am not sure why this does not work. However, it is not
just the "accent macros" which do not work: the whole
string sequence "\*:" etc is output literally in your
PDF file, i.e. "\*" is not being interpreted as the lead-in
for a defined string reference. This must have something 
to do with the definition of .WA etc., which I have not had 
time to investigate to the point of finding out why it happens.

However, I would also point out, at this point, that the mm
macros partially include the "improved accent definitions" which
can be found in the ms macros (written by James Clark).
In the ms macros, you need to invoke the macro ".AM" in order
to activate these; in the mm macros, it looks as though these
are invoked automatically when the macros are read in.

These accent macros allow a flexible definition of arbitrary
accented characters, using the ".char" mechanism. However, this
is a two-stage process: first define the string for an accent-over
or an accent-under; then define any character you need with this
accent. For example, we already have

  .acc@over-def ' \'

which defines a string whose name is "'" which makes an acute
accent-over. You can then define a character with name "o'"
(or whatever else you like to call it) by

  .char \(o' o\*'

which will give you your "o-acute" ó.

Similarly, there already is

  .acc@under-def , \(ac

which defines a string whose name is "," which makes a
cedilla accent-under. Then you can define, for example,

  .char \(S, S\*,
  .char \(t, t\*,

for upper-case S-cedilla (needed in Turkish and elsewhere, and
not available in ISO-8859-1, the Western European encoding,
though available in ISO-8859-9, the Turkish one) and for
t-cedilla (needed in Romanian, for which you need yet another
ISO-8859-* encoding); and so on.

You can also define your own accent-overs and accent-unders,
such as

  .acc*over-def dot \(a.       \" dot-above
  .acc*over-def breve \(ab     \" breve, as in Turkish "yumu\(s,ak g"
  .acc*over-def ring \(ao      \" as in Danish Å and Czech u-ring-above
  .acc*over-def ; \(a"         \" "Hungarian umlaut"
  .acc*under-def . \s[\En[.s]*8u/10u]\v'.2m'.\v'-.2m'\s0 \" dot-under

and characters like

  .char \(z. z\*[dot]\" z dot-above
  .char \(gu g\*[breve]\" yumu\(s,ak g
  .char \(u@ u\*[ring]\" u with ring above
  .char \(o" o\*;\" o with Hungarian umlaut
  .char \(D! D\*.\" D with dot under

Then what you need to type when entering the source text is
written like

  L\(o'pez
  \(S,imde
  Mul\(t,umesc
  Mu\(z.
  De\(guil
  P\(address@hidden
  K\(o"sz\(o"n\(o"m
  \(D!arma

and so on -- you can now write correctly in several languages.
All of the above, as it happens, relies on the presence in
the PostScript standard fonts of a variety of separate accent
glyphs (15 according to my count) in addition to the many
glyphs for accented letters. So, if limited to standard
PostScript fonts, you can cover just about any European
language (and most others which use a Latin alphabet).
Furthermore, however, you can use the above mechanism to
place any available glyph as an accent for any other available
glyph, so you could extend your repertoire to other languages
by (relatively slightly) extending your available fonts.
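
To pull the two stages together, here is a minimal and untested
sketch for the ms macros. It assumes that ".AM" makes both the
accent strings and the ".acc*over-def" macro available, as
described above; the accent name "breve" and the sample words
are simply the ones used earlier in this message.

  .\" Sketch only; run with something like: groff -ms -Tps
  .\" Assumption: .AM defines \*' and \*, and .acc*over-def.
  .AM
  .\" Stage 1: define one extra accent-over, named "breve".
  .acc*over-def breve \(ab
  .\" Stage 2: define characters as base letter plus accent string.
  .char \(o' o\*'
  .char \(t, t\*,
  .char \(gu g\*[breve]
  .LP
  L\(o'pez, Mul\(t,umesc, De\(guil

If the result looks wrong, viewing the PostScript output at a
high zoom factor should show whether the accent has landed
where it is wanted.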

You can also define your own accent glyphs by little bits
of in-line PostScript code (using the "\X'ps: exec ... '"
mechanism), or by groff's line-drawing commands ("\D'l ...'"
etc.).

> I also noticed the accent macros in mm and ms (possibly in me
> too, but I never have used those) are using the old practice
> of building composite characters, instead of calling the already
> existing accented glyphs in the fonts (I see this as useful only
> in the dvi driver and only if there is no migration to the EC/TC
> fonts). The problem of fonts with 7 bit character sets is something
> of the past, except in the TeX world.

Here I respectfully disagree with you. Your argument holds only
(and this is as far as it goes) when you are using languages
for which you already have adequate fonts. To cover all European
languages with all their accented characters as single glyphs
you will need several fonts; there is not enough room in
8 bits for everything.

I also suspect that you may be making an underlying assumption
that you can cover all you need with 8 bits (apologies if you
are already thinking in Unicode!) -- and if this is so then
the above examples show that this is not adequate.

Also, I believe you have a limited view of groff's accent resources
(i.e. you are thinking solely in terms of "o\'" and so on, using
the old original troff escape sequences). This, as explained above,
has been extended to a general accent-building mechanism.

The point of the accent-building mechanism described above is
that you can build up anything you need provided it can be
done using glyphs which are available, without the need to
get hold of special fonts.

> One can argue that there is no real problem, but, no matter
> how well one calculates the placement of accents on composite
> characters, this doesn't beat---usually---, the good sense
> of a type designer.

Well, that is arguable; but I think that in practice a careful
refinement of the composite character definition (see for example
the definition of dot-under above) can make it very difficult to
see that it has been done this way. (I remember once modifying the
above definition of dot-under so that it looked right for Italic
characters). I have found it useful to view created composite
characters in PostScript at a very high zoom factor.

> As well, the resulting composite characters are not searchable
> if one creates PDF as final output from the ps driver.

That argument only works, I think, if you are in a unique
font-encoding (unless, again, you are thinking in Unicode).
Otherwise, I don't see how one can begin to search for the
character even if it has been incorporated as a single glyph.

So please don't knock groff's accented-character resources.
If you can make all that you need in terms of the glyphs
in fonts which you already have available, then by all
means do so: using the ".char" mechanism you can map
any escape sequence you choose to a given glyph in some
font. Otherwise, people will need the capacity to extend
their repertoire of glyphs by creating composite characters
as above; and for this there is, in principle, no limit
within groff.
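
As a sketch of how the two approaches can coexist: the \(o' name
from earlier could be mapped to the font's own o-acute glyph
when there is one, and to the composite otherwise. This assumes
that the \*' accent string is already defined (automatically in
mm, or after ".AM" in ms), and it uses the "c" test of ".ie" to
check whether the glyph \('o exists.

  .\" Prefer the font's single o-acute glyph if it is available;
  .\" otherwise fall back to "o" plus the accent-over string.
  .ie c\('o .char \(o' \('o
  .el .char \(o' o\*'

Either way the source text stays the same ("L\(o'pez"), so a
document need not care which route was taken.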

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <address@hidden>
Fax-to-email: +44 (0)870 167 1972
Date: 08-Jan-02                                       Time: 16:49:40
------------------------------ XFMail ------------------------------
