
Re: Question about Unicode Greek


From: Robert Goulding
Subject: Re: Question about Unicode Greek
Date: Fri, 12 Feb 2021 10:56:42 -0500

I'm sure this is a dumb question, because I don't entirely get how these
encodings work yet, but why can't groff refer directly to U1F10, instead of
breaking it down into U03B5 and U0313?
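
For what it's worth, the two-code-point form looks like Unicode canonical
decomposition (NFD), which Branden's message quoted below also mentions. Here
is a minimal Python sketch, assuming the names in troff's warnings are simply
the NFD code points in hex, joined with underscores. The helper name
groff_decomposed_name is made up for illustration and is not part of groff.

```python
import unicodedata


def groff_decomposed_name(ch: str) -> str:
    """Build a name like 'u03B5_0313' from the NFD form of one character.

    Assumption: this mirrors the naming seen in troff's warnings; it is a
    guess based on that output, not groff's actual implementation.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    return "u" + "_".join(f"{ord(c):04X}" for c in decomposed)


# A few precomposed letters from the sample sentence, given as escapes so
# the input is unambiguous regardless of how this message gets re-encoded.
for ch in ("\u1f10", "\u1f00", "\u1fc7", "\u1f26", "\u1f41"):
    print(f"U+{ord(ch):04X}", ch, "->", groff_decomposed_name(ch))
```

If that assumption holds, U+1F10 comes out as u03B5_0313, which is the first
name troff complains about.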

Also, man groff_char says "These [Greek] glyphs are intended for technical
use, not for real Greek," and the lowercase letters are all slanted, the
uppercase upright. So, if someone did want to use groff for typesetting actual
Greek (modern or ancient), that would have to be addressed too.



On Fri, Feb 12, 2021 at 1:13 AM G. Branden Robinson <
g.branden.robinson@gmail.com> wrote:

> Hi Robert and Steffen,
>
> At 2021-02-11T23:03:47+0100, Steffen Nurpmeso wrote:
> > Robert Goulding wrote in
> >  <CACE7msuTMpqMaMg8c9m1AeBJOt1DtwFQtd0f_vp10i8vZLvjTw@mail.gmail.com>:
> >  |I've been away from groff for a long time; I think the last time I
> >  |used it, there was no Unicode support at all. Now I'm interested in
> >  |using it as a filter from markdown, through pandoc to groff to pdf.
> >  |
> >  |This is working well for me, except for a handful of files in which I
> >  |use Greek with accents. I understand that groff doesn't have characters
> >  |for accented Greek characters, and I'm willing to do the work to add
> >  |them, I'm just trying to understand what's involved.
> >  |
> >  |So, here is a tiny document with some Greek in it:
> >  |
> >  |.LP
> >  |ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς ἦν ὁ λόγος
> >  |
> >  |When I run this through preconv, I get the following:
> >  |
> >  |.lf 1 rubbish.ms
> >  |.LP
> >  |\[u1F10][...]
> >  |
> >  |with all of the Unicode characters turned into the correct code
> >  |numbers.
> >  |When I run this through groff -ms -Tps I get the following errors:
> >  ...
> >  |This is what is puzzling me. The very first letter, ἐ, is correctly
> >  |given its unicode description \[u1F10] by preconv; but then troff
> >  |seems to decompose it into \[u03B5] which is ε and \[u0313] which
> >  |is ̓ . So, if I wanted to tell groff how to print ἐ, how do I go about
> >  |it, when there seem to be two internal representations?
> >
> > It seems the groff source repository contains the necessary update
> > in the afmtodit tables to include this character for non per-se
> > Unicode aware output devices.  A new release will ship it thus.
> > You could try to update the %AGL_to_unicode hash in
> > /usr/bin/afmtodit of your installed groff accordingly, too.
>
> I am not so sure this _is_ fixed.  Interestingly, it works for the grotty
> output driver but not grops.  Here's what I get with groff git HEAD.
>
> $ ./test-groff -Tutf8 -k -ms EXPERIMENTS/greek.ms | cat -s
>
> ἐν  ἀρχῇ  ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς
> ἦν ὁ λόγος
>
> $ ./test-groff -Tps -z -k -ms EXPERIMENTS/greek.ms
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03B5_0313'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03B1_0313'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03B7_0342_0345'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03B7_0313_0342'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03BF_0314'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03BF_0301'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03B9_0300'
> troff: backtrace: file 'EXPERIMENTS/greek.ms':2
> troff: EXPERIMENTS/greek.ms:2: warning: can't find special character 'u03BF_0300'
>
> Moreover I don't recall any update to the afmtodit tables that covered
> these sort of character combinations.  We (I) did update them to capture
> some new code points from Unicode 13.0[1] and to kern the ellipsis
> correctly[2].
>
> Shouldn't the output driver (grops) be taking these NFD-decomposed
> sequences and building the combined glyphs with overstriking?
>
> Also, the PostScript output seems to be rendering all the Greek letters
> slanted instead of upright.  Surely that's not correct?
>
> Regards,
> Branden
>


-- 
Robert Goulding
Director, John J. Reilly Center for Science, Technology, and Values;
Director, Program in History and Philosophy of Science;
Assoc. Professor, Program of Liberal Studies,
Fellow, Medieval Institute,
University of Notre Dame.

