Re: [Groff] preconv, unicode, greek and indexing


From: Werner LEMBERG
Subject: Re: [Groff] preconv, unicode, greek and indexing
Date: Mon, 21 May 2007 10:46:14 +0200 (CEST)

> I have tested the new preprocessor with the groff options
> -K<encoding> and -k
> 
> This is all very encouraging.

Nice to hear that!  However, you should upgrade to the current CVS
(from today); I've just found a serious bug in subfont handling which
sometimes made grops insert a character where there shouldn't be any
(which also made the affected line too long).

> Although it is possible to format everything with Kerkis, I have
> chosen to use Kerkis for Greek only.  When I switched to Kerkis I
> used \f(KR...\fP.  It didn't work at all -- groff tried to take all
> Greek characters from the special fonts S and SS.

This is normal behaviour.

> Then I switched font family temporarily using \FK .. \F[].

This is the correct way.

> Then I got partial success: All diacritical characters come out as
> KR, but the remaining are taken from SS (slanted special
> characters).

Please give an example for further investigation.

> In order to get this printed correctly, I need to reset the system
> of special characters. How do I do that (Problem 1)?

This has already been answered on the list.

> author:\[u00C5]str\[u00F6]m, P. ... 46
> 
> In order to sort my titles, authors, place names I need to translate
> this back to utf-8. How do I do that (Problem 2)?

Try the groff2uni.pl Perl script below.  Note that you will get
warnings like

  Wide character in print at groff2uni.pl line 17, <> line 8.

Any Perl expert here who can fix that?  

> Finally I have a number of Greek names and they are not emitted at all.
> How do I go about getting hold of them (Problem 3)?

What is special about `greek names'?  I fear I don't understand your
question.  Please give an example.


    Werner


======================================================================


#! /usr/bin/perl -w
#
# groff2uni.pl
#
# Convert groff Unicode entities of the form \[uXXXX] back to Unicode.
#
# Usage:
#
#   perl groff2uni.pl < infile > outfile
#
# You need perl 5.6 or greater.

use strict;

while (<>) {
  s/\\\[u([0-9A-F]{4,6})\]/chr(hex($1))/eg;
  print;
}

# EOF
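

A possible fix for the `Wide character in print' warning, as a sketch
only: Perl emits that warning when a character above 0xFF is printed
to a filehandle that has no encoding layer.  Without such a layer,
characters in the range 0x80-0xFF (like the A-ring and o-umlaut in the
index line above) are also written as single Latin-1 bytes rather than
as UTF-8.  The variant below puts a UTF-8 layer on STDOUT.  The file
name groff2uni-utf8.pl and the helper decode_entities are made up for
this illustration, and it assumes that the input is plain ASCII apart
from the \[uXXXX] entities.  Composite entities such as \[u0041_030A]
are not handled.

```perl
#! /usr/bin/perl
#
# groff2uni-utf8.pl (hypothetical name)
#
# Variant of groff2uni.pl that writes its output as UTF-8.
#
# Usage:
#
#   perl groff2uni-utf8.pl < infile > outfile

use strict;
use warnings;

# Encode everything printed to STDOUT as UTF-8.  This is what avoids
# the `Wide character in print' warning.
binmode(STDOUT, ':encoding(UTF-8)');

# Replace every \[uXXXX] entity (4 to 6 uppercase hex digits) in the
# given string with the character it names.
sub decode_entities {
  my ($text) = @_;
  $text =~ s/\\\[u([0-9A-F]{4,6})\]/chr(hex($1))/eg;
  return $text;
}

# Filter the files named on the command line (or standard input).
sub main {
  while (my $line = <>) {
    print decode_entities($line);
  }
}

# Run the filter only when executed as a script.
main() unless caller;

# EOF
```

The conversion sits in its own subroutine so that it can be checked
without going through standard input and output.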