
[Groff] Bullets in manual pages and -K groff option


From: Alexander E. Patrakov
Subject: [Groff] Bullets in manual pages and -K groff option
Date: Wed, 25 Jan 2006 18:05:11 +0500
User-agent: Debian Thunderbird 1.0.2 (X11/20051002)

Hello developers,

This mail relates to the previous message by Werner LEMBERG, in which he documented the -K option of the CVS version of groff. This option specifies the input encoding, so a construction like the following produces a formatted manual page as UTF-8 output:

groff -K input_charset -Tutf8 -mandoc /path/to/manual/page.1

If the manual page is ISO-8859-1 encoded, the -K option is not needed.
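
A minimal sketch of that rule as a shell helper. The function name groff_utf8_cmd is hypothetical, and it only prints the command line rather than running it; the charset name is passed to -K unchanged, on the assumption that groff accepts it:

```shell
#!/bin/sh
# groff_utf8_cmd CHARSET PAGE (hypothetical helper)
# Print the groff command line that formats PAGE as UTF-8 output.
# -K is omitted for ISO-8859-1 input, where it is not needed.
groff_utf8_cmd() {
    case "$1" in
        ISO-8859-1|iso-8859-1|latin1)
            printf 'groff -Tutf8 -mandoc %s\n' "$2" ;;
        *)
            printf 'groff -K %s -Tutf8 -mandoc %s\n' "$1" "$2" ;;
    esac
}

groff_utf8_cmd KOI8-R /path/to/manual/page.1
groff_utf8_cmd ISO-8859-1 /path/to/manual/page.1
```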

UTF-8 locale users can stick this into their man.conf and be happy. But what about those who prefer to use old-style 8-bit locales? Groff cannot output ISO-8859-X where X != 1. I tried to model how various non-UTF-8 users would do this, using the English hosts_access.5 manual page from the tcpwrappers package, which should be viewable from any locale.

1) The old -Tlatin1 hack mostly works. But it's still a hack.

2) The following command attempts to convert Groff UTF-8 output to the locale charset:

groff -Tutf8 -mandoc hosts_access.5 | iconv -f UTF-8

but this chokes on the first hyphen. Let's attempt to tell iconv to use the best approximation:

groff -Tutf8 -mandoc hosts_access.5 | iconv -f UTF-8 -t //TRANSLIT

This is better, but still not ideal. Details:

In ISO-8859-1 based locales, everything looks good, but the quotes differ from what gets printed with the -Tlatin1 switch.

In ISO-8859-{3,7,8,9,10,13,15,16} and KOI8-* based locales, the bullets become bullets! A huge improvement over the -Tlatin1 hack.

In ISO-8859-{2,4,5,6,11,14} and TIS-620 based locales, the bullets are replaced with question marks. They were not bullets with the -Tlatin1 hack either, but a question mark is simply not right.
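
One way to check a given charset is to feed iconv a lone bullet and look at the bytes that come out. The following is only a sketch: probe_bullet is a hypothetical helper, it assumes the bullet in question is U+2022 and that the local iconv accepts the //TRANSLIT suffix, and the results will vary with the iconv implementation and version. An output of 3f is a question mark.

```shell
#!/bin/sh
# probe_bullet CHARSET (hypothetical helper)
# Convert U+2022 (UTF-8 bytes e2 80 a2) to CHARSET with transliteration
# and print the resulting bytes in hex. Prints an empty line if iconv
# produces no output at all.
probe_bullet() {
    printf '\342\200\242' |
        iconv -f UTF-8 -t "$1//TRANSLIT" 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
    echo
}

# The charsets listed here are only examples.
for cs in ISO-8859-1 ISO-8859-2 ISO-8859-7 KOI8-R; do
    printf '%-12s %s\n' "$cs" "$(probe_bullet "$cs")"
done
```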

So my question is how to avoid this. The answer "use -Tascii for such manual pages" won't be accepted until Man stops using one Groff line for all manual pages: -Tascii damages Polish manuals. The answer "patch glibc so that iconv transliterates the bullet to 'o'" is better (and in fact this is doable), but I think that users of non-Glibc systems (or old Glibc) will complain if this becomes the official answer.
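
For illustration, the same substitution can be done in the pipeline itself, without touching iconv. This is only a sketch of a possible workaround, not an official recommendation: bullet_to_o is a hypothetical helper, and it assumes the bullet arrives as U+2022 in the UTF-8 output.

```shell
#!/bin/sh
# bullet_to_o (hypothetical helper)
# Replace every U+2022 bullet on stdin with an ASCII 'o'.
# The bullet is built from its UTF-8 bytes so that this script does not
# depend on its own file encoding; LC_ALL=C makes sed match raw bytes.
bullet_to_o() {
    bullet=$(printf '\342\200\242')
    LC_ALL=C sed "s/$bullet/o/g"
}

# Intended use, placed before the conversion to the locale charset:
#   groff -Tutf8 -mandoc hosts_access.5 | bullet_to_o | iconv -f UTF-8 -t //TRANSLIT
printf '\342\200\242 first item\n' | bullet_to_o
```

Only the bullet is handled here; any other character that the target charset lacks is still left to iconv.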

So: what is the official recommendation for formatting manual pages in non-ISO-8859-1, non-UTF-8 locales with the CVS version of Groff?

--
Alexander E. Patrakov



