Re: [Groff] mom : unicode in .INCLUDE'd files


From: Steffen Nurpmeso
Subject: Re: [Groff] mom : unicode in .INCLUDE'd files
Date: Sat, 22 Jul 2017 18:02:41 +0200
User-agent: s-nail v14.9.0

Hey, Ingo,

Ingo Schwarze <address@hidden> wrote:
 |Steffen Nurpmeso wrote on Fri, Jul 21, 2017 at 10:30:36PM +0200:
 |
 |> In my humble opinion preconv has to go as such,
 |> i just do not know yet.  Just talking.
 |
 |So much talk...

SIIGHH!!!

 |In mandoc, i completed that work in October 2014:
 |
 |  http://mandoc.bsd.lv/cgi-bin/cvsweb/preconv.c#rev1.9
 |
 |  "commit message:
 |   integrate preconv(1) into mandoc(1); enhances functionality
 |   and reduces code and docs by more than 300 lines"

I think that is the right thing, because

 |Admittedly, mandoc only handles UTF-8 and ISO-LATIN-1, and
 |requires the user to convert files using obsolete encodings
 |to UTF-8 using iconv(1) first.

there is iconv(1), and either preconv catches it all or it is not
needed.  Probably there should be an .iconv request, but that is
probably too late ... anyway, looking at emacs tags cannot be the
solution for roff, in my opinion.

 |The only reason for supporting ISO-LATIN-1 is that many old manual
 |pages in the wild still use it, most even without saying so.
 |Otherwise, mandoc would be UTF-8 only on the input side.

I do not know.. i mean, the really good guys sat down and thought
about it at a dinner and were ready to go, but this was about 25
years ago and Unix is still not on par, so i do not think this can
be a solution for roff, which needs to deal with documents other
than manuals.  I have no idea about non-English (and a few de)
manual pages (but those were often not up-to-date, so i tried to
get rid of them asap anytime i remember; note this really is an
*unfortunately*, because the very high-quality translations of
ISO C that i have from Prof. Dr. A.-T Schreiner and
Dr. Ernst Janich, typeset in roff, made me forget i ever needed
anything else!), i seem to remember other character sets.  But
since it can be automated i guess it is ok for mandoc to
require that systems which switch perform an iconv run.
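
To make the "iconv run" concrete, here is a minimal sketch of such a
conversion step in Python.  It is not mandoc's or preconv's actual code;
the helper name to_utf8 and the ISO-8859-1 fallback are assumptions chosen
to mirror the "UTF-8 or ISO-LATIN-1" policy described above.

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return `raw` as UTF-8 bytes (hypothetical helper, illustration only).

    Input that already decodes as UTF-8 is passed through unchanged;
    anything else is assumed to be in the fallback encoding.
    """
    try:
        raw.decode("utf-8")  # validity check only, result is discarded
        return raw
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail.
        return raw.decode(fallback).encode("utf-8")


if __name__ == "__main__":
    legacy = b".TH CAF\xc9 1\n"  # "CAFÉ" encoded as ISO-8859-1
    print(to_utf8(legacy))
```

The guess is only safe because Latin-1 text containing non-ASCII bytes
rarely happens to be valid UTF-8; for other legacy encodings the author
would have to name the source encoding explicitly, as with iconv -f.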

 |In the long run, i think that would be a reasonable direction
 |for groff, too.  Preconv is notorious for causing trouble for
 |casual users, see the several threads that were quoted earlier
 |in this thread, and even very experienced users often get
 |confused about how it works and where in the pipeline it is
 |supposed to be, see this thread itself.
 |
 |"Groff input always has to be UTF-8", that would be a very simple,
 |fool-proof principle.

I do not think this will be the direction to go.  I have no idea,
i really like to get Unicode on the input side, but a conversion
in front of that may be fine, as long as it is picked up
automatically in a way that the author of the document specifies.
And the emacs tags are in the world, and thus need to stay for
quite a while, too. :(
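
For illustration, picking up an encoding that the author declares via an
emacs-style coding tag could look roughly like the sketch below.  The
function name, the regular expression, and the restriction to the first
two lines are assumptions for this example, not a description of how
preconv is implemented.

```python
import re
from typing import Optional

# Matches e.g.  .\" -*- mode: nroff; coding: latin-1 -*-
CODING_TAG = re.compile(rb"-\*-.*?\bcoding:\s*([A-Za-z0-9._-]+).*?-\*-")


def declared_encoding(raw: bytes) -> Optional[str]:
    """Return the encoding named in a coding tag, or None (hypothetical helper)."""
    # Assumption: only the first two lines are examined.
    for line in raw.splitlines()[:2]:
        match = CODING_TAG.search(line)
        if match:
            return match.group(1).decode("ascii")
    return None


if __name__ == "__main__":
    doc = b'.\\" -*- mode: nroff; coding: latin-1 -*-\n.TH DEMO 1\n'
    print(declared_encoding(doc))
```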

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


