
Re: [Groff] Status of the portability work, and plans for the future


From: Werner LEMBERG
Subject: Re: [Groff] Status of the portability work, and plans for the future
Date: Mon, 08 Jan 2007 15:40:19 +0100 (CET)

> > 1. If the input encoding has been explicitly specified, use it.
> >
> > 2. Otherwise, check whether the input starts with a Byte Order
> >    Mark.  If found, use it.
> >
> > 3. Finally, check whether there is a known coding tag in either
> >    the first or second input line.  If found, use it.
> >
> > 4. If everything fails, use a default encoding as given by the
> >    current locale, or `latin1' if the locale is set to `C',
> >    `POSIX', or empty.
>
> I'm willing to try to implement this protocol for doclifter, but it
> doesn't settle what the portability rule ought to be, which is our
> concern right at the moment.  What encoding(s) are we willing to
> count on third-party viewers to support?

Today, the input encoding of choice is UTF-8, of course.  Besides
this, the preconv program supports latin1 (because this is the
`native' encoding for groff, more or less).
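
For reference, the four-step lookup quoted above could be sketched
roughly as follows.  This is only an illustration in Python, not what
preconv does internally: the function name `guess_encoding', its
parameters, and the tag regular expression are made up for this
example, the coding tag is returned as written (without mapping Emacs
names to MIME names), and falling back to latin1 for a locale without
a codeset part is an assumption.

```python
import codecs
import re

# Longest BOMs first, so that UTF-32 LE (FF FE 00 00) is not
# mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

# Emacs-style coding tag, e.g.:  .\" -*- coding: utf-8 -*-
# (hypothetical pattern, simplified for this sketch)
_CODING_TAG = re.compile(rb"-\*-.*?coding:\s*([A-Za-z0-9._-]+).*?-\*-")


def guess_encoding(data, explicit=None, locale_name=""):
    """Return an encoding name for the bytes in `data'."""
    # 1. An explicitly specified encoding always wins.
    if explicit:
        return explicit
    # 2. Byte Order Mark at the very start of the input.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 3. Coding tag in the first or second input line.
    for line in data.split(b"\n", 2)[:2]:
        match = _CODING_TAG.search(line)
        if match:
            return match.group(1).decode("ascii")
    # 4. Locale default; latin1 for `C', `POSIX', or an empty locale.
    if locale_name in ("", "C", "POSIX"):
        return "latin1"
    _, dot, codeset = locale_name.partition(".")
    codeset = codeset.partition("@")[0]  # drop a modifier like @euro
    # Assumption: no codeset part in the locale name means latin1.
    return codeset if dot and codeset else "latin1"
```

A caller would pass the raw file contents plus, say, the value of
LC_ALL or LANG as `locale_name'.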

Have a look at the `emacs_to_mime' structure in
src/preproc/preconv/preconv.cpp: The comment explains which input
encodings are worth supporting with tags today.

Note that everything is piped through iconv to produce plain ASCII
plus `uXXXX' glyph entities.
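
To illustrate the target form only: the sketch below rewrites
already-decoded text as ASCII plus glyph entities, written here in
groff's `\[uXXXX]' escape syntax.  It is Python and does not go through
iconv; the function name `to_groff_ascii' is invented for this example.

```python
def to_groff_ascii(text):
    """Replace every non-ASCII character by a \\[uXXXX] glyph entity."""
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x80:
            # Plain ASCII passes through unchanged.
            pieces.append(ch)
        else:
            # Uppercase hex, padded to at least four digits.
            pieces.append("\\[u%04X]" % code_point)
    return "".join(pieces)
```

For example, `to_groff_ascii("Groß")' yields `Gro\[u00DF]'.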

> > Instead of using the groff's `uXXXX' glyphs, doclifter would
> > directly map to HTML entities.
>
> There may be a misunderstanding here -- doclifter never generates
> HTML entities.  Instead it generates ISO XML entities.  These sets
> do overlap, but they are neither formally nor actually
> identical. The HTML set is much smaller.

My mistake.  Anyway, I think XML also knows Unicode character
entities, right?  This is what I meant.
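
What I have in mind are numeric character references.  As a small
sketch (Python, using the standard `xmlcharrefreplace' error handler;
the function name `to_xml_ascii' is invented here, and this is not how
doclifter works):

```python
def to_xml_ascii(text):
    """Replace non-ASCII characters by decimal XML character references."""
    # The `xmlcharrefreplace' handler turns each unencodable character
    # into &#NNN; with NNN being the decimal code point.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

With this, o-with-ogonek (U+01EB) becomes `&#491;'.  Markup characters
like `&' and `<' are left alone and would need separate escaping.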

> In fact, *all* defined groff-1.19 glyphs except the old Bell Labs
> bracket-pile graphics get mapped to ISO entities -- even the exotica
> like yogh and o-with-ogonek.

o-with-ogonek isn't an exotic letter at all!  All Poles will object to
your assertion :-)

> > Well, I won't change groffer.man -- this is his contribution.
>
> Uh oh.  You just invoked my hacker-anthropologist mode...I've seen
> this kind of talk before and the results tend to be *bad*.  [...]

Whatever decision we reach, I won't force anything right now.  Maybe
later.  So I'm not evading a decision, just postponing it.

> It's possible that "no change" is the right answer, but because it's
> "his contribution" is not a sufficient reason.  As the project lead,
> you have the responsibility to make a decision on factual and
> technical grounds.  [...]

Hmm.  To exaggerate, the only `technical ground' currently is that
doclifter can't handle it.  Up to now nobody has ever reported
problems with groffer.1 -- while I understand your arguments, I don't
see an urgent need to react immediately.

> > It seems that grohtml does a quite decent job for this man page:
> > What about putting it into an exception list (even if it is the
> > only member) so that it is converted with `groff -Thtml' instead
> > of doclifter?
>
> Werner, in situations like this, exception lists frighten the shit
> out of me.

:-)  Nice phrase.

> The problem is that once it is known that you have one, people
> invent all sorts of clever, plausible reasons they should be on it
> rather than doing the bit of extra work needed for a clean solution.
> [... omitting shameless exaggerations ...]

According to your analysis, groffer.1 is basically the only candidate
which is not going to be fixed easily -- for whatever reason.  Not
bad to have just one single exception out of 10000...

It's not necessary to tell anyone that an exception list exists :-)

> And even for pages that can't be strictly viewer-portable, simplifying
> them to the point where doclifter can lift them will have benefits.

Uh, oh, I'm not comfortable with `simplifying until doclifter can
handle it'.  It's still us who are setting up the rules, not a
program.  How many MS Word users do the craziest things just to make
this wacky program handle their documents...  Let's not argue like
that.

> It's interesting that you picked groff_char.man as an example,
> because I can tell you this: there is no reason in the universe we
> should be unable to generate good XML-DocBook from that page.

Indeed, there's nothing special in it except a large bunch of glyphs
which can't be displayed on all output devices.  However, to access
them properly, I need groff extensions not available in AT&T troff.

> > Ideally, they should use groff for formatting (opening a TTY
> > window showing `man' output would be sufficient IMHO) if the
> > number of problems exceeds a certain threshold.
>
> And that's an excellent idea for a general fallback.

groffer.1 comes to mind :-)


    Werner



