
Re: [Wishlist] Don't discard groff comments on HTML output


From: T. Kurt Bond
Subject: Re: [Wishlist] Don't discard groff comments on HTML output
Date: Thu, 27 Jan 2022 10:18:18 -0500

And troff comments appearing in HTML output as HTML comments is something I
explicitly DON’T want happening.  My comments are NOT intended to be part
of the finished document in any form.

On Mon, Jan 24, 2022 at 21:09 G. Branden Robinson <g.branden.robinson@gmail.com> wrote:

> Hi Alex,
>
> At 2022-01-24T22:48:29+0100, Alejandro Colomar wrote:
> > Hi Branden,
> >
> > I'd like to see groff comments preserved in the HTML output (as HTML
> > comments).
> >
> > So, for `groff -T html ...`,
> >
> > .\" hello world
> >
> > would be transformed to
> >
> > <!-- hello world -->
> >
> > Sounds good?
>
> That's a bigger challenge than the other items you've raised so far
> (well, the grohtml relative inset thing, I can imagine being a real PITA
> to hammer out, but _conceptually_ it's easy).
>
> The problem is that troff(1) disposes of comments entirely very early in
> parsing.  Importantly, they're stripped out of macro definitions before
> the definition is even stored.
>
> It's possible these issues could be overcome by converting comments into
> a device control command escape sequence (\X''), but there are quoting
> issues to consider (although _maybe_ my recent change to how characters
> in such escape sequences get mapped when being written to the
> device-independent output addresses that, or makes doing so easier[1]),
> and possibly other matters I haven't thought of.
>
> So this one is a heavier lift, I think.
>
> Regards,
> Branden
>
> [1] commit 9d61b3d142842589b90d7eda0ed3270fbbf6166f
> Author: G. Branden Robinson <g.branden.robinson@gmail.com>
> Date:   Fri Oct 1 19:20:25 2021 +1000
>
>     [troff]: Enable ASCII in device control escapes.
>
>     [troff]: Convert special character glyphs corresponding to Unicode
>     Basic Latin ("ASCII") code points to those code points when they
>     occur in device escapes.  (They should be correct for IBM code page
>     1047 as well, but this is untested.)  This is necessary for encoding
>     URLs in device control commands.  Special character identifiers are
>     presumed to be the defaults documented in groff_char(7); this is a
>     design gap that we should consider addressing.  (We don't have a way
>     to ask "is this the special character corresponding to Unicode basic
>     Latin code point X?")
>
>     * src/roff/troff/input.cpp (encode_char): Do it.
>
>     I'm not documenting this in NEWS as it feels like a pretty dusty corner
>     even though I'm about to leverage it for something of much higher
>     visibility.
>
> Also see:
> 65737d48ad7e75353a67e4f408bb68bc5d5b0773
> 3d1988cabc90f3c4b0b0000bb4a809be61eeba3c
> eb695ab2b5e2bae54afa102355c493bda6e29d3e
>
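
For anyone who does want this, the idea quoted above (turning comments into
\X'' device control escapes) could be tried outside troff, as an opt-in
preprocessing step, so nobody's comments leak into output by default.  The
following is only a rough sketch under several assumptions: the function
names are hypothetical, the "html:" device control prefix is assumed to be
passed through by grohtml the way the www macros use it, and the list of
alternative delimiters is a guess.  It sidesteps the quoting issues by
refusing to convert any comment containing a backslash or "--", and it
does not address comments inside macro definitions or any effect the
inserted text lines may have on filling.

```python
import re

# Full-line roff comment: a control character (. or '), optional
# whitespace, then \" or \# and the comment text.
COMMENT_RE = re.compile(r"""^[.']\s*\\["#]\s?(.*)$""")

# Candidate delimiters for the \X escape, tried in order (an assumption;
# the first one that does not occur in the comment text is used).
DELIMITERS = "'|^@"


def comment_to_device_control(line):
    """Return an \\X'html:...' line for a full-line comment, else None."""
    match = COMMENT_RE.match(line)
    if match is None:
        return None
    text = match.group(1).rstrip()
    # A backslash would be interpreted by troff, and "--" is not allowed
    # inside an HTML comment, so such comments are left alone.
    if "\\" in text or "--" in text:
        return None
    for delim in DELIMITERS:
        if delim not in text:
            return "\\X" + delim + "html:<!-- " + text + " -->" + delim
    return None


def preprocess(source):
    """Rewrite convertible comment lines; leave every other line as is."""
    out = []
    for line in source.split("\n"):
        converted = comment_to_device_control(line)
        out.append(line if converted is None else converted)
    return "\n".join(out)
```

With this sketch, the line `.\" hello world` becomes
`\X'html:<!-- hello world -->'`, while requests such as `.SH NAME` and
comments it cannot quote safely pass through unchanged.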
-- 
T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io

