
Re: [Groff] Eric Raymond on groff and TeX


From: Steve Izma
Subject: Re: [Groff] Eric Raymond on groff and TeX
Date: Tue, 8 May 2012 00:06:07 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Mon, May 07, 2012 at 03:06:17PM -0400, James K. Lowden wrote:
> Subject: Re: [Groff] Eric Raymond on groff and TeX
> 
> > ... An in-line element (emphasis, small
> > caps, superior numbers) not only needs surrounding white space
> > (or lack of it) detected and preserved, it also breaks up the
> > enclosing block, leaving a tail (depending on the kind of parser
> > you're using). 
> 
> What is "tail" here, please?  I thought I understood until then.  

Sorry about this; it's actually hard to explain without a few
examples, which is probably too much for this space. To give a
simple answer, a tail is the part of an enclosing block (a
paragraph, for example) that follows an interruption such as a
few words in emphasis. It's relatively easy to process contiguous
data within an element, but you need to do more work to connect
things together when they are broken up by other elements.
Typographically speaking, it's very important to properly detect
and handle the whitespace around the interruption.
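
A minimal sketch of what a tail looks like in practice, assuming
Python's standard xml.etree.ElementTree parser (the message does
not say which parser is used; the sample paragraph and the helper
name pieces() are made up for illustration). ElementTree stores
the character data that follows an element's end tag in that
element's .tail attribute, so the paragraph's own text ends up
scattered across its inline children:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample paragraph with two inline interruptions.
para = ET.fromstring("<p>It was <i>very</i> late, and <i>dark</i>.</p>")


def pieces(elem):
    """Yield (kind, tag, string) for each run of character data, in document order.

    "text" is the data right after an element's start tag; "tail" is the
    data right after its end tag, which logically belongs to the parent.
    """
    if elem.text:
        yield ("text", elem.tag, elem.text)
    for child in elem:
        yield from pieces(child)
        if child.tail:
            yield ("tail", child.tag, child.tail)


runs = list(pieces(para))
for kind, tag, string in runs:
    print(f"{kind:4} of <{tag}>: {string!r}")
```

Note that the spaces on either side of each <i> element live in
different runs (the parent's text and the child's tail), which is
why whitespace handling around the interruption takes extra care.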

I've made a lot of notes about this, and I promise that soon I
will try to document this and other issues that make XML to groff
processing tricky.
 
> > So far I have always needed to detect and define
> > separately whatever in-line elements a document uses, which
> > means that writing a general-purpose formatter for XML seems
> > virtually impossible.
> 
> Why is it not possible to divide and conquer?  If we know the set of
> tags and every tag is either block or inline (never both), why can't a
> dictionary of tag properties permit uniform handling of all in-line
> elements?  

That's exactly what you need to do, but it's not general-purpose
because each project that has a different DTD would require
rewriting the dictionary to include the inline tags for that DTD.
As far as I know, there's no conventional way of flagging inline
tags in DTDs or schemas. E.g., typical ways of tagging emphasis:
<i>, <e1>, <italic>, <emphasis>, <emphasis type="bold">.
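
The dictionary approach might be sketched as follows. This is only
an illustration under stated assumptions: it uses Python's
xml.etree.ElementTree, the table INLINE_FONTS and the helpers
font_for() and to_groff() are hypothetical names, and the mapping
covers just the example tags listed above. A different DTD would
need a different table, which is the point being made. The groff
escapes used are \fI (italic), \fB (bold) and \fP (previous font).

```python
import xml.etree.ElementTree as ET

# Hypothetical per-DTD table: inline tag name -> groff font name.
# Every project with a different DTD would need its own version.
INLINE_FONTS = {
    "i": "I",
    "e1": "I",
    "italic": "I",
    "emphasis": "I",
}


def font_for(elem):
    """Return the groff font for an inline element, or None if it is not listed."""
    # The attribute can change the meaning of the tag, so a bare
    # tag-name lookup is not always enough.
    if elem.tag == "emphasis" and elem.get("type") == "bold":
        return "B"
    return INLINE_FONTS.get(elem.tag)


def to_groff(elem):
    """Flatten one block element into a groff text line.

    Simplifications: backslashes in the data are not escaped, and
    nested inline fonts are not handled (\\fP only returns to the
    single previous font).
    """
    out = [elem.text or ""]
    for child in elem:
        font = font_for(child)
        inner = to_groff(child)
        out.append(f"\\f{font}{inner}\\fP" if font else inner)
        out.append(child.tail or "")  # reattach the tail to the parent's text
    return "".join(out)


sample = ET.fromstring(
    '<p>A <emphasis type="bold">big</emphasis> <e1>deal</e1>.</p>'
)
print(to_groff(sample))
```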

By the way, using groff as part of a pipeline that includes a
Python script for XML parsing is blindingly fast. On my three- or
four-year-old computers I can process a 200-page book in a few
seconds. This means, using vi, for example, that you can make a
correction in a file, hit a memory key that runs the groff
pipeline, and have a PostScript viewer (I use okular these days)
watch the file for almost instantaneous updates.

        -- Steve

-- 
Steve Izma
-
Home: 35 Locust St., Kitchener N2H 1W6    p:519-745-1313
Work: Wilfrid Laurier University Press    p:519-884-0710 ext. 6125
E-mail: address@hidden or address@hidden

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?
<http://en.wikipedia.org/wiki/Posting_style>


