Re: [Groff] Need help with pdfmark
From: Keith Marshall
Subject: Re: [Groff] Need help with pdfmark
Date: Wed, 13 Oct 2010 13:43:08 +0100
User-agent: KMail/1.9.10
Hi Larry,
On Tuesday 12 October 2010 19:12:43 Larry Kollar wrote:
> I've been successfully using pdfmark to generate PDFs with bookmarks
> for some time now, no problem. Now I'm trying to make my
> cross-references into actual PDF links, and that's where I'm running
> into trouble. Part of the problem, I suppose, is the part of the
> pdfmark documentation I need has not been written. :-)
Yeah, it's a rather unsatisfactory situation. Unfortunately, with my
present day-job work load, it's likely that this will become a project
for my retirement -- at least another year away, and maybe as many as
six more years. :-(
> The other part
> is that I'm trying to graft this feature into an existing process, so
> I can't just rip it out and start over with pdfroff.
Okay, so it's in understanding the mechanics of pdfroff's processing
that you need help, so that you can reproduce the effect with your
existing work flow?
> My current cross-reference generation consists of a macro XRT to
> define a target based on the text and page of a heading immediately
> preceding:
>
> .de XRT
> .if \\n[TocGen] \{\
> . tm XREF: xref:\\$1:txt \\*[xref:HDtxt]
> . tm XREF: xref:\\$1:pg \\*[xref:HDpg]
> .\}
> .ie '\\*[.T]'html' .TAG \\$1
> .el .pdfhref M -N \\$1
> ..
>
> The strings "xref:HDtxt" and "xref:HDpg" are defined by the heading
> macro. The argument to XRT was essentially a named destination tag to
> begin with.
This will place a PDF marker at the destination point for any number
of pdfhref links; it doesn't deal with the creation of any such link,
which would require use of `.pdfhref L ...', rather than the use of
`.pdfhref M ...', as we see here.
> Here's my big problem: the aux-file is full of entries like:
>
> grohtml-info:page 170 353411 517000 355611 530398 540000 1 1
> ./somefile.ms
This is the standard format of the stderr output generated by groff's
`\O2' escape.
> And I don't see anywhere in pdfmark.tmac where it outputs this
> particular string. I'm guessing this is a hotspot definition, but the
> string "grohtml-info" doesn't appear anywhere in pdfmark.tmac.
It comes from pdfmark.tmac's pdf*href.mark.end macro, (intended for
internal use only), which is invoked twice, (first time indirectly by
pdf*href.mark.begin), for each invocation of `.pdfhref L ...', (or of
`.pdfhref W ...'), to capture the output page co-ordinates for the
beginning and end of each link hot-spot region.
> I've looked at the pdfroff script but it's not helping much.
The relevant fragment (folded for e-mail display) looks like:
  # We now extend the local copy of the reference dictionary file,
  # to create a full 'pdfmark' reference map for the document ...
  #
    $AWK '/^grohtml-info/ {print ".pdfhref Z", $2, $3, $4}' \
      $WRKFILE >> $REFCOPY
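To see what that filter emits, you can feed it a single record by hand.
The following is only a sketch: it uses the grohtml-info line you quoted
above (unfolded onto one line) as sample input, and plain `awk' in place
of $AWK; the variable names `record' and `zrecord' are mine, not
pdfroff's.

```shell
# Sample input: the grohtml-info record quoted earlier in this thread,
# joined back onto a single line.
record='grohtml-info:page 170 353411 517000 355611 530398 540000 1 1 ./somefile.ms'

# Same awk program as in the pdfroff fragment: keep fields 2, 3 and 4,
# and prefix them with the `.pdfhref Z' request.
zrecord=$(printf '%s\n' "$record" |
  awk '/^grohtml-info/ {print ".pdfhref Z", $2, $3, $4}')

printf '%s\n' "$zrecord"
```

Only the second, third and fourth whitespace-separated fields survive;
everything after them in the record is discarded.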
Also relevant, and performed earlier, (during the initial multi-pass
processing phase, which is required to identify the placement of any
reference marks set by `.pdfhref M ...'), may be:
  # Run 'groff' and 'awk', to identify reference marks in the document
  # source, filtering them into the reference dictionary; discard
  # incomplete 'groff' output at this stage.
  #
    eval $STREAM $GROFF_STYLE -Z 1>$NULLDEV 2>$WRKFILE \
      $REFCOPY $INPUT_FILES
    $AWK '/^gropdf-info:href/ {$1 = ".pdfhref D -N"; print}' \
      $WRKFILE > $REFFILE
where $STREAM represents a hack to reproduce any stdin input piped to
pdfroff itself as stdin input to groff, in each and every processing
pass, while $GROFF_STYLE represents `groff -Tps' followed by any of
groff's own options which are specified on the pdfroff command line.
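The awk filter in that second fragment can be tried in isolation. Note
that the input record below is made up for illustration: it does not
claim to reproduce the exact field layout which pdfmark.tmac writes
after `gropdf-info:href', so treat the name and the numbers as
placeholders. The point is only that awk replaces the first field, and
passes the remainder of the record through.

```shell
# HYPOTHETICAL input: the fields after the tag are placeholders, not the
# real layout emitted by pdfmark.tmac.
sample='gropdf-info:href intro 3 72000 648000'

# Same awk program as in the pdfroff fragment: overwrite field 1 with the
# `.pdfhref D -N' request, then print the rebuilt record.
drecord=$(printf '%s\n' "$sample" |
  awk '/^gropdf-info:href/ {$1 = ".pdfhref D -N"; print}')

printf '%s\n' "$drecord"
```

Because the program assigns to $1, awk rebuilds the record with single
spaces between fields before printing it.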
> What I need is a more generalized setup and some output I understand,
> then I could perhaps pipe it back in and go from there.
$GROFF_SOURCES/contrib/pdfmark/pdfmark.ms, (the source for the existing
incomplete documentation), is an example of document source suitable
for processing by pdfroff with ms macros, (wrapped by spdf.tmac). The
salient aspects of the processing mechanics are:
  1) Multiple pre-processing passes are required, using the second
     command sequence indicated above, to locate any PDF reference
     marks; $WRKFILE captures groff's stderr output in each pass.

  2) At the outset, $REFCOPY represents an empty file.

  3) At the end of each pre-processing pass, PDF reference data is
     filtered out of $WRKFILE, and transformed into `.pdfhref D ...'
     requests in $REFFILE.

  4) Each time $REFFILE is regenerated, its content is compared with
     that of $REFCOPY, as it is at the start of the current cycle;
     if the two are identical, the pre-processing cycle terminates,
     otherwise...

  5) $REFCOPY is replaced by the content of $REFFILE, and the cycle
     is repeated, to regenerate $REFFILE once again.

  6) After the pre-processing cycle terminates, $REFFILE and $REFCOPY
     should represent identical files; (if not, then references have
     not been satisfactorily resolved to stable locations, within the
     maximum cycle count limit imposed by pdfroff). At this point,
     $REFCOPY is augmented by filtering the link hot-spot reference
     data from the last generated $WRKFILE, transforming it using the
     first command noted above, and appending the resultant mapping
     data as `.pdfhref Z ...' records, (two per hot-spot), to the
     final content of $REFCOPY.

  7) The document sources, with this final generation of $REFCOPY
     included as the first input file, are processed through groff
     to produce PostScript intermediate output, which is filtered
     through GhostScript, to create the final PDF output.
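Steps 1 to 6 above can be sketched as a small shell loop. This is a
rough outline under stated assumptions, not pdfroff's actual code: the
`run_pass' function is a hypothetical stand-in which writes canned
stderr-style records, (so that the loop can be run without groff), the
`gropdf-info:href' record in it uses made-up fields, and MAX_PASSES is
an arbitrary limit of my own choosing. In a real work flow, `run_pass'
would run groff with "$REFCOPY" as the first input file, and with
stderr redirected to "$WRKFILE".

```shell
#!/bin/sh
# Temporary stand-ins for pdfroff's $WRKFILE, $REFFILE and $REFCOPY.
WRKFILE=$(mktemp) REFFILE=$(mktemp) REFCOPY=$(mktemp)

# HYPOTHETICAL stand-in for one groff pass; a real one might look like:
#   groff -Tps -Z "$REFCOPY" doc.ms >/dev/null 2>"$WRKFILE"
# The gropdf-info:href fields below are placeholders.
run_pass() {
  cat > "$WRKFILE" <<'EOF'
gropdf-info:href intro 3 72000 648000
grohtml-info:page 170 353411 517000 355611 530398 540000 1 1 ./somefile.ms
EOF
}

MAX_PASSES=4                  # assumed limit; pdfroff imposes its own
pass=0
: > "$REFCOPY"                # step 2: start with an empty dictionary

while [ "$pass" -lt "$MAX_PASSES" ]; do
  pass=$((pass + 1))
  run_pass                    # step 1: stderr is captured in $WRKFILE
  # step 3: turn reference data into `.pdfhref D ...' requests
  awk '/^gropdf-info:href/ {$1 = ".pdfhref D -N"; print}' \
    "$WRKFILE" > "$REFFILE"
  # step 4: stop as soon as the dictionary no longer changes
  if cmp -s "$REFFILE" "$REFCOPY"; then
    break
  fi
  cp "$REFFILE" "$REFCOPY"    # step 5: go round again
done

# step 6: append the hot-spot map from the last generated $WRKFILE
awk '/^grohtml-info/ {print ".pdfhref Z", $2, $3, $4}' \
  "$WRKFILE" >> "$REFCOPY"

cat "$REFCOPY"
```

With the canned records, the second pass reproduces the dictionary from
the first, so the loop exits there; $REFCOPY then holds the material
which step 7 would place ahead of the document sources.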
Hopefully, the above will give you enough to get you going; just one
word of warning: don't add `.pdfhref Z ...' records (manually) to your
document sources -- the presence of just one such record will disable
the use of `\O2' in `.pdfhref L ...' and `.pdfhref W ...' requests,
making it virtually impossible to generate a hot-spot map.
--
Regards,
Keith.