
Re: [Groff] groff performance in respect to hardware platform


From: Ralph Corderoy
Subject: Re: [Groff] groff performance in respect to hardware platform
Date: Fri, 25 Mar 2016 11:04:55 +0000

Hi Steve,

> Yes, I need to look more closely at this. My pipeline consists of:
> - a python script reading xml files one at a time, parsing, and
>   doing a fairly simple substitution of xml tags to groff
>   requests (although more complicated than what one can do with
>   sed)
> - groff, which calls my own set of tmac files, but which is, of
>   course, a pipeline of its own.

XML sounds slow.  And Python's not the fastest kid on the block.  I'd
start by watching the machine's profile, e.g. dstat, whilst your
pipeline is running.  You can then run parts individually using
temporary files to confirm your suspects.  And GNU's `/bin/time -v', not
the shell built-in, can help in summarising a command's usage.
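
For instance, something along these lines, with the script, macro, and
file names as placeholders for your own, would time the two halves
separately through a temporary file:

    # Sketch only: xml2roff.py, yourmacros, and chapter.* are
    # placeholders.  GNU time may live at /usr/bin/time on your system.
    /bin/time -v python3 xml2roff.py chapter.xml >chapter.roff
    /bin/time -v groff -m yourmacros chapter.roff >chapter.ps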

> The output is a PostScript file.
...
> I start a separate instance of okular to view the PostScript file. I
> don't think that okular is particularly tuned to PDF to the point that
> a PS file causes it more work; it might be the reverse.

That may be the case for Okular, but PDF is an indexed file format
compared to the linear, more general, format of PostScript.  Thus,
plucking page N for display can take more work in the PS case.  And PS
is a programming language while PDF is, give or take, a data structure,
so processing the latter can be quicker.
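
If you want to compare, groff can produce PDF directly via the gropdf
driver, or you can convert the PostScript you already have; roughly,
with macro and file names again placeholders:

    # Produce PDF directly ...
    groff -Tpdf -m yourmacros chapter.roff >chapter.pdf
    # ... or convert the existing PostScript.
    ps2pdf chapter.ps chapter.pdf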

It probably won't be to your taste, but try mupdf(1) to see how snappily
a PDF can be rendered compared to Okular.  Unfortunately, it doesn't
watch the file, needing an `r' to reload.  A shame, as reloading on a
SIGUSR1 would do the job.

> But if I'm viewing, say, page 90, the PS file (being written apparently
> in chunks by grops) is noticed by okular as having its timestamp
> changed, so it reads whatever it can get, can't find page 90, so
> displays page 50. Strangely enough, it doesn't seem to notice that the
> file has had more pages added to it after this point, so I'm stuck
> looking at the wrong place in the output.

See what the modification-time granularity is of the filesystem where
this is taking place.

    while sleep 0.07; do >foo; ls --full-time foo; done | uniq -c

The proper solution is to produce a foo.ps.new and atomically mv(1) it
to foo.ps.
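
Something like this, adjusted to your pipeline, means the viewer only
ever sees a complete file, since rename(2) within one filesystem is
atomic:

    # Names are placeholders; write to a temporary, then rename.
    groff -m yourmacros chapter.roff >foo.ps.new &&
    mv foo.ps.new foo.ps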

> I think you're correct about the limitation caused by the serial
> nature of typesetting

And within that there's probably one stage that's the bottleneck, and
it would still govern the overall time.
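
groff's -V option prints the pipeline it would run rather than running
it, so you can split it up and time each stage through temporary files;
for example, macro and file names once more placeholders:

    # Show the pipeline groff would construct.
    groff -V -m yourmacros chapter.roff
    # Typical output resembles, details varying:
    #   troff -m yourmacros -Tps chapter.roff | grops
    # Run the stages by hand to see where the time goes.
    /bin/time -v troff -m yourmacros -Tps chapter.roff >out.troff
    /bin/time -v grops out.troff >chapter.ps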

troff(1), and groff, have a -o option to only output the given pages.
That might help if you know you're currently working in -o50-100 of a
much bigger document.
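
For instance, with illustrative page numbers:

    # The whole input is still typeset, but only pages 50-100 are
    # written to the output.
    groff -m yourmacros -o50-100 chapter.roff >part.ps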

Cheers, Ralph.


