[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Groff] Formatting algorithm, an experiment
From: |
Ulrich Lauther |
Subject: |
Re: [Groff] Formatting algorithm, an experiment |
Date: |
Wed, 28 May 2014 13:49:58 +0200 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
On Mon, Mar 31, 2014 at 07:44:07PM -0400, Peter Schaffter wrote:
> Here's the bare bones version of the algorithm I was thinking of
> when I proposed improving line formatting by getting groff to
> shoulder the burden for some of the work we do manually. It's
> written out in brute-force pseudo-code; should be pretty clear.
>
To get a feeling for formating alternatives I implemented
a simplified version of the Knuth-Plass algorithm.
The idea is, to pack charakters as dense as possible
and to minimize the maximum gap at the end of any line.
In a second pass intra- and interword spacing could then be adjusted
along the lines of the heuristic proposed by Peter Schaffter.
To make tests realistic I implemented a primitive hyphenation algorithm
that sometimes works :-)
Character widths were taken from /usr/share/groff/1.21/font/devps/HR
For speed up, I first use a greedy algorithm and employ its result
as an upper bound for any gap in the dynamic programming solution.
However, the speed-up achieved that way seems to be minimal.
I tested with an English paragraph containing 775 words (before
hyphenation): "In olden times, when wishing still did some good,
there lived a king whose daughters were all beautiful, ...".
This is a stand-alone program, just for a feasibility test.
It reads a text (which is considered one paragraph), and runs the
formatter.
Results:
--------
Lines of code 370 plus 184 for my linked list class.
The core algorithm takes just 95 lines of code.
Break-objects generated: 823 or 0.9 per hyphenated syllable.
After formatting, these objects go into a list of garbage and can
be reused for the next paragraph.
Running time under Ubuntu on Intel(R) Atom(TM) CPU N455 @ 1.66GHz:
Less than one millisecond when compiled with g++ -O4.
I used a line length of 100 times the width of the space-character.
The maximum gap at the end of any lines was:
greedy: 10.6 spaces
KP: 7.1 spaces
I think, the experiment shows that running time is no issue at all.
Kind regards,
ulrich
- Re: [Groff] Formatting algorithm, an experiment,
Ulrich Lauther <=