[Groff] UTP project
From: Steve Izma
Subject: [Groff] UTP project
Date: Mon, 03 Jun 2002 12:26:30 -0400
I have a 30-day trial copy of OCRShop and have tried it on UTP
chapter 14, and I'm getting some odd results. I have been
scanning a book for work (prose, straight text) and OCRShop works
extremely well in creating plain text files (it will also produce
other formats that I haven't tested much). However, with the UTP
book, chapter 14, certain regular paragraphs of text get broken
up into very short lines (3-6 characters each). I've used the pbm
files from O'Reilly and scanned the same pages myself (I have a
copy of the original Hayden book) with the same results. I will
pursue this on chapter 14, but unless someone recognizes this
problem, I'll need to spend some time searching through the
OCRShop docs for a clue. A couple of perusals haven't turned up
much.
One way or another, though, I'll do chapter 14.
On the topic of a macro package for the book, does anyone think
it worthwhile to tag the text as XML? I have various Python
scripts for going from XML (or strict SGML) to groff requests, and
if we're going to consider reworking a macro package, we would
just need to think in terms of XML structure when designing the
macros (i.e., we would mostly need to write starting and ending macros
for elements).
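
To make the idea concrete, here is a minimal sketch of that kind of
conversion. It is not one of the scripts mentioned above. The element
names (chapter, title, para) and the macro names (CHS/CHE, TIS/TIE,
PAS/PAE) are hypothetical placeholders, and the sketch assumes every
element is block-level and has an entry in the table.

```python
import xml.etree.ElementTree as ET

# Hypothetical element-to-macro table: tag -> (start macro, end macro).
# A real macro package would define these names itself.
MACROS = {
    "chapter": ("CHS", "CHE"),
    "title": ("TIS", "TIE"),
    "para": ("PAS", "PAE"),
}


def escape_text(text):
    """Yield groff-safe input lines for a run of character data."""
    for line in text.split("\n"):
        line = " ".join(line.split())  # normalize whitespace
        if not line:
            continue
        # A literal backslash must not start an escape sequence.
        line = line.replace("\\", "\\e")
        # A leading control character would be read as a request.
        if line[0] in ".'":
            line = "\\&" + line
        yield line


def emit(elem, out):
    """Append the start macro, content, and end macro for one element."""
    start, end = MACROS[elem.tag]  # unknown tags raise KeyError
    out.append("." + start)
    if elem.text:
        out.extend(escape_text(elem.text))
    for child in elem:
        emit(child, out)
        if child.tail:
            out.extend(escape_text(child.tail))
    out.append("." + end)


def xml_to_groff(source):
    """Convert an XML string to groff input using the macro table."""
    out = []
    emit(ET.fromstring(source), out)
    return "\n".join(out) + "\n"
```

With this approach the macro package only has to supply one start and
one end macro per element; everything else is mechanical. Inline
elements (font changes and the like) would need escapes rather than
macro calls, which this sketch does not attempt.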
-- Steve
--
Steve Izma, (519) 884-0710 ext. 6125
Wilfrid Laurier University Press FAX: (519) 725-1399
Waterloo, Ont., Canada N2L 3C5 address@hidden