bug-ocrad

Re: [Bug-ocrad] ocrad usage


From: Tilman Hausherr
Subject: Re: [Bug-ocrad] ocrad usage
Date: Thu, 4 Feb 2016 18:23:54 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

[2nd try, I accidentally sent this to Chris only]

I tested it a few years ago (the application I built still runs fine) with 200 dpi and 300 dpi b/w files (not gray, not color). The results were OK: not great, but not as bad as you describe. I use it for an application where quality is not important but speed is. I don't want to say what I do with OCRAD, but I can think of other use cases, e.g. detecting the language of a document.

I'd recommend that you scan directly into b/w files. Never compress such files into JPEG; this will produce artefacts.
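
If the scanner software can only deliver grayscale, the image has to be reduced to b/w before OCR. The following is a minimal sketch of that step, not part of OCRAD: it packs 0-255 gray values into a binary PBM (P4) file. The function name gray_to_pbm and the default threshold of 128 are assumptions chosen for illustration, and a fixed cut-off may need tuning for a given scan.

```python
def gray_to_pbm(pixels, threshold=128):
    """Pack rows of 0-255 grayscale values into binary PBM (P4) bytes.

    In PBM a set bit means black, so values below `threshold` become 1.
    `threshold` is an illustrative default, not a value taken from OCRAD.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = bytearray(f"P4\n{width} {height}\n".encode("ascii"))
    for row in pixels:
        byte = 0
        nbits = 0
        for value in row:
            # Shift in one bit per pixel, most significant bit first.
            byte = (byte << 1) | (1 if value < threshold else 0)
            nbits += 1
            if nbits == 8:
                out.append(byte)
                byte, nbits = 0, 0
        if nbits:
            # Each row is padded to a whole number of bytes.
            out.append(byte << (8 - nbits))
    return bytes(out)


if __name__ == "__main__":
    # Tiny 9x2 test pattern: alternating black/white, then a white row.
    sample = [[0, 255, 0, 255, 0, 255, 0, 255, 0], [255] * 9]
    with open("sample.pbm", "wb") as fh:
        fh.write(gray_to_pbm(sample))
```

The resulting file can then be passed to ocrad as its input file. Being lossless, PBM avoids the JPEG artefacts mentioned above.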

If you need higher quality, try Tesseract, but it will be much slower.

Tilman

PS: Please correct your date, unless you really sent this mail in 2009.

On 25.07.2009 at 22:15, Chris wrote:
Dear bugtrackers and developers,



I am trying to use the GNU OCR program OCRAD to extract text from scanned
documents. Reviews of the software deem it to be "reasonably good" and
to "produce fairly accurate results". Unfortunately, when I use OCRAD
to parse images, I do not get even barely usable results. The
output of OCRAD looks more like a dumped gpg-encrypted file than a
document. I'm serious: it is not even remotely readable.



I have tried everything I could think of. I printed "quick brown
fox jumps over the lazy dog" in Arial and Times New Roman, in sizes ranging
from 9 to 16, on A4 paper and scanned it in color, grayscale and
black-and-white, at 72, 300, 750 and 1200 dpi. The 12 scanned images
were each saved as pbm and ppm. That makes 24 files, and not one of them was
processed by ocrad into remotely readable results. The best
approximation was "qa\;c_br0mfox ipmpsO wer the |psYdOq", from processing
the 750 dpi grayscale pbm...



Obviously, I'm doing something wrong here, but I don't know what. I am
using kooka to scan the images from an HP Deskjet 4620F. Ocrad is
version 0.17, running on SUSE 11.1.



If you could give me a hint about what I am doing wrong here, please do...





Thanks in advance for your help.


_______________________________________________
Bug-ocrad mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/bug-ocrad



