bug-ocrad

Re: [Bug-ocrad] Adding new language


From: Dmitry Vereschaka
Subject: Re: [Bug-ocrad] Adding new language
Date: Wed, 19 Oct 2005 01:20:30 +0400 (MSD)


On Tue, 18 Oct 2005, Antonio Diaz Diaz wrote:

> > Does an 'Adding new language HOWTO' exist for ocrad?
>
> No, but I am beginning to feel the pressure to write one. :)

;)

> > I wish to add new language support, but don't know how. I have some C/C++
> > experience, but have no idea about OCR tricks :(
>
> Your help is very welcome, and it is about time to add support for other
> languages to ocrad. But I warn you it won't be easy. Ocrad is tuned for
> ASCII and some minor variations like ISO-8859-15 (western) and
> ISO-8859-9 (Turkish).
>
> Adding support for Cyrillic (I suppose this is what you want) will be a
> formidable task (years), and I can only help you with the
> infrastructure, because I have no idea about Cyrillic.

Some Cyrillic letters look just like Latin letters. As lowercase/uppercase
pairs they are: e/E, x/X, a/A, o/O, p/P, c/C, small B/B, y/big y,
small H/H, small T or m/T, small M/M, small K/K, r/big r, small b/b.
So I think it is possible to derive Cyrillic support from the ASCII class.

Two letters look like mirror images of Latin letters: N and R.

One letter looks like the digit 3, another like a mirrored N with a short
line above, yet another like e/E with two dots above, and yet another
like b| (b followed by a vertical line).

But 12 more letters, more or less different from Latin characters, still
need to be defined...
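
To make the look-alike idea concrete, here is a minimal sketch in Python. It is not ocrad code and uses no ocrad API: the table `LATIN_TO_CYRILLIC` and the function `relabel_as_cyrillic` are hypothetical names, and the table covers only a subset of the pairs listed above (those whose shapes match closely in a typical typeface, which is itself an assumption). It only shows how output from an ASCII-tuned recognizer could be relabelled with Cyrillic code points.

```python
# Hypothetical look-alike table (an assumption, not part of ocrad):
# Latin glyph reported by an ASCII-tuned recognizer -> Cyrillic letter
# with nearly the same shape. Escapes are used so the file stays ASCII.
LATIN_TO_CYRILLIC = {
    "A": "\u0410", "a": "\u0430",  # CYRILLIC A
    "B": "\u0412",                 # CYRILLIC VE
    "C": "\u0421", "c": "\u0441",  # CYRILLIC ES
    "E": "\u0415", "e": "\u0435",  # CYRILLIC IE
    "H": "\u041D",                 # CYRILLIC EN
    "K": "\u041A",                 # CYRILLIC KA
    "M": "\u041C",                 # CYRILLIC EM
    "O": "\u041E", "o": "\u043E",  # CYRILLIC O
    "P": "\u0420", "p": "\u0440",  # CYRILLIC ER
    "T": "\u0422",                 # CYRILLIC TE
    "X": "\u0425", "x": "\u0445",  # CYRILLIC HA
    "y": "\u0443",                 # CYRILLIC U (lowercase)
}


def relabel_as_cyrillic(text):
    """Replace each Latin look-alike in `text` with its Cyrillic twin.

    Characters without an entry in the table are passed through unchanged.
    """
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # "MOCKBA" typed in Latin letters becomes the Cyrillic word for Moscow.
    print(relabel_as_cyrillic("MOCKBA"))
```

A table like this only covers the letters that share a shape with a Latin character; the mirrored letters and the remaining distinct ones would still need their own recognition rules.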

Is there a way to teach ocrad new letters by giving it images of these
letters?

Or is there a tool that helps build 'guess for' sequences/rules from a
given image?



