Quote:
Originally Posted by kaza
Hello!
I need to run OCR on many pages of Hebrew forms that contain both printed text and some handwritten entries.
Currently I'm reading about "training" "tesseract", but some docs mention it wasn't designed to deal with handwritten text, so maybe I'm wasting my time. Is there other software for that task?
Nope.
OCR isn't 100% accurate even on a clear, typeset font, and with handwriting, however clear and in whatever language, accuracy drops drastically. There are some commercial OCR programs that offer a ton of features and get you better accuracy, but you need deep pockets. Tesseract is a pretty good program: if the handwriting is consistent and not too messy, and you give it some good training, you'll get pretty good results, but you need to set your expectations accordingly. Do quality control on the scanned pages, and it will at least get you a good way there.
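For the quality-control step, here is a rough sketch of one way to do it, not anything official: have Tesseract report a confidence for each word and flag the shaky ones for a human to check. It assumes you have the pytesseract wrapper, Pillow, and the Hebrew language data ("heb") installed. The function names, the threshold of 60, and the 20% ratio are all made up for illustration, so tune them against your own forms.

```python
def low_confidence_words(words, confs, threshold=60.0):
    """Return (word, confidence) pairs that fall below the threshold.

    Empty strings and negative confidences are skipped, since Tesseract
    uses -1 for layout boxes that are not actual words.
    """
    flagged = []
    for word, conf in zip(words, confs):
        conf = float(conf)  # confidences may arrive as strings or numbers
        if not word.strip() or conf < 0:
            continue
        if conf < threshold:
            flagged.append((word, conf))
    return flagged


def page_needs_review(words, confs, threshold=60.0, max_flagged_ratio=0.2):
    """Guess whether a page should go to manual review (arbitrary cutoffs)."""
    real_words = [
        w for w, c in zip(words, confs) if w.strip() and float(c) >= 0
    ]
    if not real_words:
        return True  # nothing recognized at all: definitely look at it
    flagged = low_confidence_words(words, confs, threshold)
    return len(flagged) / len(real_words) > max_flagged_ratio


def ocr_page(path, lang="heb"):
    """Run Tesseract on one scanned page and return (words, confidences).

    Imports are local so the helpers above work without Tesseract installed.
    """
    import pytesseract
    from PIL import Image
    from pytesseract import Output

    data = pytesseract.image_to_data(
        Image.open(path), lang=lang, output_type=Output.DICT
    )
    return data["text"], data["conf"]
```

You would call something like `words, confs = ocr_page("form_001.png")` (a hypothetical file name) and then `page_needs_review(words, confs)` to decide which pages to check by hand. Expect the handwritten fields to show up in the flagged list most of the time.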