LinuxQuestions.org
01-27-2010, 10:33 AM   #1
yermandu
LQ Newbie
 
Registered: Nov 2009
Distribution: Gentoo amd64
Posts: 4

Rep: Reputation: 0
OCR in Linux, unsatisfactory results


Hello guys,

I have tested several programs for doing OCR with my HP printer. Unfortunately, the software that comes with it is only available for Mac OS and Windows. As I said, I installed several programs without success.
In my search I found that Tesseract is supposed to be the better OCR application for Linux.

However, I found two problems:
  1. It does not have a GUI. Everything can be done with commands, but that is very tedious when you want to scan several pages.
  2. The results were very unsatisfactory, at least in my language, Portuguese: in a text of 1000 words it recognized only two or three, which is very little. I tested with several letters, magazines, books and folders.

So I would like help installing and using an OCR program on Linux.

I have an HP Photosmart C4480.
 
01-27-2010, 01:37 PM   #2
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1301
Well, there are many GUI programs that do OCR, but I'm not sure about their state of development; some of them may be alpha:
http://freshmeat.net/search?q=ocr&submit=Search
http://sourceforge.net/search/?type_...soft&words=ocr

I usually use ocrad, and it actually produces decent results, at least for English. You should probably apply some image filters first, and maybe run something like unpaper on the scan before you pass it to ocrad, or the output will not be as good. The better the input image, the better the OCR output.
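
For what it's worth, the clean-up-then-OCR idea could be sketched roughly like this in Python. It is only a sketch: it assumes the scan is already a PNM file (pbm/pgm/ppm), that `unpaper` and `ocrad` are installed and on the PATH, and `build_pipeline` is a made-up helper name, not part of either tool. It only builds the two command lines; it does not run them.

```python
from pathlib import Path


def build_pipeline(scan):
    """Return two commands: clean the scan with unpaper, then OCR it with ocrad.

    `scan` is assumed to be a PNM image, e.g. page01.pgm (hypothetical name).
    """
    scan = Path(scan)
    # Cleaned copy goes next to the original, e.g. page01-clean.pgm
    cleaned = scan.with_name(scan.stem + "-clean" + scan.suffix)
    # Recognized text goes to e.g. page01.txt
    text = scan.with_suffix(".txt")
    return [
        ["unpaper", str(scan), str(cleaned)],       # clean up the scanned page
        ["ocrad", "-o", str(text), str(cleaned)],   # OCR the cleaned page into the text file
    ]
```

To actually run it, you could loop over the returned lists and pass each one to `subprocess.run(cmd, check=True)`. Any extra filtering (contrast, thresholding) would go before the ocrad step.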

It's true I haven't seen a truly professional OCR program for Linux, but try some out; maybe there is one out there that you find acceptable.

Last edited by H_TeXMeX_H; 01-27-2010 at 01:38 PM.
 
01-27-2010, 01:38 PM   #3
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,634

Rep: Reputation: 7965
You can install GOCR, as it has a GUI, but Tesseract is much more accurate, provided you use it correctly. A quick Google search turns up:

http://www.linux.com/archive/feature/138511

which has examples, instructions, and basic shell scripts to 'automate' OCR of a bunch of pages. Note that if you don't load the right language data (German, English, etc.), accuracy is always going to be bad.
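
As a rough illustration of that kind of batch script (this is not the script from the linked article), here is a small Python sketch that builds one `tesseract` command per scanned page and passes the language with `-l`. Assumptions: the pages are TIFF files in one directory, the Portuguese language data (`por`) is installed, and `build_tesseract_commands` is a hypothetical helper name.

```python
from pathlib import Path


def build_tesseract_commands(image_dir, lang="por", pattern="*.tif"):
    """Return one tesseract command (as an argument list) per scanned page.

    Assumes the language data for `lang` is installed; "por" is Portuguese.
    """
    commands = []
    for image in sorted(Path(image_dir).glob(pattern)):
        # tesseract takes an output base name and appends .txt itself
        output_base = image.with_suffix("")
        commands.append(["tesseract", str(image), str(output_base), "-l", lang])
    return commands
```

Running each returned list with `subprocess.run(cmd, check=True)` would then OCR every page in turn, so nothing has to be typed per page. Leaving out `-l por` makes Tesseract fall back to its default language, which would likely explain very poor results on Portuguese text.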
 
  


All times are GMT -5.
