10-17-2014, 01:39 PM
|
#1
|
Member
Registered: Nov 2012
Location: London
Distribution: Mint 20, Kali, Peppermint, Ubuntu, MakuluFlash, Fedora 32, Windows 12 Lite, MakuluLinux
Posts: 821
|
What software for scanning OCR
Hi,
I have Linux Mint 17 and had my PC stolen with all my valuable writings.
I have saved a lot of my writings on A4 sheets.
I want to scan in my pages and recreate my writings.
I am told that I can get Optical Character Recognition (OCR) software to do this.
What software, if any, is available for scanning in my A4 pages and converting them to text?
|
|
|
10-17-2014, 03:34 PM
|
#2
|
Moderator
Registered: Mar 2008
Posts: 22,130
|
Yes and no. There are about 4 good OCR apps. One is commercial and may be the best. The free ones are pretty crummy. I gave up and just used a business copy machine that seemed to be almost perfect.
Tesseract was promised by Google, but development seemed to stop.
ABBYY seems to get the best results in tests.
There are a number of other apps around.
https://help.ubuntu.com/community/OCR
|
|
|
10-17-2014, 04:58 PM
|
#3
|
Member
Registered: Aug 2012
Posts: 74
|
Try Tesseract. It should be able to read your files.
@jefro: Tesseract is still being developed. The last code change was on October 14, 2014.
Last edited by c0d3d; 10-17-2014 at 05:04 PM.
|
|
|
10-17-2014, 05:57 PM
|
#4
|
Senior Member
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982
|
Tesseract is the best FLOSS one. I recommend preprocessing the images before feeding them in, or using a program that does that for you. Try to get them to black text on a pure white background.
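A minimal sketch of that kind of preprocessing, assuming ImageMagick is installed and provides the `convert` command. Nobody in this thread names a specific tool, so ImageMagick is an assumption, and `clean_scan`, the file names and the 50% default threshold are placeholders to tune per scan.

```shell
#!/bin/sh
# clean_scan SRC DST [LEVEL]
# Convert a scan to grayscale, then force every pixel to pure black or pure
# white, so the OCR engine sees dark text on a clean background.
# Assumes ImageMagick's `convert`; the 50% default threshold is a guess.
clean_scan() {
    src=$1
    dst=$2
    level=${3:-50%}
    # Set DRY_RUN=1 to print the command instead of running it.
    ${DRY_RUN:+echo} convert "$src" -colorspace Gray -threshold "$level" "$dst"
}

# Example: clean_scan page1.png page1-clean.png 60%
```

If the text comes out too thin or too blotchy, raising or lowering the threshold by a few percent and re-running is usually the first thing to try.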
|
|
|
10-17-2014, 06:02 PM
|
#5
|
LQ Muse
Registered: Aug 2005
Location: A2 area Mi.
Posts: 17,639
|
I have used "Tesseract" in the past
for almost all normal fonts it has no problems
handwriting ?????
if it is handwritten "block" text, that will work well (mostly)
|
|
|
10-18-2014, 04:28 PM
|
#6
|
Member
Registered: Nov 2012
Location: London
Distribution: Mint 20, Kali, Peppermint, Ubuntu, MakuluFlash, Fedora 32, Windows 12 Lite, MakuluLinux
Posts: 821
Original Poster
|
How do I install Tesseract
Quote:
Originally Posted by John VV
I have used "Tesseract" in the past
for almost all normal fonts it has no problems
handwriting ?????
if it is handwritten "block" text, that will work well (mostly)
|
It seems that Tesseract is popular. I tried to install it with apt-get,
but the package is not found when I run "sudo apt-get install Tesseract".
Is it not available through apt-get?
|
|
|
10-18-2014, 05:29 PM
|
#7
|
Member
Registered: Aug 2012
Posts: 74
|
@bscho: The command should be "sudo apt-get install tesseract-ocr"
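Two details that may explain the failed attempt: apt package names are lowercase, and on Debian-based systems the package is called `tesseract-ocr` while the program it installs is run as `tesseract`. A small sketch for checking whether the command is already on the PATH; `have_cmd` and `check_tesseract` are hypothetical helper names, and the install line is only printed as a hint, not run.

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME is an executable command on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether tesseract is installed; otherwise print the install
# command suggested in this thread (note the lowercase package names).
check_tesseract() {
    if have_cmd tesseract; then
        echo "tesseract found at: $(command -v tesseract)"
    else
        echo "tesseract not found; try: sudo apt-get install tesseract-ocr tesseract-ocr-eng"
    fi
}

check_tesseract
```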
|
|
|
10-18-2014, 05:31 PM
|
#8
|
LQ Muse
Registered: Aug 2005
Location: A2 area Mi.
Posts: 17,639
|
I do not know about Ubuntu / Mint,
but on SUSE:
Code:
su -
zypper in tesseract
Or just do a search:
" zypper se tesseract"
lists a TON of packages, including different languages.
For Ubuntu / Mint, from
https://help.ubuntu.com/community/AptGet/Howto
Code:
apt-get update
apt-cache search tesseract
|
|
|
10-18-2014, 06:01 PM
|
#9
|
LQ Veteran
Registered: Jan 2011
Location: Abingdon, VA
Distribution: Catalina
Posts: 9,374
|
Code:
sudo apt-get install tesseract-ocr
67 languages too
Code:
tesseract-ocr-afr - tesseract-ocr language files for Afrikaans
tesseract-ocr-ara - tesseract-ocr language files for Arabic
tesseract-ocr-aze - tesseract-ocr language files for Azerbaijani
tesseract-ocr-bel - tesseract-ocr language files for Belarusian
tesseract-ocr-ben - tesseract-ocr language files for Bengali
tesseract-ocr-bul - tesseract-ocr language files for Bulgarian
tesseract-ocr-cat - tesseract-ocr language files for Catalan
tesseract-ocr-ces - tesseract-ocr language files for Czech
tesseract-ocr-chi-sim - tesseract-ocr language files for Simplified Chinese
tesseract-ocr-chi-tra - tesseract-ocr language files for Traditional Chinese
tesseract-ocr-chr - tesseract-ocr language files for Cherokee
tesseract-ocr-dan - tesseract-ocr language files for Danish
tesseract-ocr-deu - tesseract-ocr language files for German
tesseract-ocr-deu-frak - tesseract-ocr language files for German Fraktur
tesseract-ocr-dev - transitional dummy package
tesseract-ocr-ell - tesseract-ocr language files for Greek
tesseract-ocr-eng - tesseract-ocr language files for English
tesseract-ocr-enm - tesseract-ocr language files for Middle English
tesseract-ocr-epo - tesseract-ocr language files for Esperanto
tesseract-ocr-equ - tesseract-ocr language files for equations
tesseract-ocr-est - tesseract-ocr language files for Estonian
tesseract-ocr-eus - tesseract-ocr language files for Basque
tesseract-ocr-fin - tesseract-ocr language files for Finnish
tesseract-ocr-fra - tesseract-ocr language files for French
tesseract-ocr-frk - tesseract-ocr language files for Frankish
tesseract-ocr-frm - tesseract-ocr language files for Middle French
tesseract-ocr-glg - tesseract-ocr language files for Galician
tesseract-ocr-grc - tesseract-ocr language files for ancient Greek
tesseract-ocr-heb - tesseract-ocr language files for Hebrew
tesseract-ocr-hin - tesseract-ocr language files for Hindi
tesseract-ocr-hrv - tesseract-ocr language files for Croatian
tesseract-ocr-hun - tesseract-ocr language files for Hungarian
tesseract-ocr-ind - tesseract-ocr language files for Indonesian
tesseract-ocr-isl - tesseract-ocr language files for Icelandic
tesseract-ocr-ita - tesseract-ocr language files for Italian
tesseract-ocr-ita-old - tesseract-ocr language files for Old Italian
tesseract-ocr-jpn - tesseract-ocr language files for Japanese
tesseract-ocr-kan - tesseract-ocr language files for Kannada
tesseract-ocr-kor - tesseract-ocr language files for Korean
tesseract-ocr-lav - tesseract-ocr language files for Latvian
tesseract-ocr-lit - tesseract-ocr language files for Lithuanian
tesseract-ocr-mal - tesseract-ocr language files for Malayalam
tesseract-ocr-mkd - tesseract-ocr language files for Macedonian
tesseract-ocr-mlt - tesseract-ocr language files for Maltese
tesseract-ocr-msa - tesseract-ocr language files for Malay
tesseract-ocr-nld - tesseract-ocr language files for Dutch
tesseract-ocr-nor - tesseract-ocr language files for Norwegian
tesseract-ocr-osd - tesseract-ocr language files for script and orientation
tesseract-ocr-pol - tesseract-ocr language files for Polish
tesseract-ocr-por - tesseract-ocr language files for Portuguese
tesseract-ocr-ron - tesseract-ocr language files for Romanain
tesseract-ocr-rus - tesseract-ocr language files for Russian
tesseract-ocr-slk - tesseract-ocr language files for Slovak
tesseract-ocr-slk-frak - tesseract-ocr language files for Slovak Fractur
tesseract-ocr-slv - tesseract-ocr language files for Slovenian
tesseract-ocr-spa - tesseract-ocr language files for Spanish
tesseract-ocr-spa-old - tesseract-ocr language files for Old Spanish
tesseract-ocr-sqi - tesseract-ocr language files for Albanian
tesseract-ocr-srp - tesseract-ocr language files for Serbian
tesseract-ocr-swa - tesseract-ocr language files for Swahili
tesseract-ocr-swe - tesseract-ocr language files for Swedish
tesseract-ocr-tam - tesseract-ocr language files for Tamil
tesseract-ocr-tel - tesseract-ocr language files for Telugu
tesseract-ocr-tgl - tesseract-ocr language files for Tagalog
tesseract-ocr-tha - tesseract-ocr language files for Thai
tesseract-ocr-tur - tesseract-ocr language files for Turkish
tesseract-ocr-ukr - tesseract-ocr language files for Ukranian
tesseract-ocr-vie - tesseract-ocr language files for Vietnamese
|
|
|
10-18-2014, 07:43 PM
|
#10
|
Member
Registered: Aug 2012
Posts: 74
|
You probably only want the English language files for Tesseract (tesseract-ocr-eng).
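To pick one language pack out of a long `apt-cache search` listing, something like the following may help. It is a rough sketch that assumes the `package - description` output format shown in the previous post; `find_lang_pack` is a hypothetical helper name, not part of apt.

```shell
#!/bin/sh
# find_lang_pack LANGUAGE
# Read "package - description" lines on stdin and print the package name
# whose description ends with "language files for LANGUAGE".
# The match is case-insensitive.
find_lang_pack() {
    grep -i -- "language files for $1\$" | cut -d' ' -f1
}

# Example (requires apt):
# apt-cache search tesseract-ocr | find_lang_pack English
```

Anchoring the match at the end of the line keeps "English" from also matching "Middle English".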
|
|
|
10-19-2014, 07:07 AM
|
#11
|
Member
Registered: Nov 2012
Location: London
Distribution: Mint 20, Kali, Peppermint, Ubuntu, MakuluFlash, Fedora 32, Windows 12 Lite, MakuluLinux
Posts: 821
Original Poster
|
How do I install Tesseract
Quote:
Originally Posted by c0d3d
@bscho: The command should be "sudo apt-get install tesseract-ocr"
|
I have tried that and the terminal says:
tesseract-ocr is already the newest version
tesseract-ocr set to manually installed.
It recognizes the program but doesn't download.
Any suggestions?
|
|
|
10-19-2014, 01:02 PM
|
#12
|
Member
Registered: Aug 2012
Posts: 74
|
It means that it is already downloaded. Are you sure you downloaded the English language files ("apt-get install tesseract-ocr-eng")?
Keep in mind that this software has no GUI. It is terminal-based only (you might be able to get a custom GUI for it from here if you really want one).
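As a rough illustration of what terminal-based use looks like: the basic form is `tesseract IMAGE OUTPUTBASE -l LANG`, and Tesseract appends `.txt` to the output base name itself. The wrapper below is only a sketch; `ocr_page` and the file names are hypothetical, and which image formats are accepted depends on how your Tesseract was built.

```shell
#!/bin/sh
# ocr_page IMAGE [LANG]
# Run Tesseract on one scanned page. The output base name is the image
# name without its extension, so page1.png produces page1.txt.
# LANG defaults to eng (English).
ocr_page() {
    img=$1
    lang=${2:-eng}
    base=${img%.*}          # strip the file extension
    # Set DRY_RUN=1 to print the command instead of running it.
    ${DRY_RUN:+echo} tesseract "$img" "$base" -l "$lang"
}

# Example: OCR every PNG scan in the current directory.
# for f in *.png; do ocr_page "$f"; done
```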
|
|
|
10-19-2014, 02:33 PM
|
#13
|
Member
Registered: Nov 2012
Location: London
Distribution: Mint 20, Kali, Peppermint, Ubuntu, MakuluFlash, Fedora 32, Windows 12 Lite, MakuluLinux
Posts: 821
Original Poster
|
GUI
Quote:
Originally Posted by c0d3d
It means that it is already downloaded. Are you sure you downloaded the English language files ("apt-get install tesseract-ocr-eng")?
Keep in mind that this software has no GUI. It is terminal-based only (you might be able to get a custom GUI for it from here if you really want one).
|
No thanks. I did not know it had no GUI. You are right, it is loaded and lists 10 command options.
I found that its output is in English.
Can you tell me if there is a help file for it somewhere?
|
|
|
10-19-2014, 02:46 PM
|
#14
|
Member
Registered: Aug 2012
Posts: 74
|
"man tesseract-ocr" should give a help file.
|
|
|
10-19-2014, 03:41 PM
|
#15
|
Member
Registered: Nov 2012
Location: London
Distribution: Mint 20, Kali, Peppermint, Ubuntu, MakuluFlash, Fedora 32, Windows 12 Lite, MakuluLinux
Posts: 821
Original Poster
|
ocr software
Quote:
Originally Posted by c0d3d
"man tesseract-ocr" should give a help file.
|
Thanks for your prompt reply. I have tried that and I get
No manual entry for tesseract-ocr.
Any other way of getting the manual?
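A possible explanation, which nobody in the thread confirms: man pages are normally named after the command rather than the package, and the `tesseract-ocr` package installs a command called `tesseract`. So `man tesseract` may work where `man tesseract-ocr` does not, and running `tesseract` with no arguments prints the short usage summary. The sketch below lists whatever man pages a package installed, assuming a Debian-based system with `dpkg`; `list_man_pages` is a hypothetical helper name.

```shell
#!/bin/sh
# list_man_pages
# Read a list of installed file paths on stdin (for example the output of
# `dpkg -L PACKAGE`) and print the names of the man pages among them,
# with the directory and the section/.gz suffix stripped.
list_man_pages() {
    grep '/man/man[0-9]/' |
        sed -e 's|.*/||' -e 's|\.[0-9][^.]*\(\.gz\)\{0,1\}$||'
}

# Example (Debian/Ubuntu/Mint):
# dpkg -L tesseract-ocr | list_man_pages
# Then read a page by name, for instance: man tesseract
```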
|
|
|
All times are GMT -5.