LinuxQuestions.org


biosboy4 06-14-2016 04:35 PM

OCR software
 
Hello,

I have found myself in a situation where I'm going to need to pick up a print job from "the wire" and use OCR software to pull data (numbers, to be exact). This is essentially so we can get digital data on the products that the machines built.

Where might I start to learn how to hijack a print job?

Thanks,

michaelk 06-14-2016 06:05 PM

The typical case for OCR software is that you scan a paper document and the scanner/software creates images in some format like TIFF. The OCR software then converts the TIFF image file to some editable format like text.

https://help.ubuntu.com/community/OCR

You need to explain what you mean by "the wire" and by the following:
Quote:

This is essentially so we can get digital data on the products that the machines built.

biosboy4 06-14-2016 06:14 PM

That machine has a "print this page" button that we can use to send a print job of the screen showing the data. I'm looking into picking the print job up from the network and using OCR software to generate the data from the image.

michaelk 06-14-2016 06:48 PM

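You could use tcpdump to capture and extract the "print job". You will need an Ethernet hub rather than a switch, because a switch only forwards the machine-to-printer traffic to the printer's port, so the capturing PC would never see it.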
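As a rough illustration of the capture step, the Python sketch below only assembles a tcpdump command line. The interface name, the printer address, and the use of raw printing on TCP port 9100 are all assumptions for the example, not details from this thread; adjust them to whatever the machine actually sends.

```python
def build_capture_command(interface, printer_ip, port=9100,
                          out_file="printjob.pcap"):
    """Return a tcpdump argument list that saves traffic sent to the printer.

    Assumptions (hypothetical): the job travels as raw TCP to `port`
    (9100 is common for raw/JetDirect printing) and `printer_ip` is the
    printer's address.
    """
    return [
        "tcpdump",
        "-i", interface,   # interface to listen on
        "-s", "0",         # capture whole packets, not just headers
        "-w", out_file,    # write raw packets to a pcap file
        "tcp", "and", "dst", "host", printer_ip,
        "and", "dst", "port", str(port),
    ]


if __name__ == "__main__":
    # Example values only; 192.0.2.10 is a documentation address.
    print(" ".join(build_capture_command("eth0", "192.0.2.10")))
```

The resulting pcap file holds packets, not the job itself, so the TCP stream still has to be reassembled (for example with Wireshark's "Follow TCP Stream") before the print data can be converted.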

The print job would be in the language used by the printer which could be PostScript, PCL, GDI etc. depending on the make/model of printer. You would then need to convert that into an image file like TIFF and finally run that through the OCR software.
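As a rough sketch of the chain described above (print job, then image, then OCR text), the Python below guesses the job language from its first bytes and builds the command line for each step. Everything specific here is an assumption rather than something from this thread: Ghostscript (`gs`) for PostScript, GhostPCL (`gpcl6`) for PCL, Tesseract as the OCR engine, 300 dpi, and the file names. GDI (host-based) jobs are not handled.

```python
import re

ESC = b"\x1b"
UEL = ESC + b"%-12345X"  # PJL "universal exit language" prefix


def detect_job_language(data: bytes) -> str:
    """Guess the printer language from the start of a captured job."""
    if data.startswith(UEL):
        # PJL wrapper: look for an explicit ENTER LANGUAGE line.
        m = re.search(rb"@PJL\s+ENTER\s+LANGUAGE\s*=\s*(\w+)",
                      data[:4096], re.IGNORECASE)
        return m.group(1).decode("ascii").lower() if m else "unknown"
    if data.startswith(b"%!"):
        return "postscript"
    if data.startswith(ESC + b"E"):  # PCL printer reset
        return "pcl"
    return "unknown"


def render_command(job_path, language, out_pattern="page-%03d.tif", dpi=300):
    """Build the command that turns the job into grayscale TIFF pages."""
    interpreters = {"postscript": "gs", "pcl": "gpcl6"}  # assumed tools
    if language not in interpreters:
        raise ValueError(f"no converter configured for {language!r}")
    return [interpreters[language], "-dBATCH", "-dNOPAUSE",
            "-sDEVICE=tiffgray", f"-r{dpi}",
            f"-sOutputFile={out_pattern}", job_path]


def ocr_command(image_path):
    """Build a Tesseract command that prints recognized text to stdout."""
    return ["tesseract", image_path, "stdout"]


def extract_numbers(text):
    """Pull integer and decimal numbers out of the OCR text."""
    return re.findall(r"-?\d+(?:\.\d+)?", text)
```

Each command list can be passed to `subprocess.run(cmd, check=True, capture_output=True)`, and the text Tesseract prints can then be fed to `extract_numbers`.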

jefro 06-14-2016 08:35 PM

Print to paper, then find an office/business system such as a Xerox machine and have it scan to PDF. The modern Xerox systems will even produce searchable PDFs.

I played with a number of free OCR programs on Linux and they all failed.

If anything, I'd buy a Windows program, but I think it is still behind what Xerox offers.

biosboy4 06-15-2016 06:28 AM

This needs to happen automatically. Manually scanning the documents to PDF is simply not a solution to our problem.

jefro 06-15-2016 05:58 PM

If you find a high-quality OCR program, then let me know.

As I re-read your question, maybe OCR isn't the correct term.

Can you explain more about your issue? I get the feeling that all you want to do is capture characters and input them into some program. That isn't exactly OCR.


All times are GMT -5.