LinuxQuestions.org
Old 03-01-2009, 11:08 AM   #1
J_Szucs
Senior Member
 
Registered: Nov 2001
Location: Budapest, Hungary
Distribution: SuSE 6.4-11.3, Dsl linux, FreeBSD 4.3-6.2, Mandrake 8.2, Redhat, UHU, Debian Etch
Posts: 1,126

Rep: Reputation: 58
Compiling tesseract-2.03: error: ‘INT32’ was not declared in this scope


I tried to compile tesseract on SuSE 10.1 (./configure && make && make install), but I got fatal errors:

Code:
make[3]: Entering directory `/usr/src/tesseract-2.03/dict'
g++ -DHAVE_CONFIG_H -I. -I. -I..  -I../cutil -I../ccutil -I/usr/local/include/liblept  -g -O2 -c dawg.cpp
dawg.cpp: In function ‘EDGE_RECORD* read_squished_dawg(const char*)’:
dawg.cpp:313: error: ‘INT32’ was not declared in this scope
dawg.cpp:324: error: return-statement with no value, in function returning ‘EDGE_RECORD*’
make[3]: *** [dawg.o] Error 1
make[3]: Leaving directory `/usr/src/tesseract-2.03/dict'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/usr/src/tesseract-2.03/dict'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/src/tesseract-2.03'
make: *** [all-recursive-am] Error 2
"-I." is repeated on the g++ line, so I suspected that some library path might be missing. I tried this, but with no success:
Code:
export LD_LIBRARY_PATH=/lib:/usr/lib:/usr/include:/usr/local/include:/usr/local/include:/usr/local/include/liblept
Why does tesseract 2.03 not compile on SuSE? (Tesseract 2.01 did, but I want to upgrade.)
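
A note on the LD_LIBRARY_PATH attempt above: that variable is read by the dynamic linker when a program runs, not by g++ when it looks for headers, so it would not be expected to change a compile-time error such as "was not declared in this scope". Header search directories are normally passed as -I flags, for example through CPPFLAGS at configure time. The following is only a sketch: make_cppflags is a hypothetical helper name, and the directory in the usage comment is simply the one that appears on the g++ line above.

```shell
# Hypothetical helper: turn a list of candidate include directories into
# -I flags, skipping directories that do not exist on this machine.
make_cppflags() {
    flags=""
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            flags="$flags -I$dir"
        fi
    done
    # Strip the leading space left by the first append.
    printf '%s\n' "${flags# }"
}

# Usage (not executed here):
#   ./configure CPPFLAGS="$(make_cppflags /usr/local/include/liblept)"
#   make
```

Skipping nonexistent directories keeps the resulting g++ command line short, which makes it easier to see which paths are really in play.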

Edit:
I also tried inserting this into dawg.h, with no success:
Code:
#include <stdlib.h>

Last edited by J_Szucs; 03-01-2009 at 11:39 AM.
 
Old 03-01-2009, 01:36 PM   #2
knudfl
LQ 5k Club
 
Registered: Jan 2008
Location: Copenhagen DK
Distribution: PCLinuxOS2023 Fedora38 + 50+ other Linux OS, for test only.
Posts: 17,511

Rep: Reputation: 3641
'tesseract-2.03' compiles with no errors on PCLinuxOS2007,
g++-4.1, and on Suse 10.3, g++-4.2 .

I used this source, mainly to see the patching :
http://packages.debian.org/lenny/tesseract-ocr
> > Links for tesseract-ocr >
[tesseract_2.03.orig.tar.gz], [tesseract_2.03-2.diff.gz],
but I didn't use the patch. ( No patching for dawg.cpp, it seems.)

Why not try again, from scratch, in /home/<username>/,
with a clean 'tesseract-2.03' ?
.....
( Libs used : zlib-devel, libjpeg-devel, libpng-devel )
.....

Last edited by knudfl; 03-01-2009 at 01:38 PM.
 
Old 03-01-2009, 01:46 PM   #3
J_Szucs
Original Poster
This is about my 10th compile of tesseract-2.03 from scratch.
The reason I had 2.01 before is that I could not compile 2.03 some months ago.

Patches should not be the culprit for two reasons:
- it did not compile without any patches, either
- I did not apply any patches to dawg.cpp

Now I tried to install everything into /usr (--prefix=/usr), and add CPPFLAGS, but with no success:
Code:
make[3]: Entering directory `/usr/src/tesseract-2.03/dict'
g++ -DHAVE_CONFIG_H -I. -I. -I..  -I../cutil -I../ccutil -I/lib -I/usr/lib -I/usr/local/lib -I/usr/include -I/usr/local/include -I/usr/include/liblept  -g -O2 -c dawg.cpp
dawg.cpp: In function ‘EDGE_RECORD* read_squished_dawg(const char*)’:
dawg.cpp:313: error: ‘INT32’ was not declared in this scope
dawg.cpp:324: error: return-statement with no value, in function returning ‘EDGE_RECORD*’
make[3]: *** [dawg.o] Error 1
The missing INT32 should be a really trivial error.
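
One way to narrow down an error like this is to check whether the source tree declares the type at all. The sketch below searches header files for a typedef or #define of a given name. find_type_decl is a hypothetical helper, and it assumes a grep that supports -w (GNU and BSD grep do). If nothing is printed for INT32, one possible explanation is that the code using the name does not match the headers in the tree, for example because it came from a patch written for another release.

```shell
# Hypothetical helper: list header lines that declare NAME via typedef or
# #define. Usage: find_type_decl NAME DIR
find_type_decl() {
    name=$1
    dir=$2
    # /dev/null forces grep to print file names even when only one file
    # is passed to it.
    find "$dir" -name '*.h' -exec grep -nw -e "$name" /dev/null {} + |
        grep -E 'typedef|#[[:space:]]*define'
}

# Usage (not executed here), from the top of the unpacked source:
#   find_type_decl INT32 .
```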

Last edited by J_Szucs; 03-01-2009 at 02:09 PM.
 
Old 03-01-2009, 02:14 PM   #4
knudfl
Suse 10.3 : No problem including 'liblept' , but then again,
it isn't used in 'tesseract-2.03/dict' ?

( Used in 'tesseract-2.03/pageseg', 'tesseract-2.03/ccmain' )

Could it just be a particular behavior
from g++-4.1.0, that causes the error ?
.....

( I did include 'libtiff-devel' too )
.....

Last edited by knudfl; 03-01-2009 at 02:25 PM.
 
Old 03-01-2009, 03:07 PM   #5
J_Szucs
Original Poster
You were right: every single patch I applied (including the one recommended on the Release Notes page of Tesseract, which replaces unicharset_extractor.cpp) was the cause of a compile error.

So now, after removing the patches, it compiles successfully. Without the patches, however, it still has all the bugs, e.g. the segfault when generating dawg files (wordlist2dawg hits a segmentation fault when processing a wordlist while training a new language).

So I am back where I started this morning:
Code:
Building DAWG from word list in file, 'magyar.gyakori.txt'
Compacting the DAWG
Szegmens hiba    (Hungarian locale; in English: "Segmentation fault")
No way to get closer to Hungarian language support.
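
Because the crash message above is printed in the system locale, a script cannot reliably match on its wording. A locale-independent check is the exit status: by shell convention, a status above 128 means the command was terminated by a signal (status minus 128), and SIGSEGV is signal 11 on Linux, which gives 139. A minimal sketch follows. report_status is a hypothetical helper, and the wordlist2dawg invocation in the usage comment is an assumption (the output file name is made up for illustration).

```shell
# Hypothetical helper: run a command and describe how it ended.
report_status() {
    status=0
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N means "terminated by signal N".
        echo "killed by signal $((status - 128))"
    elif [ "$status" -ne 0 ]; then
        echo "exited with status $status"
    else
        echo "ok"
    fi
    return 0
}

# Usage (not executed here; argument order is an assumption):
#   report_status wordlist2dawg magyar.gyakori.txt freq-dawg
```

A program can also return a value above 128 on its own, so the "killed by signal" reading is a convention rather than a guarantee.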

Last edited by J_Szucs; 03-01-2009 at 03:21 PM.
 
Old 03-02-2009, 07:42 AM   #6
knudfl
LQ 5k Club
 
Registered: Jan 2008
Location: Copenhagen DK
Distribution: PCLinuxOS2023 Fedora38 + 50+ other Linux OS, for test only.
Posts: 17,511

Rep: Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641Reputation: 3641
Well, have a look at the "language patch" [tesseract_2.03-2.diff.gz],
which I mentioned in post #2.

No harm done trying it.

Gunzip the patch and place it outside 'tesseract-2.03/'
and do 'patch -p0 < tesseract_2.03-2.diff'
.....
 
Old 03-02-2009, 01:21 PM   #7
J_Szucs
Original Poster
Looks like I am too lame to do this.

The patch resided in /usr/src, and the source in tesseract-2.03 below it. I ran the command in /usr/src:
patch -p0 < tesseract_2.03-2.diff

and here is the only result:
Code:
patching file tesseract-2.03/debian/patches/ocropus
patching file tesseract-2.03/debian/patches/series
patching file tesseract-2.03/debian/patches/java
patching file tesseract-2.03/debian/patches/TESSDATA_PREFIX
patching file tesseract-2.03/debian/patches/gcc-4.3
patching file tesseract-2.03/debian/manpages
patching file tesseract-2.03/debian/copyright
patching file tesseract-2.03/debian/changelog
patching file tesseract-2.03/debian/rules
patching file tesseract-2.03/debian/compat
patching file tesseract-2.03/debian/tesseract-ocr.install
patching file tesseract-2.03/debian/tesseract.1
patching file tesseract-2.03/debian/control
patching file tesseract-2.03/debian/wordlist2dawg.1
patching file tesseract-2.03/debian/cntraining.1
patching file tesseract-2.03/debian/mftraining.1
patching file tesseract-2.03/debian/tesseract-ocr-dev.install
patching file tesseract-2.03/debian/unicharset_extractor.1
patching file tesseract-2.03/debian/docs

So the patch only creates a debian sub-directory under the tesseract-2.03 directory and puts some files there; no other files in the source tree are patched. I checked that with diff. And the segmentation fault is still there, of course.
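
Before applying a diff, it can help to list the files it touches. That shows up front whether a patch only adds packaging files under debian/ or also modifies sources. A minimal sketch, assuming a unified diff on standard input; files_in_diff is a hypothetical helper.

```shell
# Hypothetical helper: print the target path from every "+++ path" header
# line of a unified diff read from standard input. An added line whose
# content itself starts with "++ " would be a false positive.
files_in_diff() {
    sed -n 's/^+++ \([^[:space:]]*\).*/\1/p'
}

# Usage (not executed here):
#   gunzip -c tesseract_2.03-2.diff.gz | files_in_diff
```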

I also tried to apply the patch to the sources in tesseract_2.03.orig.tar.gz downloaded from packages.debian.org, but with no success.

So, how do I apply the Debian patches?

Last edited by J_Szucs; 03-02-2009 at 01:26 PM.
 
Old 03-03-2009, 12:14 AM   #8
knudfl
To use the patches in tesseract-2.03/debian/patches :

I just copied the files outside the top directory, tesseract-2.03/
and patched with 'patch -p0 < <patch>'
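
The file list earlier in the thread shows a debian/patches/series file next to the individual patches. In quilt-style patch directories, such a file usually lists the patches in the order they are meant to be applied, one per line, with # starting a comment. Assuming this series file follows that convention, the sketch below prints the patch paths in that order. list_series is a hypothetical helper, and whether -p0 or -p1 is the right strip level depends on the paths inside each patch, so a --dry-run first seems prudent.

```shell
# Hypothetical helper: print the patches named in DIR/series, in order,
# skipping blank lines and '#' comments. Usage: list_series DIR
list_series() {
    dir=$1
    # The "|| [ -n ... ]" part keeps a last line without trailing newline.
    while read -r name _rest || [ -n "$name" ]; do
        case $name in
            ''|'#'*) continue ;;
        esac
        printf '%s/%s\n' "$dir" "$name"
    done < "$dir/series"
}

# Usage (not executed here), from the directory containing tesseract-2.03/:
#   for p in $(list_series tesseract-2.03/debian/patches); do
#       patch -p0 --dry-run < "$p" || break
#   done
```

Dropping --dry-run once every patch applies cleanly would then perform the actual patching.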
 
Old 03-03-2009, 01:28 AM   #9
J_Szucs
Original Poster
I did the same, and I just cannot imagine why the .cpp files mentioned in the .diff file remain unpatched.
When you run the patch, is a debian directory created under tesseract-2.03?

If so, then I am applying the patch in the right place.

Anyway, I think the patch does not concern dawg file creation, so it would be of no use for me.

Some Polish guys reported the same problem as me, and the developers answered that the bug will be addressed in the 3.x versions of tesseract. Since there is still no Polish support, I suppose they are waiting for that version, too.

I am really curious how the most recent language support files (fra, deu, etc.) were created, given that this function of tesseract is broken. Did they do it with the Windows version? That would be a shame. Or with an older version of tesseract? Either way, it is a shame to leave a bug that prevents the creation of new language support files unpatched until the 3.x versions.

If that function is really broken, I think users deserve a warning, so that they do not waste days creating training files that cannot be used in the end.

But what can we expect from developers who issue a release that simply does not work? (The 2.02 version was such a release.) OCR support in Linux is advancing really slowly, and compared to other areas I do not see any big change with the emergence of tesseract, either.

Last edited by J_Szucs; 03-03-2009 at 09:56 AM.
 
Old 03-03-2009, 07:39 AM   #10
knudfl
1) Post #8, line 1, see that I have tesseract-2.03/debian/

2) The "language patch" I saw when reading tesseract_2.03-2.diff
was for a 'man' page.

3) There is no patch for the 'dawg' files.
 
Old 03-05-2009, 02:35 PM   #11
J_Szucs
Original Poster
The segmentation fault is fixed in today's svn sources (pre-2.04).

I could finally create the hun language files. The result is terrible, though, so I think I should tweak it before use.

Edit:
I succeeded in creating Hungarian language support for tesseract, and the result seems to be excellent. Thanks for your help!

Last edited by J_Szucs; 03-09-2009 at 04:16 AM.
 
  


All times are GMT -5. The time now is 06:47 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration