LinuxQuestions.org
06-22-2009, 11:23 AM   #1
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

Rep: Reputation: 2749
Has anybody managed to get strigi to index pdf files?


I have been playing around with strigi, the file indexing program, but have not been able to get it to index .pdf files. From my research this should be possible, as strigi uses pdftotext to convert .pdf files to text for indexing. However, indexing fails even for .pdf files that I created from simple text files.
I have tried this in slackware-current and slackware64, using both the supplied strigi-0.6.4 package and the latest strigi-0.6.5. (In slackware64 I added a symlink, 'ln -s /usr/lib64/gcj-4.3.3-9/libjvm.so /usr/lib64/libjvm.so', to enable java, and then edited ~/.kde/share/config/nepomukserverrc to change the soprano backend to sesame2, the same as in slackware-current.)
If I run 'xmlindexer <some_pdf_file>' I do not see the expected text output, and no .pdf files are shown when I use 'strigiclient' to search for a known text string.
Yet strigi is happily indexing OpenOffice .odt files and MS Word .doc files.
I am curious whether anybody has this working and, if so, whether any tweaks were required.
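One way to narrow this down is to test pdftotext on its own, outside strigi. The following is only a sketch: it assumes strigi really does call the external pdftotext tool (as the research above suggests), and the function name check_pdftotext is made up for illustration.

```shell
# Sketch: confirm that pdftotext is installed and can extract text from a PDF.
# Assumption: strigi relies on the external pdftotext tool, as described above.
check_pdftotext() {
    pdf=$1
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext: not found in PATH"
        return 1
    fi
    # "-" sends the extracted text to standard output instead of a file.
    pdftotext "$pdf" - | head -n 5
}
```

Calling 'check_pdftotext /path/to/some.pdf' (a placeholder path) should print the first few lines of text. If it does, the conversion step works and the problem is more likely in how strigi was built or configured.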
 
06-24-2009, 01:59 PM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,289

Rep: Reputation: 2322
Strigi will be using some library to read pdfs. If that is missing on the system, the compile won't fail, but pdf functionality will be disabled.
Recompile from source, read the docs, and you will find which libs to install. Try this:

which strigi will give you the exact path. Then
ldd /path/to/strigi shows you the libraries it uses.
ldd /path/to/strigi | grep 'not found' shows you the missing ones.
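The same check can be wrapped up as a rough sketch. The function names filter_missing and missing_libs are invented for illustration, and the command to inspect is passed as an argument because the exact binary name may differ between installs.

```shell
# Keep only the lines of ldd output that report an unresolved library.
filter_missing() {
    grep 'not found' || true   # grep exits non-zero when nothing matches
}

# Print the unresolved shared libraries for a command found in PATH.
missing_libs() {
    bin=$(command -v "$1") || { echo "$1: not found in PATH"; return 1; }
    ldd "$bin" | filter_missing
}
```

For example, 'missing_libs strigiclient' or 'missing_libs xmlindexer' (the two commands named earlier in this thread). Empty output means every library the binary was linked against can be resolved. It says nothing about optional libraries that were left out at compile time.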
 
06-25-2009, 09:24 AM   #3
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

Original Poster
Rep: Reputation: 2749
Thanks for the reply. I have checked with ldd and have not found any missing libraries for strigiclient.
This is not a showstopper for me, but it could be a very useful tool. Perhaps when Sebastian Trüg has finished fixing k3b for KDE4, then this will advance a little further.
 
06-25-2009, 10:21 AM   #4
piete
Member
 
Registered: Apr 2005
Location: Havant, Hampshire, UK
Distribution: Slamd64, Slackware, PS2Linux
Posts: 465

Rep: Reputation: 44
ldd will only show what the binary was linked against, not what it *could* have been linked against. If there's a compile option to include some other libraries, that will show up in the install or readme files of the source (I'd hope). I don't have everything to hand to check it myself right now, or I would. Maybe later on.

- Piete.
 