Has anybody managed to get strigi to index pdf files?
I have been playing around with strigi, the file indexing program, but have not been able to index .pdf files. From my research it seems that this should be possible, as strigi uses pdftotext to convert .pdf files to text for indexing. However, even for .pdf files that I have created from simple text files, indexing has not been successful.
This has been tried in slackware-current and slackware64 using both the supplied package of strigi-0.6.4 and the latest strigi-0.6.5. (In slackware64 I have added a symlink 'ln -s /usr/lib64/gcj-4.3.3-9/libjvm.so /usr/lib64/libjvm.so' to enable java and then edited ~/.kde/share/config/nepomukserverrc to change the soprano backend to sesame2 to be the same as in slackware-current).
If I try 'xmlindexer <some_pdf_file>' I do not see the expected text output and no .pdf files are shown when 'strigiclient' is used to search for a known text string.
Yet strigi is happily indexing openoffice .odt files and MS Word .doc files.
I am curious as to whether anybody has this working, and if so, were any tweaks required?
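For anyone wanting to narrow this down, here is a rough sketch that checks the pdftotext step on its own, outside strigi. It assumes pdftotext (from xpdf or poppler) is the converter in question, as described above. The function name pdf_has_text is made up for illustration. It says nothing about whether strigi actually calls pdftotext, only whether the tool can pull text out of a given file.

```shell
# Hypothetical helper: does pdftotext extract any text from the given PDF?
# Returns 0 if text came out, 1 if not, 2 if pdftotext is not installed.
pdf_has_text() {
    command -v pdftotext >/dev/null 2>&1 || {
        echo "pdftotext not on PATH" >&2
        return 2
    }
    # "-" as the output file name sends the extracted text to stdout.
    text=$(pdftotext "$1" - 2>/dev/null) || return 1
    # Succeed only if something other than whitespace was extracted.
    printf '%s' "$text" | grep -q '[^[:space:]]'
}
```

Usage would be something like 'pdf_has_text some_file.pdf && echo ok'. If this succeeds but xmlindexer still shows no text, the problem is more likely in how strigi was built than in the PDF or in pdftotext.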
That will be using some library to read pdfs. If that's missing on the system, compile won't fail, but pdf functionality will be disabled.
Recompile from source and read the docs; they should tell you which libraries to install. Try this:
which strigi gives you the exact path. Then
ldd /path/to/strigi shows you the libraries it uses, and
ldd /path/to/strigi | grep found shows you the missing ones.
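The same check can be wrapped up in a small function, sketched here. The name missing_libs is made up for illustration, and the command name you pass in (strigiclient, xmlindexer, or whatever the package installed) is up to you. It relies only on 'command -v' and on ldd printing "not found" for unresolved libraries.

```shell
# Hypothetical helper: list unresolved shared libraries for a command on PATH.
# Prints the "not found" lines from ldd; prints nothing if all libraries resolve.
# Returns 2 if the command itself is not on PATH.
missing_libs() {
    bin_path=$(command -v "$1") || {
        echo "$1: not on PATH" >&2
        return 2
    }
    # ldd reports an unresolved dependency as "libfoo.so => not found".
    ldd "$bin_path" 2>/dev/null | grep 'not found'
}
```

For example, 'missing_libs strigiclient' prints one line per missing library. Empty output only means that everything the binary was linked against can be found.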
Thanks for the reply. I have checked with ldd and have not found any missing libraries for strigiclient.
This is not a showstopper for me, but it could be a very useful tool. Perhaps when Sebastian Trüg has finished fixing k3b for KDE4, then this will advance a little further.
ldd will only show what the binary was linked against, not what it *could* have linked against. If there's a compile option to include some other libraries, that'll show up in the install or readme files of the source (I'd hope). I don't have everything to hand to check it myself right now, or I would. Maybe later on.