Has anybody managed to get strigi to index pdf files?
I have been playing around with strigi, the file indexing program, but have not been able to index .pdf files. From my research it seems that this should be possible, as strigi uses pdftotext to convert .pdf files to text for indexing. However, even for .pdf files that I have created from simple text files, indexing has not been successful.
I have tried this on slackware-current and slackware64, using both the supplied strigi-0.6.4 package and the latest strigi-0.6.5. (On slackware64 I added a symlink, 'ln -s /usr/lib64/gcj-4.3.3-9/libjvm.so /usr/lib64/libjvm.so', to enable Java, and then edited ~/.kde/share/config/nepomukserverrc to change the soprano backend to sesame2, the same as in slackware-current.)
If I run 'xmlindexer <some_pdf_file>', I do not see the expected text output, and no .pdf files are returned when I use 'strigiclient' to search for a known text string.
Yet strigi happily indexes OpenOffice .odt files and MS Word .doc files.
I am curious as to whether anybody has this working, and if so, were any tweaks required?
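For anyone wanting to reproduce this, here is a rough sketch of a check that separates a pdftotext problem from a strigi problem. It assumes, as described above, that strigi relies on pdftotext for .pdf files, which I have not confirmed from the source. The function names have_tool and check_pdf are my own inventions, not part of strigi.

```shell
#!/bin/sh
# Report whether a helper program can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: check that pdftotext alone can extract text
# from the given PDF, independently of strigi.
check_pdf() {
    pdf=$1
    if ! have_tool pdftotext; then
        echo "pdftotext not found on PATH"
        return 1
    fi
    # The "-" output name makes pdftotext write to stdout
    # instead of creating a .txt file next to the PDF.
    if pdftotext "$pdf" - | grep -q '[[:alnum:]]'; then
        echo "pdftotext extracts text from $pdf"
    else
        echo "pdftotext produced no text for $pdf"
        return 2
    fi
}
```

After sourcing the file, something like 'check_pdf test.pdf && xmlindexer test.pdf' runs the indexer only once pdftotext is known to work on that file. If check_pdf succeeds and xmlindexer still shows no text, the problem is more likely on the strigi side.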