Originally posted by limnephilidae
I am currently working on an open source project called GOOP. GOOP chews up documents and other forms of text in order to create metadata. This metadata is compared and shared over a P2P network (via JXTA), and the GOOP application automatically performs data comparisons (via the metadata) with nodes that it encounters. PDF files present a challenge because I need some way to convert them to text so that I can access them with GOOP.
What I need: an open source script or binary to convert PDF files to text. I would love it if it were in Java, but at this point I'll take anything.
Many thanks to any and all who can help.....
I am digging deep in my memory here, but I think a TeX distribution comes with a number of pdf2xxx tools, so my guess is that you should install one and give those tools a try.
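Since you asked for Java: if one of those command-line converters turns out to work, calling it from Java is not much code. Here is a rough sketch that assumes the tool is pdftotext (the converter from Xpdf/Poppler) and that it is on your PATH. I have not checked whether your TeX distribution actually ships it, so treat that as an assumption. The class and method names (PdfToText, buildCommand, extractText) are made up for illustration and have nothing to do with GOOP's real code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

public class PdfToText {

    // Builds the command line. Assumption: "pdftotext" is installed and on the PATH.
    // "-enc UTF-8" asks for UTF-8 output; the trailing "-" sends the text to stdout.
    static List<String> buildCommand(Path pdf) {
        return List.of("pdftotext", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Runs the external converter and returns everything it wrote to stdout.
    static String extractText(Path pdf) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore warnings on stderr
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftotext exited with status " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java PdfToText <file.pdf>");
            return;
        }
        System.out.print(extractText(Path.of(args[0])));
    }
}
```

This needs Java 11 or later. Shelling out keeps your application free of a PDF parser, at the cost of requiring the external binary on every node that runs it.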