Old 06-25-2003, 01:50 PM   #1
LQ Newbie
Registered: Jun 2003
Distribution: RedHat
Posts: 2

Rep: Reputation: 0
PDF to Text Conversion

I am currently working on an open source project called GOOP. GOOP chews up documents and other text forms in order to create metadata. This metadata is compared and shared over a P2P network (via JXTA), and the GOOP application automatically performs data comparisons (via the metadata) with the nodes that it encounters. PDF files present a challenge because I need some way to convert them to text so I can access them with GOOP.

What I need: An open source script or binary to convert PDF files to text. I would love it if it was in Java but at this point I'll take anything.

Many thanks to any and all who can help.....


**Reposted from Linux Software section**
Old 06-25-2003, 02:05 PM   #2
Senior Member
Registered: Apr 2001
Location: Perry, Iowa
Distribution: Mepis , Debian
Posts: 2,692

Rep: Reputation: 45
it includes a utility to extract .pdf to .txt
Old 06-25-2003, 02:06 PM   #3
Senior Member
Registered: Apr 2001
Location: Perry, Iowa
Distribution: Mepis , Debian
Posts: 2,692

Rep: Reputation: 45
never used it, your mileage may vary
Old 06-25-2003, 04:40 PM   #4
Registered: Sep 2002
Location: Canada
Distribution: Redhat 9.0
Posts: 637

Rep: Reputation: 30
As far as I know, PDF is not a format that should be convertible to text, because that would allow people to tamper with the original document. You can publish a PDF or PS document with OOo Writer, but thankfully you can't do anything about editing an existing PDF.
Old 06-26-2003, 03:24 PM   #5
Registered: Apr 2002
Location: D.C - USA
Distribution: slackware-current
Posts: 488

Rep: Reputation: 30
... unless you have the Adobe writer ... even then, poorly made PDFs (i.e., scanned ones) won't be editable.
Old 01-03-2012, 08:22 AM   #6
LQ Newbie
Registered: May 2006
Location: iowa, us
Distribution: Ubuntu, Red Hat
Posts: 16

Rep: Reputation: 0
solution here - pdftotext has a walk-through on using pdftotext on Linux.
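
Since the original question asked for something usable from Java, here is a minimal sketch of calling pdftotext from Java code. It assumes the pdftotext binary (shipped with Xpdf and with the Poppler utilities) is installed and on the PATH. The class and method names (PdfToText, buildCommand, extract) are made up for illustration and are not part of GOOP or of any library. Scanned, image-only PDFs will generally come back with little or no text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: shells out to the external pdftotext tool.
final class PdfToText {

    // Builds the command line: pdftotext -enc UTF-8 <pdf> -
    // The trailing "-" asks pdftotext to write the text to standard output
    // instead of creating a .txt file next to the PDF.
    static List<String> buildCommand(Path pdf) {
        return List.of("pdftotext", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Runs pdftotext on the given file and returns the extracted text.
    // Assumption: pdftotext is installed and reachable through PATH.
    static String extract(Path pdf) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf))
                // Let pdftotext's own error messages show up on our stderr.
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        // Read all of stdout before waiting, so the child never blocks on a full pipe.
        String text = new String(process.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftotext exited with status " + exitCode);
        }
        return text;
    }
}
```

From application code this would be used as `String text = PdfToText.extract(Paths.get("paper.pdf"));` and the resulting string handed to whatever builds the metadata. Shelling out keeps the Java side tiny, at the cost of depending on an external binary being present on every node.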


All times are GMT -5.
