03-01-2007, 03:10 PM | #1
Member | Registered: Mar 2005 | Posts: 81
How to extract Text from RTF files (or even DOC)
I regularly receive files in badly broken RTF that I need to convert to HTML in a quick and dirty way. My current process is:
1) Open the file in OpenOffice, select all, and copy to the clipboard.
2) Open a new, blank file in vi and paste the text into it.
3) Run a sed script against the file to do some quick and dirty formatting.
Step three works well and I am quite happy with it. I would like to eliminate or simplify steps one and two.
Is there a way to extract all the text from a document on the command line, producing the same kind of plain text I get from the steps above?
Thanks,
Skip

03-01-2007, 05:33 PM | #2
HCL Maintainer | Registered: Jan 2006 | Distribution: (H)LFS, Gentoo | Posts: 2,450
Have you taken a look at UnRTF or rtf2xml?
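
For what it's worth, here is a rough sketch of how UnRTF could replace steps one and two. It assumes the `unrtf` binary is installed and on your PATH, and it relies on UnRTF's `--text` and `--html` output options. The helper names `unrtf_command` and `extract` are made up for this example, and the output is assumed to be decodable in your locale's encoding.

```python
import shutil
import subprocess


def unrtf_command(rtf_path, output_format="text"):
    """Build the argument list for an UnRTF run (hypothetical helper)."""
    if output_format not in ("text", "html"):
        raise ValueError("output_format must be 'text' or 'html'")
    return ["unrtf", "--" + output_format, rtf_path]


def extract(rtf_path, output_format="text"):
    """Run UnRTF and return whatever it prints on standard output."""
    if shutil.which("unrtf") is None:
        raise RuntimeError("unrtf is not installed or not on PATH")
    result = subprocess.run(
        unrtf_command(rtf_path, output_format),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output to str using the locale encoding
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

The returned string can then be fed to the existing sed script, or `output_format="html"` can be tried to see whether UnRTF's own HTML is good enough. UnRTF may add a few header lines of its own to the output, so check the first lines before relying on it.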

03-01-2007, 07:36 PM | #3
LQ Newbie | Registered: Mar 2007 | Posts: 5
I agree with #2! You can use some tools!

03-02-2007, 07:37 AM | #4
Member | Registered: Mar 2005 | Posts: 81 | Original Poster
I will try those, but I was hoping for something more general, perhaps a sed script that could simply strip the text out of an RTF file.
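
A plain sed one-liner tends to struggle here because RTF nests its font and color tables inside braces. As a minimal sketch of the "just strip the text" idea, the Python below walks the file token by token, drops control words, and skips the groups that hold tables or pictures. It is not a full RTF parser: the function name `rtf_to_text` is made up for this example, `\'hh` escapes are assumed to be Windows-1252, and `\u` Unicode escapes are not decoded (only their fallback character is kept).

```python
import re

# Groups whose contents are formatting tables or binary data, not body text.
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

# Control words that stand for a character in the output.
SPECIAL = {"par": "\n", "line": "\n", "tab": "\t"}

TOKEN = re.compile(
    r"\\'([0-9a-fA-F]{2})"       # 1: hex escape such as \'e9
    r"|\\([a-zA-Z]+)(-?\d+)? ?"  # 2, 3: control word, optional number, one space
    r"|\\([^a-zA-Z'])"           # 4: control symbol such as \{ or \*
    r"|([{}])"                   # 5: group start or end
    r"|[\r\n]+"                  # raw line breaks carry no meaning in RTF
    r"|([^\\{}\r\n]+)"           # 6: plain text
)


def rtf_to_text(rtf):
    """Return the visible text of an RTF string (naive, best effort)."""
    out = []
    stack = []        # saved 'skipping' flags of the enclosing groups
    skipping = False  # True while inside a group we want to ignore
    for match in TOKEN.finditer(rtf):
        hexcode, word, _number, symbol, brace, text = match.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            if stack:
                skipping = stack.pop()
        elif skipping:
            continue
        elif word is not None:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif word in SPECIAL:
                out.append(SPECIAL[word])
        elif symbol is not None:
            if symbol == "*":
                skipping = True     # \* marks an optional destination group
            elif symbol in "{}\\":
                out.append(symbol)  # escaped literal brace or backslash
        elif hexcode is not None:
            out.append(bytes.fromhex(hexcode).decode("cp1252", errors="replace"))
        elif text is not None:
            out.append(text)
    return "".join(out)


sample = r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0 Hello, \b world\b0 !\par Caf\'e9\par}"
print(rtf_to_text(sample))
```

On the sample string this prints "Hello, world!" and "Café" on separate lines. The result could be written to a file and passed to the sed script from step three. Badly broken RTF may still trip it up, so a dedicated converter is probably the safer first choice.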

03-02-2007, 12:43 PM | #5
Member | Registered: Jan 2006 | Location: Chennai, Tamil nadu, India | Distribution: Debian Etch Testing | Posts: 117

03-02-2007, 12:57 PM | #6
Member | Registered: Mar 2005 | Posts: 81 | Original Poster
Thanks! Those look very useful.
All times are GMT -5.