LinuxQuestions.org


saravanan1979 06-27-2002 06:54 AM

PHP Script to parse Word/RTF Documents
 
Hello
Does anybody know of any PHP scripts that I could use to convert Word/RTF documents to HTML format? I run PHP on Red Hat Linux 6.2. For business reasons I am not in a position to install any additional libraries for this purpose. Can anyone kindly point me to PHP scripts that do this?

Mik 06-27-2002 09:11 AM

You are looking for a PHP script to display RTF documents in HTML format? Does that mean you want to convert them every time a user requests the document?
I don't know of any PHP scripts for that, but there are converters from RTF to HTML, such as unrtf: http://www.geocities.com/tuorfa/unrtf.html
You could run that on all your RTF docs so they can be read as HTML. If, however, your documents keep changing and you want to convert them on each request (a great resource hog), you could probably write a PHP script which runs the converter and then redirects to the generated HTML file. I think in that case you would be better off running a cron job at night which converts all the changed documents. That will take up double the disk space but will be a lot faster than having to convert the complete document each time the user requests it.

But those are just my ideas, maybe someone has already implemented a nicer solution for that.
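
The nightly batch idea could be sketched roughly like this. It is only an illustration: the function names needsConversion and convertChangedRtf are made up for the example, and it assumes the unrtf binary is installed, is on the PATH, and writes its HTML to standard output when called with --html.

```php
<?php
// Sketch of a batch converter meant to be run from cron.
// Assumes the unrtf command-line tool is installed; all function
// names here are hypothetical.

// True when the HTML copy is missing or older than the RTF source.
function needsConversion($rtfPath, $htmlPath)
{
    return !file_exists($htmlPath)
        || filemtime($htmlPath) < filemtime($rtfPath);
}

// Convert every changed .rtf file in $dir; returns the files converted.
function convertChangedRtf($dir)
{
    $converted = array();
    $files = glob($dir . '/*.rtf');
    if ($files === false) {
        return $converted;
    }
    foreach ($files as $rtfPath) {
        $htmlPath = substr($rtfPath, 0, -4) . '.html';
        if (!needsConversion($rtfPath, $htmlPath)) {
            continue;
        }
        // Write to a temporary file first so a failed run does not
        // leave a broken HTML copy that looks up to date.
        $tmpPath = $htmlPath . '.tmp';
        $cmd = 'unrtf --html ' . escapeshellarg($rtfPath)
             . ' > ' . escapeshellarg($tmpPath);
        exec($cmd, $output, $status);
        if ($status === 0 && rename($tmpPath, $htmlPath)) {
            $converted[] = $rtfPath;
        } else {
            @unlink($tmpPath);
        }
    }
    return $converted;
}
```

A small script that calls convertChangedRtf() on the document directory could then be scheduled from cron once a night, so that only files whose RTF source is newer than the HTML copy get converted again.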

saravanan1979 06-27-2002 09:24 AM

Dear Mik
Thanks a lot for your reply, but my requirement is slightly different. I want to read Word/Excel/RTF documents uploaded by the user with PHP scripts and display them to the user in HTML format. Since I run PHP on Linux I am not able to do that; on Windows I could do this by calling the MS Word or Excel object. Can you please help me solve this problem?

Enforcer 06-30-2002 08:57 AM

Hi!

Check out this script..

http://px.sklar.com/code.html?code_id=413

..or do a search at..

http://www.phpbuilder.com

..for something like "rtf html"; you will find some entries you may find interesting..

amp2000 07-02-2002 02:39 PM

Quote:

My requirement is slightly different. I want to read Word/Excel/RTF documents uploaded by the user with PHP scripts and display them to the user in HTML format. Since I run PHP on Linux I am not able to do that; on Windows I could do this by calling the MS Word or Excel object. Can you please help me solve this problem?
I have the exact same problem. Check out http://www.wvware.com/

I haven't checked it out yet, but it claims to do what we want.
Can you let me know if this works for you, as I won't get a chance to try it out for a couple of days?

Cheers

saravanan1979 07-03-2002 12:19 AM

wvWare is a Linux utility that you need to install on Linux and call from the shell through PHP's program execution functions. But this does not solve my problem: I am looking for something native in PHP, since my application runs on 3-4 servers and the library would need to be installed on all of them, which is an impossible task. I am still not successful. Check out this:
http://www.phpbuilder.com/columns/nair20020523.php3
It does give some guidance. I was able to partially trap the bold, italic and underline formatting in the Word document. If you come across some source, please do tell me; I will do the same.
Saravanan

amp2000 07-08-2002 06:11 PM

I don't know if this will be of any help, but here's a bit of an e-mail that might be worth looking at. I'm still looking for a solution, so post back here with anything you have, saravanan1979.
Cheers

-----------------------------------------------------------------------------------
>However, all you have to do is make sure you set the content type
>properly:
>
>header('Content-Type: application/msword');
>readfile('file.doc');
>
>XXXXXXX
>
>ps. If application/msword doesn't work, try application/octet-stream.
>
>pps. For a full list of mime types, see the mime.types file in your apache
>configuration.
>
>> Hi,
>>
>> I am trying to view Word Docs on our intranet.
>>
>> All I can get is a load of code.
>>
>> Someone suggested that I use wvHtml - www.wvware.com - to convert
>> the files
>> but I am not sure how to tell PHP to use it??
>>
>> I have used readfile but still junk!
-----------------------------------------------------------------------------------
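
Note that the suggestion in the quoted e-mail only hands the file to the browser so that a viewer on the client side can open it; it does not convert anything to HTML. A slightly fuller sketch of that approach, picking the Content-Type from the file extension, might look like this. The function names and the small extension table are assumptions made for the example.

```php
<?php
// Sketch: send a stored document to the browser with a matching
// Content-Type. Function names are hypothetical.

// Map a file name to a MIME type based on its extension.
function mimeTypeFor($filename)
{
    $types = array(
        'doc' => 'application/msword',
        'rtf' => 'application/rtf',
        'xls' => 'application/vnd.ms-excel',
    );
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    // Fall back to a generic binary type, as the e-mail suggests.
    return isset($types[$ext]) ? $types[$ext] : 'application/octet-stream';
}

// Output the headers and the raw file contents.
// Must be called before any other output is sent.
function sendDocument($path)
{
    header('Content-Type: ' . mimeTypeFor($path));
    header('Content-Length: ' . filesize($path));
    readfile($path);
}
```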

saravanan1979 07-09-2002 12:19 AM

Dear Amp2000
Your code above will not work, since Linux cannot read MS Word and RTF files natively. Of course header() can be used to transfer Word documents from one place to another, but it does not help to read the Word document.
But I have done both conversions using two utilities:
1. unrtf: RTF --> HTML
2. wvHtml: MS Word --> HTML

Both utilities have to be installed on Linux, and we need to call them from PHP using PHP's program execution functions. Both utilities work very well.
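
Calling the two utilities from PHP might look roughly like the sketch below. It is not the exact code used here: the function names are invented for the example, and it assumes that unrtf and wvHtml are on the PATH, that unrtf --html writes to standard output, and that wvHtml accepts a --targetdir option followed by the input file and an output file name (check the man pages on your system).

```php
<?php
// Sketch: pick a converter by file extension and run it.
// Function names are hypothetical; unrtf and wvHtml must be installed.

// Build the shell command, or return null for unsupported file types.
function buildConvertCommand($inputPath, $outputPath)
{
    $in  = escapeshellarg($inputPath);
    $ext = strtolower(pathinfo($inputPath, PATHINFO_EXTENSION));
    switch ($ext) {
        case 'rtf':
            // unrtf prints the HTML, so redirect it into the output file.
            return 'unrtf --html ' . $in . ' > ' . escapeshellarg($outputPath);
        case 'doc':
            // Assumed wvHtml usage: target directory plus output file name.
            return 'wvHtml --targetdir=' . escapeshellarg(dirname($outputPath))
                 . ' ' . $in . ' ' . escapeshellarg(basename($outputPath));
        default:
            return null;
    }
}

// Run the conversion; returns true when the tool exits with status 0.
function convertToHtml($inputPath, $outputPath)
{
    $cmd = buildConvertCommand($inputPath, $outputPath);
    if ($cmd === null) {
        return false;
    }
    exec($cmd, $output, $status);
    return $status === 0;
}
```

Because the documents are uploaded by users, every path goes through escapeshellarg() before it reaches the shell.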

javier.camacho 02-17-2010 03:45 PM

PHP script
 
You can try PHPWordLib:

PHPWordLib is a piece of PHP software intended to convert MS Word (.DOC) and Rich Text Format (.RTF or .DOC) files to plain text. The library is self-contained and does not require anything external in order to run. It has two simple functions, LoadFile and GetPlainText. It does all the necessary checking internally, and its functions will always return FALSE if something seems to be wrong with the input file.

It works on Linux, Windows and Solaris; you just need PHP. Google it.

kabirkhanna 02-17-2010 07:35 PM

Wordlib
 
Quote:

Originally Posted by javier.camacho (Post 3867359)
You can try PHPWordLib:

PHPWordLib is a piece of PHP software intended to convert MS Word (.DOC) and Rich Text Format (.RTF or .DOC) files to plain text. The library is self-contained and does not require anything external in order to run. It has two simple functions, LoadFile and GetPlainText. It does all the necessary checking internally, and its functions will always return FALSE if something seems to be wrong with the input file.

It works on Linux, Windows and Solaris; you just need PHP. Google it.

I looked at the Wordlib site and tried their demo, but it did not work. Have you used Wordlib successfully? If not, are there any other alternatives? Thanks.

saravanan1979 02-18-2010 07:25 AM

Quote:

Originally Posted by kabirkhanna (Post 3867546)
I looked at the Wordlib site and tried their demo, but it did not work. Have you used Wordlib successfully? If not, are there any other alternatives? Thanks.

I tried the ones I mentioned earlier (unrtf and wvHtml); they work fine as well.


All times are GMT -5.