Converting Windows doc/ppt files to HTML files under Linux
Hello everybody,
I have to convert Windows Word documents and ppt documents to HTML. This has to be done in a Linux environment with C++ as the programming language, and the target is a standalone application, which may use other libraries. I have already done the conversion from pdf to html using xpdf, which exposes the pdf structure. Yesterday I spent all day searching for an idea and came across OpenOffice, which apparently could give me the structure of a doc/ppt file, but for this the OpenOffice server must be running... so, no more standalone app. Can someone please point me to some documents to read about this? Thank you so much and have a good day! |
OpenOffice is more of a standalone application than a server, so why not leverage it and use it for batch processing?
See: http://www.xml.com/pub/a/2006/01/11/...penoffice.html http://www.indesko.com/en/downloads/ooo2dbk |
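The batch-processing idea above could be driven from a C++ program by shelling out to the office suite. This is only a minimal sketch, assuming a LibreOffice-style `soffice` binary on the PATH that accepts `--headless --convert-to html --outdir` (OpenOffice builds may not support these flags). The helper names `shell_quote`, `build_convert_command` and `convert_to_html` are hypothetical and not part of any library.

```cpp
#include <cstdlib>
#include <string>

// Quote one argument for a POSIX shell: wrap it in single quotes and
// escape any embedded single quote as '\''.
std::string shell_quote(const std::string& arg) {
    std::string out = "'";
    for (char c : arg) {
        if (c == '\'') out += "'\\''";
        else out += c;
    }
    out += "'";
    return out;
}

// Build the command line for a headless conversion to HTML.
// Assumption: the installed "soffice" understands these flags.
std::string build_convert_command(const std::string& input,
                                  const std::string& out_dir) {
    return "soffice --headless --convert-to html --outdir " +
           shell_quote(out_dir) + " " + shell_quote(input);
}

// Run the conversion; on POSIX systems a zero status from std::system
// means the shell command exited successfully.
bool convert_to_html(const std::string& input, const std::string& out_dir) {
    return std::system(build_convert_command(input, out_dir).c_str()) == 0;
}
```

Calling `convert_to_html("slides.ppt", "/tmp/out")` would then start the office suite without a GUI for that one file. The application still depends on the suite being installed, so it is "standalone" only in the sense that no server has to be kept running.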
Quote:
I also found the wvWare application, which seems to do a pretty good job, in terms of speed as well. Maybe I can hack on this application to reach my goal. But this will only solve the doc part; the ppt part still remains. Can anybody help me with this, please? |
OO handles ppt too.
|
Quote:
I came across ppthtml (which is part of xlhtml). It converts ppt to html, but the result is poor. I'll keep researching... |
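Since the thread so far names one tool for doc (wvWare) and another for ppt (ppthtml), a standalone application could pick the converter from the file extension. This is a rough sketch of that dispatch step only. The enum `Converter` and the functions `lower_extension` and `pick_converter` are hypothetical names, and how each tool is actually invoked is left out because it depends on the installed versions.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Which external tool would handle a given input file (illustrative only).
enum class Converter { WvWare, PptHtml, Unsupported };

// Return the lower-cased extension of the file name, or "" if there is none.
// A dot that belongs to a directory component is ignored.
std::string lower_extension(const std::string& path) {
    const std::string::size_type slash = path.find_last_of('/');
    const std::string::size_type dot = path.find_last_of('.');
    if (dot == std::string::npos ||
        (slash != std::string::npos && dot < slash)) {
        return "";
    }
    std::string ext = path.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return ext;
}

// Map the extension to a converter; anything else is reported as unsupported.
Converter pick_converter(const std::string& path) {
    const std::string ext = lower_extension(path);
    if (ext == "doc") return Converter::WvWare;
    if (ext == "ppt") return Converter::PptHtml;
    return Converter::Unsupported;
}
```

Dispatching on the extension keeps the sketch short. Checking the file's actual contents would be more robust, since a renamed file would otherwise be sent to the wrong tool.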
5.7 seconds doesn't look like a "big amount of time" to me; you are an impatient (wo)man! ;)
Moreover, you didn't try to convert the ppt, did you? |
Quote:
Also, I found some other applications that convert MS docs into either text or html: antiword, catdoc and word2x. Maybe I could draw some inspiration from them... :scratch: |
Well, if performance is an issue, use a faster machine!
|