LinuxQuestions.org
01-20-2008, 04:18 AM   #1
gawain
Member
 
Registered: Dec 2006
Location: Italy -Rome
Distribution: Slackware 11.0
Posts: 55

Rep: Reputation: 15
python and file formats


Hi everybody

Does Python open files in doc, pdf and html format?

I mean, for instance

file_input = open("file.doc", "r")
file_input_1 = open("file.doc", "w")


or do I have to stick to txt format?

Thanks
 
01-20-2008, 05:19 AM   #2
b0uncer
LQ Guru
 
Registered: Aug 2003
Distribution: CentOS, OS X
Posts: 5,131

Rep: Reputation: Disabled
open() only gives you the raw contents of the file; Python does not interpret the format for you. (You can open a file in text mode, "r", or in binary mode, "rb"; on Windows the difference matters.) Formats such as html, pdf and doc are defined by a known structure inside the file. For example, html is plain text where the content is put inside different kinds of tags, and the rendering application reads the tags, interprets them and displays the content accordingly.

So you can read html files (or any other files) in Python, but you only get the contents as they are stored, and you have to deal with the content type yourself: tell Python how and what to read, where, and how to represent it. There is no builtin "read pdf" or "read doc" function. There may be a third-party library for that if somebody has written one, but I'm not sure.
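To make the "raw contents" point concrete, here is a minimal sketch (Python 3) that writes a few bytes to a temporary file and reads them back in binary and in text mode. The file name and the sample bytes are made up for illustration; this is not a real Word document.

```python
import os
import tempfile

# Hypothetical stand-in for "file.doc": a few non-text bytes plus some text.
sample = bytes([0xD0, 0xCF, 0x11, 0xE0]) + b"Hello\r\nWorld"

path = os.path.join(tempfile.mkdtemp(), "file.doc")
with open(path, "wb") as f:
    f.write(sample)

# Binary mode: the bytes come back exactly as they are stored on disk.
with open(path, "rb") as f:
    raw = f.read()

# Text mode: the bytes are decoded to a str and line endings are normalised.
# latin-1 is used only because it can decode any byte value without error.
with open(path, "r", encoding="latin-1") as f:
    text = f.read()

print(raw)
print(repr(text))
```

Either way Python hands back what is in the file and nothing more. Text mode has even altered the data slightly (the "\r\n" was read as "\n"), which is why binary formats should be opened with "rb".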

If it were that easy to read and even write pdf/doc/..., why would we pay for MS Office and such?
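For html, "dealing with the content type yourself" can be sketched with the standard library's html.parser module. The class name TextExtractor, the helper html_to_text and the sample page are hypothetical names chosen for this example, and the approach is only one of several possible ones.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags, ignoring <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an html string as one line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up page used only to exercise the sketch.
page = (
    "<html><head><title>Demo</title><style>p {color: red}</style></head>"
    "<body><h1>Hello</h1><p>Some <b>bold</b> text.</p></body></html>"
)
print(html_to_text(page))
```

For a file on disk you would pass the result of open("file.html").read() to html_to_text instead of the inline string.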
 
01-20-2008, 05:40 AM   #3
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by gawain
Hi everybody

Does Python open files in doc, pdf and html format?

I mean, for instance

file_input = open("file.doc", "r")
file_input_1 = open("file.doc", "w")


or do I have to stick to txt format?

Thanks
For Word documents on Windows, you can use a COM object and the win32 module. See a sample here.
For PDF, you can use the third-party ReportLab library.
 
01-20-2008, 09:42 AM   #4
gawain
Member
 
Registered: Dec 2006
Location: Italy -Rome
Distribution: Slackware 11.0
Posts: 55

Original Poster
Rep: Reputation: 15
Thanks for the answers.

Do you mean that to do this
Quote:
So you can read html files (for example, or any other files) in Python, but it will just read the contents and you will have to deal with the content type yourself - tell Python how and what to read and where, and how to represent it.
I need this

Quote:
For Word documents on Windows, you can use a COM object and the win32 module. See a sample here.
For PDF, you can use the third-party ReportLab library
?

I don't understand whether these are one and the same approach or two different ones.
I don't need to change the files or manipulate the strings in them, just read them and extract some text.
 
  

