LinuxQuestions.org

satimis 07-27-2005 11:12 PM

Reading OpenOffice .doc document
 
Hi folks,

LFS LiveCD 6.1

Is there any way to "READ" OpenOffice .doc document?

I'm running the LFS LiveCD to build LFS. The LiveCD runs Xfce as its desktop and has no word processing software built in. My problem is how to read .doc documents stored on the HD. After mounting the corresponding partition on the HD I can retrieve a .doc document but cannot open or read it.

I'm not prepared to remaster the LiveCD. Is there any solution?

TIA

B.R.
satimis

tuxdev 07-28-2005 04:15 PM

Try to read it with your preferred text editor.

satimis 07-28-2005 05:40 PM

Hi tuxdev,

Quote:

try to read it with your preferred text editor.
I tried 'nano' and 'vim' without success. They are the editors on the LFS LiveCD.

B.R.
satimis

kencaz 07-28-2005 05:52 PM

I would suggest burning a Knoppix or other LiveCD that has OpenOffice. Load your .doc files, then export them as regular .txt files to be read in LFS 6.1.

KC

satimis 07-28-2005 06:10 PM

Hi kencaz,

Tks for your advice.

Quote:

I would suggest burning a Knoppix or other LiveCD that has OpenOffice......
I have another LiveCD with OO included. I prefer the LFS LiveCD because it boots faster thanks to its lightweight Xfce desktop. Besides, it has the same fs as the LFS 6.1 to be built. I also found some conflicting information about using Knoppix to build LFS. For these reasons I'm trying to solve the problem using the packages already on the HD, the host.

B.R.
satimis

archtoad6 08-06-2005 07:19 AM

OpenOffice.org (OOo) XML file format names
 
You mention "OpenOffice .doc document". Technically, .doc is an M$ format that OpenOffice.org supports. The native OpenOffice.org formats are all compressed XML (".sx?"). If by any chance you misspoke & actually need to read a .sxw file, then all you should have to do is unpack it before using your favorite/available text editor.
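A quick way to tell which case applies, sketched under the assumption that od and tr are available on the LiveCD: the OOo XML formats are ZIP packages, and ZIP files start with the two bytes "PK". The function name is_zip_package is made up for this example.

```shell
# Hypothetical helper: succeed if the file starts with the ZIP signature "PK" (hex 50 4b).
# A .sxw file is a ZIP package, so it passes; a binary M$ .doc does not.
is_zip_package() {
    # Dump the first two bytes as hex and strip the spacing od adds.
    sig=$(od -An -tx1 -N 2 "$1" | tr -d ' \n')
    [ "$sig" = "504b" ]
}
```

Usage would be something like: is_zip_package <target_file> && echo "ZIP package, unzip should work". Where file(1) is installed, it answers the same question in more detail.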

For everyone's (esp. mine) future reference, here is a table I copied & pasted from the OOo 1.1.3 help:
Code:

XML file format names

OpenOffice.org uses the following XML file formats:

Application                        File extension
OpenOffice.org Writer              *.sxw
OpenOffice.org Writer templates    *.stw
OpenOffice.org Calc                *.sxc
OpenOffice.org Calc templates      *.stc
OpenOffice.org Impress             *.sxi
OpenOffice.org Impress templates   *.sti
OpenOffice.org Draw                *.sxd
OpenOffice.org Draw templates      *.std
OpenOffice.org Math                *.sxm
Master documents                   *.sxg


satimis 08-06-2005 07:30 AM

Re: OpenOffice.org (OOo) XML file format names
 
Hi archtoad6,

Tks for your advice.

Quote:

....to read a .sxw file, then all you should have to do is unpack it before using your favorite/available text editor.....
Please explain how to unpack a .sxw file. Usually in OO Writer I just click File -> Open and then click the file to read it.

TIA

B.R.
satimis

archtoad6 08-06-2005 09:10 AM

From a little further down in the "XML File Formats" section of OOo help:
Quote:

XML file structure
The OpenOffice.org XML file formats are compressed according to the ZIP method. Use an unpacking program of your choice to unpack the content of an XML file with its subdirectories. You see a structure similar to the following illustration.

<could not paste image>

The text content of the document is located in content.xml.
By default, content.xml is stored without formatting elements like indentation or line breaks to minimize the time for saving and opening the document. On the Tools - Options - Load/Save - General tab page you can activate the use of indentations and line breaks by clearing the check box Size optimization for XML format (no pretty printing).
The file meta.xml contains the meta information of the document, which you can enter under File - Properties.
If you save a document with a password, all XML files except meta.xml will be encrypted.
The file settings.xml contains further information about the settings for this document.
In styles.xml, you find the Styles applied to the document that can be seen in the Stylist.
The meta-inf/manifest.xml file describes the structure of the XML file.
Additional files can be contained in the packed file format. For example, illustrations can be contained in a Pictures subdirectory, Basic code in a Basic subdirectory, and linked Basic libraries in further subdirectories of Basic.

unzip works, I tried it.

Warning: the actual text is in content.xml, which consists of 1 line of XML declaration & 1 loooong line of "content". The actual words of the content are almost at the end, in a series of "<text:p ... >" tags. Good luck reading it. less is probably just as good as vi.
Code:

$ unzip -d <extract_dir> <target_file>.sxw
$ cd <extract_dir>
$ less content.xml

If you need to work w/ the text, you probably should load Knoppix once & save your target files in .txt format. Otherwise I see vi macros, or sed or awk scripts, with hairy regexes in your future.
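For what it's worth, a rough sketch of such an awk script, assuming awk is on the LiveCD. The function name sxw_text is made up for this example. It only turns the ends of "<text:p>" and "<text:h>" elements into line breaks, strips all other tags, and decodes the five predefined XML entities, so tabs, tables, and other layout are lost.

```shell
# Hypothetical helper (not from the OOo help): reads content.xml on stdin
# and prints one paragraph or heading per line on stdout.
sxw_text() {
    awk '{
        gsub(/<\/text:[ph]>/, "\n")   # paragraph and heading ends become line breaks
        gsub(/<[^>]*>/, "")           # drop every remaining tag
        gsub(/&lt;/, "<")             # decode the predefined XML entities
        gsub(/&gt;/, ">")
        gsub(/&quot;/, "\"")
        gsub(/&apos;/, "\047")        # \047 is the single quote character
        gsub(/&amp;/, "\\&")          # ampersand last, so decoded text is not decoded twice
        n = split($0, lines, "\n")
        for (i = 1; i <= n; i++)
            if (lines[i] != "") print lines[i]   # skip empty lines
    }'
}
```

Combined with unzip's -p option, which writes a member to stdout, nothing needs to be unpacked to disk:

$ unzip -p <target_file>.sxw content.xml | sxw_text | less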


All times are GMT -5.