Need to convert man page to .odt or .doc and back again
Hi all,
I would like to know if there is a utility to convert a man page file to a Libre Office .odt or .doc file and then back again?
I'm working on adding a lot of features and functions to an existing package and of course wish to update its man page accordingly.
So, here's what I'm hoping to do:
(1) Convert the man page file to an .odt or .doc file.
(2) Open it with Libre Office Writer and edit it.
(3) Convert the edited .odt or .doc file back to a man page file.
NOTE that I need to preserve the existing formatting, since the man page utilizes bold and underline.
BTW, my system is Debian 8, x86_64 if that makes any difference.
Take a look at http://docbook.org for a glimpse "behind the scenes" at the technology that is used at the source of many forms of documentation – including man-pages. There is a root-source document from which man-page content is generated, and this same technology can generate other forms of output.
For instance, all of those O'Reilly Books that we used to pore over, were all written in DocBook, and all of the derivative publications that they produced came from the same sources.
So, you wouldn't take the man-page-formatted output: you'd go back to its XML source, and produce your other output from it.
Welcome to the wondrous world of "XSLT."
Last edited by sundialsvcs; 07-31-2017 at 11:39 AM.
NOTE that I need to preserve the existing formatting, since the man page utilizes bold and underline.
Formatting or structure? The data format for man pages contains a heck of a lot of structural information that you would 100% lose if you were to just look at the layout and presentation.
Anyway, I digress. Your approach will not work to get to your stated goal of building out an existing man page. The only way to do that is to use the format they use. See the macros documentation for that:
Code:
man 7 man
It's not difficult and it is precise.
Over in OpenBSD, there have been a lot of refinements in regards to manuals:
I was really looking for a utility where my lazy self could just do something like this:
$utility < manpage.1 > manpage.doc
And, by the way, my source file isn't in XML format, it looks like this (a small snippet of course):
.IP "\fBdefdbl\fP \fIvariable\fP[\fB\-\fP\fIvariable\fP]" "{{{
Declare the global \fIvariable\fP as real. Only unqualified variables (no
type extension) can be declared. Variable ranges can only be built from
single letter variables. Declaration is done at compile time.
."}}}
.IP "\fBdefint\fP \fIvariable\fP[\fB\-\fP\fIvariable\fP]" "{{{
Declare the global \fIvariable\fP as integer.
."}}}
.IP "\fBdefstr\fP \fIvariable\fP[\fB\-\fP\fIvariable\fP]" "{{{
Declare the global \fIvariable\fP as string.
."}}}
...which (after conversion) in the document editor would look like this:
defdbl variable[-variable]
Declare the global variable as real. Only unqualified variables (no type extension) can be declared. Variable ranges can only be built from single letter variables. Declaration is done at compile time.
defint variable[-variable]
Declare the global variable as integer.
defstr variable[-variable]
Declare the global variable as string.
...and when converted back to a .man file would display the same way (aside from breaking to fit the current screen width).
No. Sorry. Typographic markup is about the best that word processors can do. There can be some structural markup at the block level (headings, paragraphs, etc.) but that is not enough.
I reiterate my suggestion to look at the original format:
Anyway, I digress. Your approach will not work to get to your stated goal of building out an existing man page. The only way to do that is to use the format they use. See the macros documentation for that
I find that difficult to believe. If I had to do this more than once, I could easily write a console-mode (CLI) utility to convert the "\fB" and "\fI" escapes to bold and underline and then generate a .doc file.
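The font-escape part of such a utility can be sketched in a few lines. What follows is a minimal, hypothetical Python sketch: the function name roff_fonts_to_html and the choice of HTML as an intermediate format that a word processor can open are assumptions, not part of any existing tool. It handles only the \fB, \fI, \fP, \fR and \- escapes seen in the snippet above and ignores macros such as .IP entirely.

```python
import html
import re

# Font escapes seen in the snippet above: \fB (bold) and \fI (italic, which
# terminals usually show as underline). \fP really means "previous font";
# this sketch simplifies it to "back to plain text".
FONT_TAGS = {"B": "b", "I": "u"}


def _plain(chunk):
    # \- is the roff minus sign; everything else is escaped for HTML.
    return html.escape(chunk.replace("\\-", "-"))


def roff_fonts_to_html(text):
    """Convert \\fB, \\fI, \\fP and \\fR escapes in a string to HTML tags."""
    out = []
    open_tag = None
    pos = 0
    for match in re.finditer(r"\\f([BIPR])", text):
        out.append(_plain(text[pos:match.start()]))
        pos = match.end()
        if open_tag:  # close whichever font is currently active
            out.append("</%s>" % open_tag)
            open_tag = None
        tag = FONT_TAGS.get(match.group(1))
        if tag:  # \fP and \fR open nothing, they only end the font
            out.append("<%s>" % tag)
            open_tag = tag
    out.append(_plain(text[pos:]))
    if open_tag:
        out.append("</%s>" % open_tag)
    return "".join(out)


if __name__ == "__main__":
    print(roff_fonts_to_html(r"\fBdefdbl\fP \fIvariable\fP[\fB\-\fP\fIvariable\fP]"))
```

This covers only the easy direction. The resulting markup says which words are bold or underlined, but not that the line was the tag of an .IP entry, so a converter going back to man source would have to guess that part.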
I also can't imagine that in 2017 people still use awkward cryptic coding to make a man page.
It SHOULD be as easy as firing up a word processor, writing the text (complete with formatting) and then "save as man page".
Or, fire up the word processor, LOAD the man page file and have it auto-converted to standard "document" format, which could be edited then saved back as a man page.
Yes. Unfortunately. The information theory behind it does not allow for it. See the difference between typographic markup and structural markup. You'd have to add a whole new level of complexity to your text editor, a level that has previously been available only in full-blown XML or SGML editors. That level of detail takes a lot of effort to work with. I've done that at large scales in the past back when SGML was all the rage.
You and I could probably have a lazy-contest, but it'd be too much effort. I've made small changes to lots of manual pages. I found it far easier to work with the macros than previous work with XML or SGML. My approach has been to just fire up a text editor and use the macros. Compared to the high-powered electronic publishing methods, it's worlds easier.
Wow. I'm actually amazed. It LOOKS like it should be simple to do.... I guess I'm missing something important.
Just putting bold or italics here and there would be simple, but there is much more actually going on under the hood. I think the current buzzword might be "semantic markup" rather than "structural markup". If you look up the former, you might find some relevant material.
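A small sketch may make the information-loss argument concrete. It assumes a handful of macros from the semantic mdoc(7) language used for OpenBSD manuals, together with the fonts they are commonly rendered in; the table is illustrative and should be checked against your formatter's documentation. Several macros share one font, so the mapping from meaning to appearance cannot be inverted.

```python
# Assumed rendering of a few mdoc(7) macros; an illustration only, not a
# complete or authoritative table.
MACRO_TO_FONT = {
    "Nm": "bold",       # name of the utility
    "Fl": "bold",       # command-line flag
    "Cm": "bold",       # command modifier
    "Ar": "underline",  # command argument
    "Va": "underline",  # variable name
    "Em": "underline",  # plain emphasis
}


def candidates(font):
    """Return every macro that could have produced text in this font."""
    return sorted(m for m, f in MACRO_TO_FONT.items() if f == font)


if __name__ == "__main__":
    # A word processor only remembers "underline"; three meanings fit.
    print(candidates("underline"))
    print(candidates("bold"))
```

Going from macro to font is a simple lookup, while going from font back to macro yields several candidates, and a word processor file keeps only the font.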