LinuxQuestions.org
Old 10-21-2017, 06:36 AM   #1
Michael Uplawski
Senior Member
 
Registered: Dec 2015
Posts: 1,620
Blog Entries: 40

Rep: Reputation: Disabled
Looking for a tool to produce Markdown from HTML, man-page or Restructured Text


Hi.

I want to update a blog-post on LQ with the content of an updated man-page.

This was cumbersome the last time and I do not feel like doing it again.

The original text is written in reStructuredText and then converted to man, PDF and/or HTML.
Do you see a way to have it converted to the markdown syntax that LQ understands?

Online converters are not up to the task: the HTML is either too long or just not the right kind of HTML, and in any case the markdown they produce is probably not good either.

TIA,

Michael
 
Old 10-21-2017, 06:52 AM   #2
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,295
Blog Entries: 3

Rep: Reputation: 3719
I'd look at one of these: http://neilb.org/reviews/markdown.html
such as HTML::WikiConverter::Markdown.

Most of the information is being thrown away anyway, so you could do something simple with HTML::Parser or XML::TreeBuilder.
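
To illustrate the "throw most of it away" idea, here is a minimal sketch in Python using the standard library's html.parser rather than the Perl modules named above. The class name, the function name and the small tag table are assumptions made up for this example; only a handful of tags are translated and everything else is silently dropped.

```python
from html.parser import HTMLParser


class MiniMarkdown(HTMLParser):
    """Keep text plus a handful of tags; silently drop everything else."""

    # Hypothetical mapping for this sketch: tag -> Markdown delimiter.
    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in ("h1", "h2", "h3"):
            # "h2" becomes "## " on a fresh paragraph.
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n\n")

    def handle_endtag(self, tag):
        # Inline markers are symmetric, so closing tags emit the same text.
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html):
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    print(html_to_markdown("<h1>NAME</h1><p><strong>foo</strong> - does <em>bar</em></p>"))
```

Lists, links and preformatted blocks are not handled; they would need further branches in the two tag handlers.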

Last edited by Turbocapitalist; 10-21-2017 at 06:54 AM.
 
Old 10-21-2017, 09:54 AM   #3
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
pandoc.
 
Old 10-21-2017, 02:48 PM   #4
Shadow_7
Senior Member
 
Registered: Feb 2003
Distribution: debian
Posts: 4,137
Blog Entries: 1

Rep: Reputation: 874
+1 for pandoc. It's what one uses to convert .md to HTML, and I'm sure it handles the other way too.
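
Since the source in this thread is reStructuredText, pandoc can be pointed at the .rst file directly instead of at the generated HTML. The following is a sketch only: the two function names and the file name in the usage line are made up for illustration, while --from and --to are pandoc's own options for the input and output formats. pandoc has to be installed for convert() to work.

```python
import shutil
import subprocess


def pandoc_command(source, source_format):
    """Build the pandoc argument list for converting a file to Markdown."""
    return ["pandoc", "--from", source_format, "--to", "markdown", source]


def convert(source, source_format):
    """Run pandoc and return the Markdown text; needs pandoc on the PATH."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed")
    result = subprocess.run(
        pandoc_command(source, source_format),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "manpage.rst" is a placeholder file name.
    print(" ".join(pandoc_command("manpage.rst", "rst")))
```

The same call with "html" as the format covers the HTML version of the page. The Markdown that comes out may still need adapting to whatever markup the target site accepts.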
 
Old 10-22-2017, 12:57 PM   #5
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
Quote:
Originally Posted by Shadow_7
pandoc, it's what one uses to convert .md to html
nah, that would be markdown, actually.
___

but markdown doesn't do
Quote:
the other way too.
 
Old 10-22-2017, 01:53 PM   #6
Michael Uplawski
Senior Member
 
Registered: Dec 2015
Posts: 1,620

Original Poster
Blog Entries: 40

Rep: Reputation: Disabled
Thank you.

I have never touched Perl.
Pandoc does something, but obviously I am not seeing the point.

Solution: vi

Code:
:%s/<p><strong>\([^<]*\)<\/strong><\/p>/\r[B]\1[\/B]/g
:%s/<strong>\([^<]*\)<\/strong>/[B]\1[\/B]/g
:%s/<em>\([^<]*\)<\/em>/[I]\1[\/I]/g
and so on ...
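
For repeated use, the same substitutions can live in a small script instead of being typed into vi each time. This is a sketch in Python that mirrors the three rules above. The function name and the rule table are made up for this example, and the paragraph rule runs first because the plain strong rule would otherwise consume its match.

```python
import re

# (pattern, replacement) pairs mirroring the vi substitutions.
RULES = [
    # A paragraph that is entirely bold goes on a line of its own.
    (re.compile(r"<p><strong>(.*?)</strong></p>"), r"\n[B]\1[/B]"),
    (re.compile(r"<strong>(.*?)</strong>"), r"[B]\1[/B]"),
    (re.compile(r"<em>(.*?)</em>"), r"[I]\1[/I]"),
]


def html_to_bbcode(text):
    """Apply each rule in order; tags without a rule are left untouched."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    sample = "<p><strong>NAME</strong></p><p>Use <em>foo</em> and <strong>bar</strong></p>"
    print(html_to_bbcode(sample))
```

The non-greedy (.*?) keeps two bold spans on one line from being merged into a single match. Like the vi version, this only works on simple, unnested markup.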


Solution: I uploaded the HTML-version of the man-page to my web-site.

Last edited by Michael Uplawski; 10-22-2017 at 02:10 PM.
 
Old 10-23-2017, 01:48 AM   #7
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,295
Blog Entries: 3

Rep: Reputation: 3719
Perl is pretty easy if you keep it short.

Code:
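#!/usr/bin/perl

use strict;
use warnings;

use HTML::WikiConverter;

# Read from the file named on the command line, or from standard input.
my $file = shift || '/dev/stdin';

open(my $fh, '<', $file)
    or die("Can't open $file: $!\n");

# Slurp the whole file by temporarily undefining the record separator.
my $html = do { local $/; <$fh> };

close($fh);

# The Markdown dialect needs HTML::WikiConverter::Markdown installed.
my $wc = HTML::WikiConverter->new( dialect => 'Markdown' );

print $wc->html2wiki( $html );

exit(0);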
That module converts to several dialects. See "man HTML::WikiConverter" for the details.
 
Old 10-24-2017, 12:44 AM   #8
Michael Uplawski
Senior Member
 
Registered: Dec 2015
Posts: 1,620

Original Poster
Blog Entries: 40

Rep: Reputation: Disabled
Thanks a lot for the example.

 
Old 10-24-2017, 01:26 AM   #9
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,295
Blog Entries: 3

Rep: Reputation: 3719
No problem. A code sample is sometimes needed to get started the first time. There are actually two examples in that script, either of which is handy to reuse. Here is a shorter example with fewer distractions:

Code:
#!/usr/bin/perl

use HTML::WikiConverter;

use strict;
use warnings;

my $file = shift || '/dev/stdin';

my $wc = HTML::WikiConverter->new( dialect => 'Markdown' );

# html2wiki can read the file itself when given file => PATH.
print $wc->html2wiki( file => $file );

exit( 0 );
The manual pages for the CPAN modules nearly always have the essentials in an example.

Code:
man HTML::WikiConverter
 
  



Tags
html2markdown, lq blog, man2markdown, rst2markdown


