LinuxQuestions.org
Old 01-31-2005, 02:20 PM   #1
fiomba
Member
 
Registered: Sep 2004
Posts: 63
Thanked: 0
Merge Of Html Files Into A Single Html (or Pdf)


Using wget (the '-i' option) you can download several pages from the Internet,
which can represent the various parts of a lengthy document (manual or tutorial).

Is there software for Linux that merges the HTML pages
into a single HTML file (or better, into a PDF file)?
Old 01-31-2005, 02:43 PM   #2
Artanicus
Member
 
Registered: Jan 2005
Location: Tampere, Finland
Distribution: Gentoo, Slackware
Posts: 818
Thanked: 0
Well, none that I've heard of, but imho such a simple task doesn't need a specific program.
I'll demonstrate via an example:
Code:
# Drop the closing </body> and </html> lines from each page (-i already ignores case)
grep -iv -e "</body>" -e "</html>" downloads/html/site/page1.html > downloads/html/site/all.html
grep -iv -e "</body>" -e "</html>" downloads/html/site/page2.html >> downloads/html/site/all.html
grep -iv -e "</body>" -e "</html>" downloads/html/site/page3.html >> downloads/html/site/all.html
#..repeat as long as necessary..
echo "</body></html>" >> downloads/html/site/all.html
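If the code is nicely formed (i.e. the ending tags </html> and </body> sit on their own line or lines), this will just strip them out and merge everything into a single html file, which can be converted to whatever you want. Keep in mind that grep -v drops whole lines, so anything else sharing a line with those tags is lost too.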

The result of the site merge won't be pretty, and it can break the code, but that's what you get when automating this kind of stuff anyway.

The above code would still need to be modified to take out the start tags of the new files as well, but you hopefully get my point and can modify it yourself. That's the way to learn anyway. (:
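For anyone who wants a starting point for that modification, here is a rough sketch. The function name merge_pages is made up for illustration and is not from the post above. It assumes the same thing as the original example, namely that the <html>, <body>, </body> and </html> tags each sit on their own line. It does not touch <head> sections or DOCTYPE lines of the later pages, so those still end up in the output.

```shell
# merge_pages OUT PAGE...   (hypothetical helper, not from the thread)
# Keeps the opening tags of the first page, strips the opening <html>/<body>
# lines from every later page, strips the closing tag lines from all pages,
# and appends a single closing pair at the end.
merge_pages() {
    out=$1
    shift
    first=1
    : > "$out"    # start with an empty output file
    for page in "$@"; do
        if [ "$first" -eq 1 ]; then
            # first page: only drop the closing tag lines
            grep -iv -e '</body>' -e '</html>' "$page" >> "$out"
            first=0
        else
            # later pages: drop opening and closing tag lines
            grep -iv -e '</body>' -e '</html>' -e '<html' -e '<body' "$page" >> "$out"
        fi
    done
    echo '</body></html>' >> "$out"
}
```

Usage would look like: merge_pages downloads/html/site/all.html downloads/html/site/page1.html downloads/html/site/page2.html downloads/html/site/page3.html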

Last edited by Artanicus; 01-31-2005 at 02:56 PM..
Old 02-10-2005, 07:46 PM   #3
fiomba
Member
 
Registered: Sep 2004
Posts: 63
Thanked: 0

Original Poster
Thank you for your reply. I have followed what you suggested (but in a more drastic way):
I simply merged the html files as they were, without worrying about having, for example, 100 <html> tags.
For my purposes, that is, to print the manual or tutorial, it worked perfectly.

The only problem is getting the correct list of html files as defined by the Table of Contents, and merging them in that order.
But with a little script...
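A possible shape for such a little script, as a sketch only. It assumes the Table of Contents is itself an html page (called index.html in the usage line, which is a guess) that links to the chapters with double-quoted relative href="..." attributes, and that GNU grep is available for the -o option. The function names toc_order and merge_in_toc_order are invented here. Links with single quotes or absolute URLs are not handled.

```shell
# toc_order INDEX
# Print the .html files linked from INDEX, in order of first appearance.
# Anchors such as ch1.html#sec2 are reduced to ch1.html, duplicates are dropped.
toc_order() {
    grep -oi 'href="[^"#]*\.html' "$1" | sed 's/^[^"]*"//' | awk '!seen[$0]++'
}

# merge_in_toc_order INDEX OUT
# Concatenate the linked pages as they are (no tag cleanup) into OUT.
merge_in_toc_order() {
    dir=$(dirname "$1")
    : > "$2"    # start with an empty output file
    toc_order "$1" | while IFS= read -r page; do
        # skip links that do not exist as local files next to INDEX
        if [ -f "$dir/$page" ]; then
            cat "$dir/$page" >> "$2"
        fi
    done
}
```

Usage would look like: merge_in_toc_order downloads/html/site/index.html /tmp/all.html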
Old 05-30-2007, 02:37 PM   #4
guthrie
LQ Newbie
 
Registered: Jul 2003
Location: Iowa
Distribution: Debian
Posts: 21
Thanked: 0
merge HTML's to PDF

See: HTMLDOC.

It is open source.
http://www.htmldoc.org/
(A $$ version adds a GUI)
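A hedged sketch of how the command-line version might be used for the pages discussed in this thread. The wrapper name html_to_pdf and the file names are made up for illustration. The --webpage and -f options are taken from the htmldoc manual as I remember it, so check htmldoc --help on your system before relying on them.

```shell
# html_to_pdf OUT.pdf PAGE...   (hypothetical wrapper, not part of htmldoc)
# Hands the pages to htmldoc in the order given and writes one PDF.
html_to_pdf() {
    out=$1
    shift
    if ! command -v htmldoc >/dev/null 2>&1; then
        echo "htmldoc is not installed" >&2
        return 1
    fi
    # --webpage: treat the input as plain web pages rather than a structured book
    # -f FILE:   write the result to FILE
    htmldoc --webpage -f "$out" "$@"
}
```

Usage would look like: html_to_pdf manual.pdf page1.html page2.html page3.html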
All times are GMT -5. The time now is 02:33 PM.
