LinuxQuestions.org
Old 03-21-2010, 02:30 PM   #1
jan.goyvaerts
LQ Newbie
 
Registered: Mar 2010
Posts: 4

Rep: Reputation: 0
How to merge linked HTML pages into a single PDF


Hi!

I have a book in HTML format, archived in a single zip file. Starting from "index.html", there are links to the chapters, from there to the sections, and so on.

I'd like to make a single PDF so I can annotate it with Okular.

I have found many tools that convert a *single* HTML page to PDF, but none yet that can also follow the links between pages to build a single document out of them. It would be nice if the links could be preserved, but getting everything into one neat PDF is the most important thing.

Does anyone here know of a tool or method to do that?

Thanks in advance!

Jan
 
Old 03-21-2010, 03:53 PM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,414

Rep: Reputation: 1966
Well, HTML links are non-linear, so whilst it might work in your example, in general you can't flatten linked pages into a single document like that. Personally, I'd strip the header and footer sections from each page and concatenate what's left to make a single page.
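
A minimal sketch of that strip-and-concatenate idea in Python, using only the standard library. It assumes "header and footer" means everything outside the <body> element; page-specific navigation bars inside the body would need extra handling. The function names are hypothetical.

```python
import re

# Matches the contents of the <body> element, whatever its attributes or case.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def extract_body(html):
    """Return the markup between <body> and </body>.

    Falls back to the whole input if no body element is found.
    """
    match = BODY_RE.search(html)
    return match.group(1).strip() if match else html.strip()


def merge_pages(pages, title="Merged book"):
    """Wrap the bodies of all pages, in the given order, in one HTML document."""
    bodies = "\n".join(extract_body(page) for page in pages)
    return (
        "<html><head><title>" + title + "</title></head>\n"
        "<body>\n" + bodies + "\n</body></html>\n"
    )
```

The merged string can then be written to one file and passed to whichever HTML-to-PDF converter you already use for single pages.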
 
Old 03-21-2010, 04:01 PM   #3
smoker
Senior Member
 
Registered: Oct 2004
Distribution: Fedora Core 4, 12, 13, 14, 15, 17
Posts: 2,279

Rep: Reputation: 248
You could always try htmldoc.

Last edited by smoker; 03-21-2010 at 04:07 PM.
 
Old 03-21-2010, 04:03 PM   #4
jan.goyvaerts
LQ Newbie
 
Registered: Mar 2010
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by smoker
You could always try htmldoc.
I did, but the book has quite a lot of pages, so I'm hoping some software can do the same AND follow the links too. Damn...
 
Old 03-21-2010, 04:05 PM   #5
jan.goyvaerts
LQ Newbie
 
Registered: Mar 2010
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by acid_kewpie
Well, HTML links are non-linear, so whilst it might work in your example, in general you can't flatten linked pages into a single document like that. Personally, I'd strip the header and footer sections from each page and concatenate what's left to make a single page.
That's a possibility. But how do I specify the page breaks for the final PDF conversion?
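
One possible answer, sketched in Python: put an explicit break marker between the concatenated fragments. The default marker below uses the CSS 2 `page-break-before` property; whether it is honoured depends on the converter, so treat it as an assumption to test. HTMLDOC's manual describes a `<!-- PAGE BREAK -->` comment for the same purpose, which could be passed as the marker instead. The function name is hypothetical.

```python
# Default marker: an empty block that asks a CSS-aware converter to start a new page.
PAGE_BREAK = '<div style="page-break-before: always"></div>'


def join_with_page_breaks(fragments, marker=PAGE_BREAK):
    """Join HTML fragments so that every fragment after the first follows a break marker."""
    return ("\n" + marker + "\n").join(fragments)
```

No marker is placed before the first fragment or after the last one, so the PDF should not start or end with an empty page.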
 
Old 03-21-2010, 04:26 PM   #6
smoker
Senior Member
 
Registered: Oct 2004
Distribution: Fedora Core 4, 12, 13, 14, 15, 17
Posts: 2,279

Rep: Reputation: 248
Quote:
Originally Posted by jan.goyvaerts
I did, but the book has quite a lot of pages. So I am hoping some software could do the same AND follow the links too. Damn...
I don't understand; htmldoc can handle as many pages as you give it.

There are even RPM packages for Fedora as standard; I don't know about other distros.

It will create links for you, but I don't think it will convert existing links. Each new page gets its own link.

http://www.easysw.com/htmldoc/docfiles/3-books.html
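
Since htmldoc takes the pages as a list of files, the link-following part can be done separately. The Python sketch below walks the local links depth-first from a start page and builds an htmldoc command line from the result. It assumes the reading order is the order in which pages are first linked, and that all pages end in .html or .htm. The names `collect_pages`, `htmldoc_command` and `read_page` are hypothetical; `--book` and `-f` are htmldoc's book-mode and output-file options.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def collect_pages(start, read_page):
    """Return local pages reachable from `start`, in first-visit (depth-first) order.

    `read_page` is any callable that maps a relative path to its HTML text.
    """
    order = []

    def visit(path):
        if path in order:
            return
        order.append(path)
        collector = LinkCollector()
        collector.feed(read_page(path))
        for href in collector.links:
            target, _fragment = urldefrag(href)
            # Skip same-page anchors, external URLs and non-HTML targets.
            if not target or "://" in target or not target.endswith((".html", ".htm")):
                continue
            # Links are relative to the page that contains them.
            base = posixpath.dirname(path)
            visit(posixpath.normpath(posixpath.join(base, target)))

    visit(start)
    return order


def htmldoc_command(pages, output="book.pdf"):
    """Build an argument list that runs htmldoc in book mode on the given pages."""
    return ["htmldoc", "--book", "-f", output] + list(pages)
```

With the unzipped book in the current directory, `read_page` could be `lambda p: open(p, encoding="utf-8", errors="replace").read()`, and the resulting list could be run with `subprocess.run(htmldoc_command(order), check=True)`, assuming htmldoc is installed.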
 
Old 03-21-2010, 04:44 PM   #7
jan.goyvaerts
LQ Newbie
 
Registered: Mar 2010
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by smoker
I don't understand; htmldoc can handle as many pages as you give it.

There are even RPM packages for Fedora as standard; I don't know about other distros.

It will create links for you, but I don't think it will convert existing links. Each new page gets its own link.

http://www.easysw.com/htmldoc/docfiles/3-books.html
Does it keep the links between the pages? Say, a link from chapter 2 to chapter 10?
 
Old 03-21-2010, 05:12 PM   #8
smoker
Senior Member
 
Registered: Oct 2004
Distribution: Fedora Core 4, 12, 13, 14, 15, 17
Posts: 2,279

Rep: Reputation: 248
I've just installed it locally (I normally use it on a server) and the GUI is perfect.
Yes, it does follow the links within your pages. I tested it with a three-page setup, and from page 3 you can jump back to page 1 or page 2 using an existing HTML link.

But it's quite picky about style. You must use an H1 heading on each page or it will ignore the page. It uses H1 headings as chapter markers.
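
Given that requirement, it may be worth checking the pages before converting. A small Python sketch, assuming the pages are already loaded into a dict of file name to HTML text (the function names are hypothetical):

```python
from html.parser import HTMLParser


class H1Finder(HTMLParser):
    """Remember whether an <h1> start tag was seen (tag names arrive lowercased)."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.found = True


def has_h1(html):
    """Return True if the HTML text contains at least one <h1> element."""
    finder = H1Finder()
    finder.feed(html)
    return finder.found


def pages_missing_h1(pages):
    """Return the names of pages (mapping of name -> HTML) that have no <h1>."""
    return [name for name, html in pages.items() if not has_h1(html)]
```

Any page this reports would need an H1 heading added before the conversion, otherwise it is likely to be left out of the book.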

Last edited by smoker; 03-21-2010 at 05:18 PM. Reason: headings not headers
 
  



Tags
html, links, pdf

