LinuxQuestions.org


hack3rcon 08-25-2015 07:52 AM

Some web pages can't be saved in PDF format.
 
Hello.
I want to save a web page such as the one below as a PDF in Google Chrome:

http://www.tecmint.com/install-java-jdk-jre-in-linux/

But when I press "Ctrl+P" the page formatting comes out wrong :/. How can I save it?

Thank you.

fatmac 08-25-2015 08:35 AM

Print it 'to a file'.
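
For what it's worth, newer Chrome/Chromium builds can also print to a file from the command line. A minimal sketch, assuming a build that supports the --headless and --print-to-pdf switches; the function names find_chrome and page_to_pdf are made up for this example, and the binary name varies by distro:

```shell
#!/bin/sh
# Return the first Chrome/Chromium binary found on PATH.
# The candidate names are assumptions; adjust them for your distro.
find_chrome() {
    for name in chromium chromium-browser google-chrome google-chrome-stable; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Print a URL to a PDF file without opening a browser window.
# Usage: page_to_pdf URL [OUTPUT.pdf]
page_to_pdf() {
    url=$1
    out=${2:-mypage.pdf}
    chrome=$(find_chrome) || { echo "no Chrome/Chromium found" >&2; return 1; }
    "$chrome" --headless --print-to-pdf="$out" "$url"
}
```

Calling page_to_pdf "http://www.tecmint.com/install-java-jdk-jre-in-linux/" should write mypage.pdf in the current directory. The layout may still suffer from the same print stylesheet problems as Ctrl+P.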

John VV 08-25-2015 03:07 PM

Quote:

Some web pages can't be saved in PDF format.
I would expect that most sites will not save well as a PDF.
They are driven by JavaScript, PHP, Perl, and Ruby, and pull content from databases and from other sites.

You can try saving it as PostScript (print to file, a "*.ps" file) or as a PDF from "print to file".

-- I just tried it.
The PDF is garbage and almost unusable: the page formatting is trashed.
NONE of the "code" sections are usable; they are all cut off.



But for replacing OpenJDK with Oracle Java you REALLY SHOULD use your package manager:
yum
dnf
zypper
apt-get
pacman

Use what is in your OS's repos.

Then "alternatives" will be taken care of.
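
A rough sketch of what that looks like in practice. The helper name jdk_search_cmd is hypothetical; it only prints the repo search command for whichever of the package managers above is found, and the search term "jdk" is an assumption, so check the package names your distro actually uses:

```shell
#!/bin/sh
# Print (not run) the repository search command for the first
# package manager found on PATH.
jdk_search_cmd() {
    if command -v dnf >/dev/null 2>&1; then
        echo "dnf search jdk"
    elif command -v yum >/dev/null 2>&1; then
        echo "yum search jdk"
    elif command -v zypper >/dev/null 2>&1; then
        echo "zypper search jdk"
    elif command -v apt-get >/dev/null 2>&1; then
        echo "apt-cache search jdk"
    elif command -v pacman >/dev/null 2>&1; then
        echo "pacman -Ss jdk"
    else
        echo "no known package manager found" >&2
        return 1
    fi
}
```

Once a JDK from the repos is installed, "update-alternatives --config java" (Debian/Ubuntu) or "alternatives --config java" (Fedora/RHEL) lets you pick which one is the default.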

Ihatewindows522 08-25-2015 03:19 PM

Can you just save the page as a local HTML file?

Something like this?

frankbell 08-25-2015 08:51 PM

You could try saving the page to HTML from your browser, then opening the HTML in LibreOffice or OpenOffice and exporting it to *.pdf.

I know that works, but I don't know whether any links would remain clickable unless you reconfigured them in the word processor.
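
The same export can probably be scripted without opening the GUI. A sketch, assuming a LibreOffice install that provides the soffice (or libreoffice) launcher with its --headless and --convert-to options; html_to_pdf is a name made up here, and how well a given page converts will vary:

```shell
#!/bin/sh
# Convert a saved HTML file to PDF with LibreOffice in headless mode.
# Usage: html_to_pdf FILE.html [OUTPUT_DIR]
html_to_pdf() {
    src=$1
    outdir=${2:-.}
    # The launcher name differs between installs; try both.
    if command -v soffice >/dev/null 2>&1; then
        office=soffice
    elif command -v libreoffice >/dev/null 2>&1; then
        office=libreoffice
    else
        echo "LibreOffice not found" >&2
        return 1
    fi
    "$office" --headless --convert-to pdf --outdir "$outdir" "$src"
}
```

For example, html_to_pdf mypage.html would be expected to leave mypage.pdf in the current directory.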

hack3rcon 08-25-2015 11:59 PM

My Chromium can't save a web page as a file. I only see "Save as PDF" and printing to my printer :(

Ihatewindows522 08-26-2015 09:49 AM

Yes it can.

Menu >> Save Page As

rtmistler 08-26-2015 10:02 AM

With the whole variety of web page formats out there, plus what I call active content (Flash, embedded video, animated images), I'm sitting here wondering what the benefit or requirement is.

You can likely save it as HTML, as "web page, complete", as text, or take a screenshot. Most of those can then be imported into a PDF. I'd likely follow a procedure like that so I could make the presentation appear the way I wanted it to, not just as raw information. For instance, I'd probably fix a black rectangle left where an active image or video was.

And maybe you can save it directly as PDF, but the question here really is what you're saving and how you intend to use it beyond just grabbing the content.

hack3rcon 08-29-2015 09:21 AM

Quote:

Originally Posted by Ihatewindows522 (Post 5411441)
Yes it can.

Menu >> Save Page As

Then import the saved files into LibreOffice?

hack3rcon 08-29-2015 09:46 AM

I saved the file and opened it in LibreOffice, but some photos couldn't be loaded from the internet.
Can you save "http://www.tecmint.com/install-openldap-server-and-administer-with-phpldapadmin-in-debianubuntu/" as PDF for me?

teckk 08-30-2015 04:35 PM

Some examples. I tried this one with wkhtmltopdf; the header is awful but the body is fine.
Code:

wkhtmltopdf "http://www.tecmint.com/install-java-jdk-jre-in-linux/" mypage.pdf

This example works fine and gives you a nice-looking .pdf, using lynx, a2ps, and ps2pdf.
Code:

lynx -dump -nolist -nomargins "http://www.tecmint.com/install-java-jdk-jre-in-linux/" | a2ps -B --borders=0 --columns=1 -o - | ps2pdf - mypage.pdf

To HTML:
Code:

curl -A "Mozilla/5.0" -Ls "http://www.tecmint.com/install-java-jdk-jre-in-linux/" -o mypage.html

Code:

wget -U "Mozilla/5.0" "http://www.tecmint.com/install-java-jdk-jre-in-linux/" -O mypage.html

To text. This example worked fine.
Code:

lynx -dump "http://www.tecmint.com/install-java-jdk-jre-in-linux/" > mypage.txt

Code:

w3m -dump -T text/html "http://www.tecmint.com/install-java-jdk-jre-in-linux/" > mypage.txt

Google if you want more.
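
If you want to save more than one page, the two PDF approaches above can be wrapped in a small script. This is only a sketch: pdf_name_for and url_to_pdf are hypothetical helper names, and it simply prefers wkhtmltopdf and falls back to the lynx pipeline when wkhtmltopdf is missing.

```shell
#!/bin/sh
# Derive an output file name from the last path component of a URL.
pdf_name_for() {
    echo "$(basename "$1").pdf"
}

# Save a URL as PDF: wkhtmltopdf if available, otherwise the
# text-only lynx | a2ps | ps2pdf pipeline.
# Usage: url_to_pdf URL [OUTPUT.pdf]
url_to_pdf() {
    url=$1
    out=${2:-$(pdf_name_for "$url")}
    if command -v wkhtmltopdf >/dev/null 2>&1; then
        wkhtmltopdf "$url" "$out"
    elif command -v lynx >/dev/null 2>&1 &&
         command -v a2ps >/dev/null 2>&1 &&
         command -v ps2pdf >/dev/null 2>&1; then
        lynx -dump -nolist -nomargins "$url" |
            a2ps -B --borders=0 --columns=1 -o - |
            ps2pdf - "$out"
    else
        echo "need wkhtmltopdf, or lynx + a2ps + ps2pdf" >&2
        return 1
    fi
}
```

A loop such as: for u in URL1 URL2; do url_to_pdf "$u"; done would then write one PDF per page, named after the last part of each URL.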

Ihatewindows522 08-31-2015 02:03 PM

Quote:

Originally Posted by hack3rcon (Post 5412895)
I saved the file and opened it in LibreOffice, but some photos couldn't be loaded from the internet.
Can you save "http://www.tecmint.com/install-openldap-server-and-administer-with-phpldapadmin-in-debianubuntu/" as PDF for me?

Well, the raw PDF print from Firefox doesn't look good at all. After importing into LibreOffice 5, all the images loaded fine, but there are some JavaScript elements pertaining to the images (I'm guessing a zoom feature) that didn't quite render right.

After I got rid of the JS elements, I exported it as a PDF and cut out the nonessential pages with PDF Mod.

https://drive.google.com/file/d/0B6m...ew?usp=sharing
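
PDF Mod is a GUI tool; if you'd rather trim pages from a terminal, something like the following should do it. This is a sketch assuming qpdf is installed; keep_pages and the file names are hypothetical, and the page range is whatever you decide is essential:

```shell
#!/bin/sh
# Copy only the given page range from one PDF into a new file.
# Usage: keep_pages INPUT.pdf RANGE OUTPUT.pdf   (RANGE like 1-4)
keep_pages() {
    src=$1
    range=$2
    out=$3
    # --empty starts from a blank document; --pages selects what to copy in.
    qpdf --empty --pages "$src" "$range" -- "$out"
}
```

For example, keep_pages mypage.pdf 1-4 trimmed.pdf would keep the first four pages and leave the original file untouched.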


All times are GMT -5.