Pdf downloader sometimes drops photos & hacks-up words & lines of text.
Running Mint 18.2, Cinnamon 3.4.3, Brave browser (Chromium-based), on a Dell i7 laptop.
I download lessons from an on-line course, as multi-page pdf's, using the provided "Print" function in Brave.
The problem is that the pdf's sometimes end up with large, unpredictable fractions of the lesson's illustrations simply omitted: a blank space in place of the illustration, with a border around the space. Also, words or parts of words at the end of the first line on a page may be obscured by a dumb-looking logo or something, or the top half of an entire line of text at the top of a page may be simply lopped off.
There seems to be no rhyme or reason for what ends up missing or mangled, and a retry on another day may give what appears to be a perfect download, but I can't be sure of that unless I do a line-by-line comparison - hardly practical.
SEE ATTACHED PDF FOR AN EXAMPLE OF THE ISSUES
I thought maybe the issues could be due to time-outs, owing to variable traffic volumes on the public router I am connected to while downloading, in which case switching to another pdf downloader would likely make no difference.
If the issue is external to my machine, is there a downloader that would deal with the issue?
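On the line-by-line comparison mentioned above: it does not have to be done by eye. A minimal sketch, assuming the text of two saves of the same lesson has already been dumped to plain text (for example with pdftotext from poppler-utils), and using only Python's standard difflib module. The function name changed_lines and the sample strings are made up for illustration, and this only catches missing or mangled text, not missing illustrations.

```python
import difflib


def changed_lines(first_text: str, second_text: str) -> list[str]:
    """Return only the lines that differ between two text dumps.

    Lines prefixed "- " appear only in the first dump,
    lines prefixed "+ " appear only in the second.
    """
    # Ignore blank lines and leading/trailing whitespace, which vary
    # between extractions without meaning anything was lost.
    first = [ln.strip() for ln in first_text.splitlines() if ln.strip()]
    second = [ln.strip() for ln in second_text.splitlines() if ln.strip()]
    return [
        d for d in difflib.ndiff(first, second)
        if d.startswith(("- ", "+ "))
    ]


if __name__ == "__main__":
    # Hypothetical sample text standing in for two saves of one page.
    good = "Quartz is hard.\nFeldspar cleaves.\n"
    bad = "Quartz is hard.\nFeldspar clea\n"
    for line in changed_lines(good, bad):
        print(line)
```

In practice the two strings would come from files, e.g. after running `pdftotext -layout first.pdf first.txt` on each save (first.pdf is a placeholder name). An empty result means the extracted text of the two saves matches.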
Are you downloading a PDF? From your description it sounds more like you are trying to PRINT to pdf a WEB page.
The two are very different operations and the ways they can go wrong are VERY different!
PS: if you are trying to print to file a copyrighted lesson that is specifically restricted we may not be able to help you for legal and ethical reasons.
There is (as mentioned) a great difference between printing from a web page and downloading a pdf then printing it.
Please tell us exactly what you are trying to do, step by step, and the results. Then suggestions can be made. Also, as mentioned, it must be legal to print what you want to print.
I am downloading a webpage to pdf format. I understand that this would be very different from merely downloading a pdf. But I have had this issue crop up with other sites where no copyright could apply - like NASA sites dealing with public-domain documents.
The downloads I am doing are strictly for personal use, materials for a course I paid for, so it seems to me I would be able to download a single copy to my personal computer. As I understand the DMCA, the "Fair Use" clauses give reasonable ability to people to use copyrighted material, as long as they are either not profiting from the use, making a critique, or making limited and "transformative" use of that material.
1.) For example, I navigate to the webpage I want to download, in this case (You may not be able to access this page as it is likely behind a paywall):
2.) With page 1 of the download visible, I hit the "customize and control Brave" icon (an icon consisting of 3 parallel horizontal lines) in the upper-right corner of the Brave browser, and from that drop-down menu, hit the "Print" button, about 2/3 of the way down the list.
3.) After the "preview" has been generated, I hit the blue "Save" button in the lower-left of this pop-up screen.
4.) A smaller pop-up screen appears, giving the folder to which the download will be sent. I hit the "Save" button in the lower-right of this smaller window.
I don't use brave, but that sounds pretty generic.
With chrome I sometimes have to select "save as", not just "save", when downloading to get the actual file and not an html version.
I just went to that page using chrome, it seems to be the introductory page but has a lot of data. I then used ctrl-p to print it. The print menu for my printer opened up and I selected "save to pdf". That worked and the result was a properly formatted pdf document.
Note that I did not have to use the customize and control menu to print the page. ctrl-p is the shortcut for that.
I really don't understand your problem in saving the page unless you are not saving it as pdf but rather as the actual html code. Does your print dialog window allow you to select to save it as pdf?
Last edited by computersavvy; 06-12-2021 at 08:37 PM.
The choice I am seeing in the "Print" drop-down menu says "Save as pdf", so I guess that is what it is doing. PDF readers seem to treat the incomplete downloads as pdf's.
(Not relevant to our issue here, but I agree that page is VERY "introductory", which is why I am using the course only as a sort of outline, and reading a college-level mineralogy text book alongside the "course").
Agree the issue is odd, especially as it happens now and then, randomly, which is why I thought perhaps it was due to traffic-dependent time-outs in the router or ISP server, terminating downloads before they were complete.
Traffic timeouts and interruptions could very well be the cause. However, if the entire page is displayed before you select to print it then you should be able to print to a pdf with no issues. If it is still downloading when you start the print then it will have glitches.
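If a print was started while the page was still loading, one rough way to spot it afterwards is to compare how many images ended up embedded in two saves of the same lesson. A minimal sketch, assuming the browser writes each illustration as an image object marked "/Subtype /Image" in the pdf file; the function name count_embedded_images and the sample bytes are made up for illustration. This is only a heuristic: a lower count on one save hints at dropped illustrations, but an equal count does not prove the save is complete.

```python
import re

# Image objects in a pdf carry a dictionary entry "/Subtype /Image".
# The \b keeps the pattern from matching longer names that merely
# start with "Image".
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image\b")


def count_embedded_images(pdf_bytes: bytes) -> int:
    """Count image-object markers in the raw bytes of a pdf file."""
    return len(IMAGE_MARKER.findall(pdf_bytes))


if __name__ == "__main__":
    # Hypothetical fragment standing in for real pdf file contents.
    sample = (
        b"<< /Type /XObject /Subtype /Image /Width 10 >> "
        b"<< /Subtype/Image >> << /Subtype /Form >>"
    )
    print(count_embedded_images(sample))
```

For real files, read each save with open(path, "rb").read() and compare the two counts.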