LinuxQuestions.org


Debian6to11 10-18-2022 03:10 PM

Download a thread
 
Is it possible to download a thread as text?

boughtonp 10-18-2022 03:28 PM


 
Thread can mean different things. I'm guessing you are asking about LQ forum threads, but you really should clarify.

Since a forum thread is rendered as HTML, which is text, and you can either view source or use Curl to download that HTML, then the answer is technically yes.
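For anyone who would rather script the download than run curl by hand, here is a minimal sketch using only the Python standard library. The function name fetch_html and the User-Agent string are made up for illustration, and whether the forum serves the same page to a script as to a browser is an assumption.

```python
import urllib.request


def fetch_html(url, timeout=30):
    """Return the decoded body of `url`, roughly what `curl URL` would print."""
    # Some servers reject urllib's default User-Agent; this value is an
    # arbitrary placeholder, not something the forum requires.
    request = urllib.request.Request(
        url, headers={"User-Agent": "thread-saver-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example use (not run here, since it needs network access):
#   html = fetch_html("https://www.linuxquestions.org/questions/"
#                     "lq-suggestions-and-feedback-7/download-a-thread-4175717890/")
#   with open("thread.html", "w", encoding="utf-8") as handle:
#       handle.write(html)
```

This only saves the HTML; turning it into plain text is a separate step.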

But if you're asking whether the forum software has a feature to render a thread as non-HTML text, then given the absence of such an option in Thread Tools, the answer is no.

However, there is a simplified "Printable Version" which one could run through an HTML to Markdown converter. It would need special handling to deal with quoted sections appropriately, since they don't use semantic markup, and whether that's worth the effort depends on what you're after. It may be simpler to just copy-paste.


Debian6to11 10-18-2022 11:30 PM

Yes I was thinking about a thread here at LQ, I should have clarified.

Basically what I meant as text, is just text, no HTML which is unnecessary and not very useful as a note or insert in a text file. I guess copy and paste could do that then.

artytux 10-19-2022 12:21 AM

Hello
I wonder if saving as PDF would be of use. If you are using Firefox, their addons are OK.
There are two I use. One is really good (Save as PDF by Pdfcrowd, blue icon);
the other is a bit flaky at times but very light (Save PDF: Save current page as PDF, red icon) and, as a bonus, also works offline.

Quote:

Basically what I meant as text, is just text, no HTML which is unnecessary and not very useful as a note or insert in a text file. I guess copy and paste could do that then.

Regards
artytux

!!! 10-19-2022 01:17 AM

Look at the very top of any page of the thread, immediately above the first post number on that page (#1, #16, #31, ...), and you will see 3 pulldowns: 'Thread Tools', 'Search this Thread' and 'Rate Thread' (which makes the concept of a thread infinitely obvious). The first entry under 'Thread Tools' is 'Show Printable Version'. Once you have selected this, you will get all the posts on one simple (HTML) text page, ready for you to hit Ctrl+P or whatever (usually you can 'print to a file', depending on your OS).

Debian6to11 10-19-2022 04:44 AM

Thank you all. I see that the answer was in post 2, but it was kind of complicated.

onebuck 10-19-2022 05:26 AM

Moderator Response
 
Moved: This thread is more suitable in <LQ Suggestions & Feedback > and has been moved accordingly to help your thread/question get the exposure it deserves.

boughtonp 10-19-2022 06:37 AM

Quote:

Originally Posted by artytux (Post 6387087)
I wonder if saving as PDF would be of use. If you are using Firefox, their addons are OK.

There is no need to use addons for this - Firefox can produce a PDF of the current page using File > Print > Print To File.


boughtonp 10-19-2022 06:44 AM

Quote:

Originally Posted by Debian6to11 (Post 6387083)
Basically what I meant as text, is just text, no HTML which is unnecessary and not very useful as a note or insert in a text file. I guess copy and paste could do that then.

The downside to copy and paste is you lose the nesting/demarcation of quotes - for longer threads/posts with lots of interleaved quotes that becomes a bigger issue.

However, I was curious so I grabbed a suitable Python module and, well, the result isn't great, but it is better than I expected and might be acceptable.

1. Download html2text (either file works) and extract just the "html2text" directory to the current directory.
1b. Briefly review the code to make sure it's not doing anything weird. (General good practice when downloading from unreviewed public repos like PyPI.)

2. Right click and "Save Link As" on "Show Printable Version" to get the simplified HTML file.

3. The module can be run from the command line with, e.g.: "python3 -m html2text ./input.html > ./output.txt"

4. Load output.txt in your text viewer of choice (or pipe the output to "less" if you just want a quick preview).

That was just the first module I found; there may well be others that do a better job, or it may itself work better with some of the various options it has (see "python3 -m html2text --help").
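If downloading a third-party module is not an option, the same idea can be roughed out with Python's standard library alone. The sketch below is not html2text and does far less: it drops the tags, keeps the text, and breaks lines at a hand-picked set of block-level tags. The names TextExtractor and html_to_text and the contents of BLOCK_TAGS are assumptions chosen for illustration, and it does nothing to mark quoted sections.

```python
from html.parser import HTMLParser

# Tags that should force a line break in the output (assumed, not exhaustive).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "table", "hr",
              "h1", "h2", "h3", "blockquote", "pre"}
# Tags whose contents are never shown to a reader.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML document."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> get one break rather than two.
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the text of `html` with tags removed and whitespace tidied."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Normalise spaces within each line, then collapse runs of blank lines.
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    kept = []
    for line in lines:
        if line or (kept and kept[-1]):
            kept.append(line)
    return "\n".join(kept).strip()
```

Calling html_to_text on the contents of the saved printable page and writing the result to output.txt would stand in for step 3, with plainer output than a real converter gives.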


teckk 10-19-2022 02:53 PM

Quote:

Is it possible to download a thread as text?

Code:

url="https://www.linuxquestions.org/questions/lq-suggestions-and-feedback-7/download-a-thread-4175717890/"

lynx -dump "$url" > out.txt
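By default, a lynx dump marks each link inline as [1], [2], ... and appends a "References" list of URLs at the end, which is noise in a plain note. The sketch below is one way to tidy out.txt afterwards. The function name clean_lynx_dump is made up, and it assumes the dump follows that usual layout. lynx also has a -nolist option for dumps that may make this step unnecessary; check the man page for your version.

```python
import re

# Inline link markers such as [1] or [23] that lynx puts before link text.
LINK_MARKER = re.compile(r"\[\d+\]")


def clean_lynx_dump(text):
    """Strip link markers and the trailing References list from a lynx dump."""
    lines = text.splitlines()
    cut = len(lines)
    # Search from the end so an earlier line reading "References" inside a
    # post is kept; only the last such line is treated as the link list.
    for index in range(len(lines) - 1, -1, -1):
        if lines[index].strip() == "References":
            cut = index
            break
    body = "\n".join(lines[:cut])
    # Caveat: this also removes a literal "[1]" that someone typed in a post.
    body = LINK_MARKER.sub("", body)
    return body.rstrip() + "\n"
```

If the dump has no link list but a post happens to contain a line reading just "References", everything after that line would be cut, so check the result on longer threads.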


artytux 10-20-2022 02:40 AM

Quote:

Originally Posted by boughtonp (Post 6387131)
There is no need to use addons for this - Firefox can produce a PDF of the current page using File > Print > Print To File.


OK, strange, I never knew that was there. Awesome, it works and is so easy. I never looked: why would I click that when I have no printer?
Thanks heaps boughtonp
regards
artytux


All times are GMT -5.