LinuxQuestions.org

lipun4u 12-05-2011 12:38 PM

shell script to remove tags
 
Can somebody give me a shell script that removes all tags and scripts (JavaScript/VBScript code) from an HTML file?

This might look like a homework question, but I am a C++ programmer and I need to bypass these things.

David the H. 12-05-2011 12:52 PM

There are several programs available that will convert html to plain text; html2text and unhtml are two that I know of. There are doubtless many more out there on the web.

If this doesn't satisfy your requirements then you'll have to be more specific about what you want to do.
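
For illustration only, here is a rough sketch of the kind of filtering such converters perform. It is not a shell script and not how html2text or unhtml actually work; it is a small Python filter, built on the standard library's html.parser module, that can be called from a shell script. The names TextExtractor and strip_tags are made up for this example, and dropping <style> blocks along with <script> blocks is an assumption that goes beyond the original request.

```python
import sys
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping everything inside <script> and <style>."""

    # Elements whose contents are dropped entirely (assumption: style too).
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <SCRIPT> is caught as well.
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML string with tags and scripts removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read the file named on the command line and print its text.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        sys.stdout.write(strip_tags(handle.read()))
```

Saved under a hypothetical name such as strip_tags.py, it could be run from a shell as `python3 strip_tags.py page.html > page.txt`. It makes no attempt to lay out the text the way a dedicated converter would.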

TB0ne 12-05-2011 02:19 PM

Quote:

Originally Posted by lipun4u (Post 4542763)
Can somebody give me a shell script that removes all tags and scripts (JavaScript/VBScript code) from an HTML file?

This might look like a homework question, but I am a C++ programmer and I need to bypass these things.

Sorry, but this isn't the place to come to get scripts written for you. If you want to hire someone to write a script, post in the LQ Job Marketplace, and tell us how much you'll pay.

Otherwise, provide sample input and output data, and what you've written/done so far, and where you're stuck, and we can try to help. If you're a C++ programmer, you should know these things are needed to get any help with code writing. Also, based on your posting history, you've done lots with bash scripting in the past...this should be trivial for you to do.

knudfl 12-05-2011 03:07 PM

Regarding 'html2text': I usually prefer "the other one", html2txt.

http://www.linuxquestions.org/questi...6&d=1292610942


All times are GMT -5.