LinuxQuestions.org
Old 03-11-2008, 12:43 PM   #1
deesto
Member
 
Registered: May 2002
Location: NY, USA
Distribution: FreeBSD, Fedora, RHEL, Ubuntu; OS X, Win; have used Slackware, Mandrake, SuSE, Xandros
Posts: 448

Rep: Reputation: 31
bash script for adding HTML tags


I have a bash script that wraps a Perl script, which generates some report data, and a cron job that pipes this data to mail. The problem is that the data being generated is raw HTML, without any HTML or BODY tags, so the email arrives showing the raw tags and looks kind of ugly.

Is there a quick way to add a script somewhere that either encapsulates the output in HTML tags, or strips out the tags from the output (<b>, </b>, &nbsp;, etc.)?
 
Old 03-11-2008, 03:08 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,066
Blog Entries: 11

Rep: Reputation: 910
Just run tidy against it.



Cheers,
Tink
 
Old 03-11-2008, 04:58 PM   #3
deesto
Member
 
Registered: May 2002
Location: NY, USA
Distribution: FreeBSD, Fedora, RHEL, Ubuntu; OS X, Win; have used Slackware, Mandrake, SuSE, Xandros
Posts: 448

Original Poster
Rep: Reputation: 31
Thanks. But what if tidy is not installed (not on this server)?
 
Old 03-11-2008, 05:06 PM   #4
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2387
Hi,

Install it

HTML Tidy Library Project
 
Old 03-11-2008, 05:09 PM   #5
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,066
Blog Entries: 11

Rep: Reputation: 910
Quote:
Originally Posted by deesto
Thanks. But what if tidy is not installed (not on this server)?
Either what druuna said, or edit the Perl script and stop
it from outputting the tags, or make it output the proper
header & body tags in the right spots, too? :}

It seems a bit silly to run something that produces extra
data and then use another tool to undo the efforts of the
first tool.



Cheers,
Tink
 
Old 03-11-2008, 07:25 PM   #6
deesto
Member
 
Registered: May 2002
Location: NY, USA
Distribution: FreeBSD, Fedora, RHEL, Ubuntu; OS X, Win; have used Slackware, Mandrake, SuSE, Xandros
Posts: 448

Original Poster
Rep: Reputation: 31
Here's the thing: the perl scripts are part of TWiki, one of which is spitting out the raw HTML. I'd rather not modify the TWiki code itself, and I've asked the owner of this particular bit of code to consider adding a flag for output format, but no luck so far.

In addition, I have limited access to installing new packages on the system, as the admins prefer prepackaged RPMs from their centralized repositories.

I thought a solution might be as "simple" as adding a step to the cron job that captures the script output in a variable, so that I could somehow add the necessary HTML tags "around" the variable and end up with a readable email.
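
A minimal sketch of that variable-based idea, assuming the report fits comfortably in a shell variable. The function name wrap_in_html and the run_report command in the usage comment are hypothetical placeholders, not part of TWiki or of the existing cron setup.

```shell
#!/bin/bash
# wrap_in_html: hypothetical helper. Takes the report text as its first
# argument and prints it between minimal HTML document tags.
wrap_in_html() {
    # The report is passed as an argument to %s, so any "%" characters
    # inside it are printed literally rather than treated as format codes.
    printf '<html><body>\n%s\n</body></html>\n' "$1"
}

# Usage (run_report stands in for the real report command):
#   report=$(run_report)
#   wrap_in_html "$report" | mail -s "Report" someone@example.com
```

Note that command substitution drops trailing newlines from the captured output, which is harmless for HTML. Whether the recipient's mail client then renders the tags may also depend on the headers of the message, which this sketch does not touch.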

However, if the viable solution is that tidy would be just an additional pipe in the cron command ("perl script | tidy | mail"), it makes sense to request an exception and get tidy installed on the server.
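
If tidy does get installed, the extra pipe stage might look something like the sketch below. It assumes the HTML Tidy command-line tool with its -q flag and its show-warnings and tidy-mark options; the function name tidy_filter and the fallback to cat when tidy is missing are illustrative choices, not something from this thread.

```shell
#!/bin/bash
# tidy_filter: hypothetical helper. Reads an HTML fragment on stdin and
# writes a cleaned-up document on stdout. If tidy is not installed, the
# input is passed through unchanged so the cron job still sends something.
tidy_filter() {
    if command -v tidy >/dev/null 2>&1; then
        # tidy exits with status 1 when it only has warnings (for example a
        # missing doctype or title), so treat that status as success.
        tidy -q --show-warnings no --tidy-mark no 2>/dev/null || [ $? -eq 1 ]
    else
        cat
    fi
}

# Usage in the cron pipeline (command names and address are placeholders):
#   perl script | tidy_filter | mail -s "Report" someone@example.com
```

The pass-through fallback is a design choice: it keeps the mail flowing on a server without tidy, at the cost of sending the raw fragment.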
 
Old 03-11-2008, 08:03 PM   #7
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,066
Blog Entries: 11

Rep: Reputation: 910
Well, you *could* brute-force it and (assuming that none of the page
payload has "<" or ">" in it) pipe the output through sed to mangle the
pages TWiki produces. The simple sed command below should get rid of
most of the HTML, I'd think.

sed -r 's/<[^>]+>//g'



Cheers,
Tink
 
Old 03-11-2008, 08:20 PM   #8
Poetics
Senior Member
 
Registered: Jun 2003
Location: California
Distribution: Slackware
Posts: 1,181

Rep: Reputation: 49
Or, if you really wanted to go full Frankenstein, you could have your shell script run a Perl script that runs the aforementioned report generation. That way, the wrapper script could parse the data however you want.

Still, I think the easiest and cleanest solution is to make several very minor alterations to the initial code.
 
Old 03-12-2008, 11:38 AM   #9
deesto
Member
 
Registered: May 2002
Location: NY, USA
Distribution: FreeBSD, Fedora, RHEL, Ubuntu; OS X, Win; have used Slackware, Mandrake, SuSE, Xandros
Posts: 448

Original Poster
Rep: Reputation: 31
Things are already fairly Frankensteined in support of this application, so while I'd like to avoid further customization, another bolt here or there shouldn't make much of a difference.

Tink's sed command could be a good start. Since it strips out every angle-bracketed tag, it also strips out the line-break tags (obviously), so the output ends up all on one line, and non-breaking space entities (&nbsp;) are left in.
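
One possible way to address both leftovers is to turn <br> tags into newlines before stripping the remaining tags, and to decode a few common entities afterwards. This is only a sketch: strip_html is a hypothetical name, the entity list is deliberately short, and like the original one-liner it assumes no literal "<" or ">" in the payload and no tag that spans more than one line.

```shell
#!/bin/bash
# strip_html: hypothetical helper. Reads HTML on stdin, writes plain text.
# Order of the sed commands:
#   1. replace <br>, <BR>, <br /> with a newline (backslash + line break)
#   2. delete every other angle-bracketed tag
#   3. decode &nbsp;, &lt; and &gt;
#   4. decode &amp; last, so "&amp;lt;" is not decoded twice
strip_html() {
    sed '
        s/<[bB][rR][^>]*>/\
/g
        s/<[^>]*>//g
        s/&nbsp;/ /g
        s/&lt;/</g
        s/&gt;/>/g
        s/&amp;/\&/g
    '
}

# Usage in the cron pipeline (command names and address are placeholders):
#   perl script | strip_html | mail -s "Report" someone@example.com
```

The "/g" that follows the backslash has to start in the first column, because anything before it on that line would become part of the replacement text.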

I looked at Tidy, but couldn't find a binary for my Linux (RHEL4), and the source is in C, which I've never compiled on Linux.

Finally, I'm looking at the code of TWiki's publishing script (Publish.pm) for the right place to insert:
Code:
<html><body>[output]</body></html>
... tags around the current output, and how. I thought the variable to be modified would be $tmpl, and the place would be right before this line that writes to HTML:
Code:
    $this->{archive}->addString( $tmpl, $topic.$filetype);
However, this would modify the actual file output, which seems to be working as expected. I just need to modify the standard output sent by the script itself (normally sent to the screen, but being piped to mail), not the file being produced by the script.
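
Since only the standard output stream needs changing, the wrapping could happen in the cron pipeline itself, leaving Publish.pm alone. The sketch below builds a complete message with a text/html Content-Type header around the fragment. It assumes a sendmail-compatible binary that accepts -t (take recipients from the headers); the function name build_html_mail, the UTF-8 charset, the paths, and the address are all placeholders.

```shell
#!/bin/bash
# build_html_mail: hypothetical helper. Reads an HTML fragment on stdin and
# prints a complete message (headers, blank line, wrapped body) on stdout.
#   $1 = recipient address, $2 = subject line
build_html_mail() {
    printf 'To: %s\n' "$1"
    printf 'Subject: %s\n' "$2"
    printf 'MIME-Version: 1.0\n'
    printf 'Content-Type: text/html; charset=UTF-8\n'   # charset is an assumption
    printf '\n'                  # a blank line ends the header section
    printf '<html><body>\n'
    cat                          # pass the fragment through unchanged
    printf '</body></html>\n'
}

# Usage in the cron pipeline (paths and address are placeholders):
#   /path/to/report.sh | build_html_mail admin@example.com "TWiki report" \
#       | /usr/sbin/sendmail -t
```

Because the function only writes to stdout, its output can be inspected on the screen before it is ever piped to a mailer.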

Last edited by deesto; 03-12-2008 at 05:53 PM. Reason: fixed "smiley" rendered in code
 
Old 03-12-2008, 07:40 PM   #10
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,241

Rep: Reputation: 2325
As this is a Perl app, you could also ask at www.perlmonks.org. There are some very sharp people there.
 
  



Tags
bash, html, output, script


All times are GMT -5.
