Distribution: FreeBSD, Fedora, RHEL, Ubuntu; OS X, Win; have used Slackware, Mandrake, SuSE, Xandros
Posts: 448
bash script for adding HTML tags
I have a bash script that encapsulates a Perl script, which generates some report data, and a cron job that pipes this data to mail. The problem is that the data being generated is raw HTML code, without any HTML or BODY tags, so the content sent to email includes these raw tags and looks kind of ugly.
Is there a quick way to add a script somewhere that either encapsulates the output in HTML tags, or strips out the tags from the output (<b>, </b>, etc.)?
Thanks. But what if tidy is not installed (not on this server)?
Either what druuna said, or edit the Perl script and stop
it from outputting the tags, or make it output the proper
header & body tags in the right spots, too? :}
It seems a bit silly to run something that produces extra
data and then use another tool to undo the efforts of the
first tool.
Original Poster
Here's the thing: the Perl scripts are part of TWiki, one of which is spitting out the raw HTML. I'd rather not modify the TWiki code itself, and I've asked the owner of this particular bit of code to consider adding a flag for output format, but no luck so far.
In addition, I have limited access to installing new packages on the system, as the admins prefer prepackaged RPMs from their centralized repositories.
I thought a solution might be as "simple" as adding a step to the cron that threw the script output into a variable, and I could somehow add the necessary HTML tags "around" the variable, so the end result was a readable email.
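The "variable plus surrounding tags" idea can be sketched roughly as follows. This is only a minimal illustration: the function name wrap_html and the sample report line are made up for the example, and the real cron entry would feed the Perl script's output into the pipe instead.

```shell
#!/bin/bash
# Hypothetical helper (the name is an assumption): read the report from
# stdin, hold it in a variable, and print it wrapped in minimal HTML tags.
wrap_html() {
    local body
    body=$(cat)    # capture everything arriving on stdin
    printf '<html><body>\n%s\n</body></html>\n' "$body"
}

# Stand-in data; in the cron job the report script would be on the left
# of the pipe and the mail command on the right.
printf '<b>Report</b><br>3 topics changed\n' | wrap_html
```

Even with the tags in place, most mail clients will only render the message as HTML if it carries a Content-Type: text/html header, and how to set that depends on the mailer in use.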
However, if the viable solution is that tidy would be just an additional pipe in the cron command ("perl script | tidy | mail"), it makes sense to request an exception and get tidy installed on the server.
Well, you *could* brute-force it and (assuming that none of the page's
payload has "<" or ">" in it) use a pipe through sed to mangle the
pages TWiki produces. The simple sed command below should get rid of
most HTML, I'd think.
Or, if you really wanted to get Frankenstein, you could have your shell script run a Perl script that runs the aforementioned report generation. That way, the wrapper script could parse the data however you want.
Still, though, I see the easiest and cleanest solution as making several very minor alterations to the initial code.
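A tag-stripping sed filter of the kind described here is often written as shown next. This is an assumed form, not necessarily the exact command that was posted, and the function name strip_tags is invented for the example.

```shell
#!/bin/bash
# Assumed form of a brute-force tag stripper: delete every run of text
# that starts with "<" and ends at the next ">".
strip_tags() {
    sed 's/<[^>]*>//g'
}

# Stand-in for the report output
printf '<b>Total:</b> 5 topics\n' | strip_tags
```

As the post says, this only behaves if the payload itself contains no literal "<" or ">" characters. It also works one line at a time, so a tag split across two lines would slip through.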
Original Poster
Things are already fairly Frankensteined in support of this application, so while I'd like to avoid further customization, another bolt here or there shouldn't make much of a difference.
Tink's sed command could be a good start. Since it strips out all angle-bracketed tags, it also strips out the line-break tags (obviously), so the output ends up all on one line, and non-breaking space entities (&nbsp;) are left in.
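One possible way around both side effects is to turn line-break tags into real newlines before stripping everything else, and then swap the leftover entity for a plain space. The sketch below is an assumption about how that could look, with a made-up function name (html_to_text) and sample input. It uses the backslash-newline form in the replacement so it does not rely on GNU-only sed escapes.

```shell
#!/bin/bash
# Hypothetical filter: keep the line structure while removing the markup.
html_to_text() {
    # 1) <br>, <br/>, <br /> (any case) become a real newline
    # 2) every remaining <...> tag is deleted
    # 3) &nbsp; entities become ordinary spaces
    sed -e 's/<[bB][rR][^>]*>/\
/g' \
        -e 's/<[^>]*>//g' \
        -e 's/&nbsp;/ /g'
}

# Stand-in for the report output
printf '<b>Total:</b>&nbsp;5<br />next<br>end\n' | html_to_text
```

Other entities such as &amp; or &lt; are not handled here and would need their own substitutions.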
I looked at Tidy, but couldn't find a binary for my Linux (RHEL4), and the source is C-based, which I've never compiled on Linux.
Finally, I'm looking at the code of TWiki's publishing script (Publish.pm) for the right place to insert:
Code:
<html><body>[output]</body></html>
... tags around the current output, and how. I thought the variable to be modified would be $tmpl, and the place would be right before this line that writes to HTML:
However, this would modify the actual file output, which seems to be working as expected. I just need to modify the standard output sent by the script itself (normally sent to the screen, but being piped to mail), not the file being produced by the script.
Last edited by deesto; 03-12-2008 at 04:53 PM.
Reason: fixed "smiley" rendered in code
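Since only the standard output needs the extra tags, one option that leaves Publish.pm and the published file alone is to wrap the command from the outside. The following is a rough sketch under that assumption; wrap_stdout is a hypothetical name and the echo stands in for the real report command.

```shell
#!/bin/bash
# Hypothetical wrapper: run whatever command is passed as arguments and
# surround only its standard output with minimal HTML tags. Files the
# command writes on its own are not touched.
wrap_stdout() {
    printf '<html><body>\n'
    "$@"
    printf '</body></html>\n'
}

# Stand-in for the real report command
wrap_stdout echo '<b>Publish complete</b>'
```

In the cron entry, the wrapped command would take the place of the bare script on the left side of the pipe to mail.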