LinuxQuestions.org
Old 07-28-2017, 02:06 PM   #1
mpat86
LQ Newbie
 
Registered: Jul 2017
Posts: 2

Rep: Reputation: Disabled
Shell script required to identify if a cron job is stuck for several days and notify users through email.


Recently, one of our cron jobs was stuck for several days and we were not aware of the situation.

We want to build a shell/bash script to monitor whether the log trace is rolling. If the log trace has not rolled for a long time, we want to send an email to our Tech team with the details.

P.S. We have multiple cron jobs and we want to implement the same for all the jobs. Right now, we are not storing the details of the processing of the cron jobs in any of the log files.
 
Old 07-28-2017, 02:19 PM   #2
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 5,655
Blog Entries: 3

Rep: Reputation: 2901
Welcome.

Another way might be to put a time limit on how long the job can take and kill it if it goes over the time limit. timeout can do that:

Code:
timeout 600 /usr/local/bin/someslowscript || echo "Script Failed" | mail -s "Fail" techteam@example.com
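See "man timeout". If the script completes before the time limit, there is no problem. If it goes over the limit, it is killed and a mail is sent. Note that the mail is also sent if the script fails for any other reason, since || fires on any non-zero exit status.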
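If the mail should say whether the job timed out or failed on its own, the exit status can be inspected. The sketch below assumes GNU coreutils timeout, which exits with 124 when the limit is reached; describe_status is a hypothetical helper name, and the path and address in the usage comment are the placeholders from the post above.

```shell
#!/bin/sh
# describe_status STATUS
# Translate an exit status returned by GNU timeout into a short phrase.
describe_status() {
    case $1 in
        0)   echo "finished normally" ;;
        124) echo "timed out" ;;              # timeout's status when the limit is reached
        137) echo "killed with SIGKILL" ;;    # 128 + 9, e.g. after timeout -k
        *)   echo "failed with status $1" ;;
    esac
}

# Hypothetical usage in a cron wrapper (path and address are placeholders):
#   timeout 600 /usr/local/bin/someslowscript
#   rc=$?
#   [ "$rc" -eq 0 ] || echo "someslowscript $(describe_status "$rc")" |
#       mail -s "Cron job problem" techteam@example.com
```

Keeping the status in a variable before testing it matters, because any command run in between would overwrite $?.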

A prerequisite for all that however is the ability to send mail from that machine. Is it set up?

Code:
echo "This is a test.  $(date)" | mail -s "A test" techteam@example.com
 
Old 07-28-2017, 04:00 PM   #3
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,513

Rep: Reputation: 1009
You can configure cron to send an email whenever a job produces output, which includes error messages. Put this at the top of the crontab:
Code:
MAILTO=someuser
 
Old 07-30-2017, 09:07 AM   #4
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 23,912

Rep: Reputation: 7014
Quote:
Originally Posted by mpat86
Recently, one of our cron jobs was stuck for several days and we were not aware of the situation. We want to build a shell/bash script to monitor whether the log trace is rolling. If the log trace has not rolled for a long time, we want to send an email to our Tech team with the details.

P.S. We have multiple cron jobs and we want to implement the same for all the jobs. Right now, we are not storing the details of the processing of the cron jobs in any of the log files.
What AwesomeMachine posted will work fine, if your scripts are written to return an error. Are they? Or did you mean the process was hung, and never returned at all? There are lots of ways to accomplish this, but they are going to depend on how your scripts work now and what they're doing/calling, and how long these jobs should take to run.

First wild idea I'd suggest is to write a simple shell script, to do a "ps -ef" and look for the name of any of your cron jobs, and run this on an off-cycle schedule. Meaning that if your cron job normally runs every 5 minutes, run the 'checker' every 7 minutes, so you won't catch any jobs running normally. Again, timing will depend on how often these jobs run, how long they take to complete, etc. If this shell script finds the cron script present, it will send an email to whomever. Could even have that script loop through a simple text file with the names of any of your cron scripts and check them all.
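
A minimal sketch of that checker, assuming pgrep from procps is available. The function name running_jobs, the list file path, and the mail address in the usage comment are hypothetical; the list file holds one cron script name per line.

```shell
#!/bin/sh
# running_jobs LISTFILE
# Print every name from LISTFILE that matches a running process.
# Blank lines and lines starting with # are skipped.
running_jobs() {
    while IFS= read -r name; do
        case $name in
            ''|'#'*) continue ;;
        esac
        # pgrep -f matches the pattern against the full command line.
        if pgrep -f -- "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
        fi
    done < "$1"
}

# Hypothetical usage from the checker's own cron entry:
#   found=$(running_jobs /usr/local/etc/cronjobs.list)
#   [ -z "$found" ] || printf '%s\n' "$found" |
#       mail -s "Cron jobs still running" techteam@example.com
```

pgrep treats each name as a regular expression and never reports itself, which avoids the usual "grep -v grep" step.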

But more details are needed.
 
Old 07-31-2017, 10:34 AM   #5
mpat86
LQ Newbie
 
Registered: Jul 2017
Posts: 2

Original Poster
Rep: Reputation: Disabled
Thanks all for your reply.

Hi LQ Guru,

In my case, the process was hung and it never returned. This particular job runs once every 3 hrs 30 mins, and sometimes it runs for more than 5 hrs (depending on the load). The average run time of this job is 2 hrs. We already have some shell scripting in place that checks for duplicate processes and, in case a duplicate process is found, it won't trigger the next run.

We are basically looking for a generic solution that would identify if any of the cron jobs are hung and send an email notification to the users if any are found. We have 10 cron jobs in total.
 
Old 07-31-2017, 11:46 AM   #6
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 23,912

Rep: Reputation: 7014
Quote:
Originally Posted by mpat86
Thanks all for your reply.
In my case, the process was hung and it never returned. This particular job runs once every 3 hrs 30 mins, and sometimes it runs for more than 5 hrs (depending on the load). The average run time of this job is 2 hrs. We already have some shell scripting in place that checks for duplicate processes and, in case a duplicate process is found, it won't trigger the next run.

We are basically looking for a generic solution that would identify if any of the cron jobs are hung and send an email notification to the users if any are found. We have 10 cron jobs in total.
There is no generic solution in this case. If the job is just hung, it's not returning any errors at all...just sitting there. Again, you're going to have to modify your scripts and do some good planning to make this work. Sounds like you already know average run times, so you have some data points at least.

Unless you write the process checker as suggested, I don't see much past looking into whatever real commands are IN those cron scripts, to see if they have any built-in error/process checking, and enabling those if they do.
 
Old 07-31-2017, 12:04 PM   #7
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.8.2003
Posts: 5,432

Rep: Reputation: 2055
Pseudocode:
get the start time of the cron job in seconds:

Code:
STARTTIME=$(date -d "$(ps -o lstart= -p "$(pgrep -of '<name of script>')")" +%s)
get the current time from the output of the date command in seconds:

Code:
NOW=$(date +%s)
if the difference is greater than <3 hours>, send email

Code:
if [ $((NOW - STARTTIME)) -gt 10800 ]; then
   echo "Started $(date -d "@$STARTTIME")" | mailx -s "<name of script> appears to be hung" support@yourdomain.com
fi
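
The same check can be written without parsing dates. This is a rough sketch, assuming procps-ng, whose ps offers the etimes column (elapsed seconds since the process started). The function names job_age and is_hung are hypothetical, and the address in the usage comment is the placeholder from the step above.

```shell
#!/bin/sh
# job_age PATTERN
# Print the age in seconds of the oldest process whose command line
# matches PATTERN. Fails if no such process is running.
job_age() {
    job_pid=$(pgrep -o -f -- "$1") || return 1
    ps -o etimes= -p "$job_pid" | tr -d ' '
}

# is_hung PATTERN LIMIT_SECONDS
# Succeed only if a matching process has been running longer than the limit.
is_hung() {
    job_secs=$(job_age "$1") || return 1
    [ -n "$job_secs" ] && [ "$job_secs" -gt "$2" ]
}

# Hypothetical usage (10800 seconds is the 3 hours from the steps above):
#   if is_hung "<name of script>" 10800; then
#       echo "Running for $(job_age "<name of script>") seconds" |
#           mailx -s "<name of script> appears to be hung" support@yourdomain.com
#   fi
```

The limit has to sit above the longest legitimate run, which for a job that sometimes takes more than 5 hours means a threshold well beyond 3 hours.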
The <name of script> could be fed in at the command line from a file, or by parsing the output of crontab -l, which would be more dynamic...the addition or removal of a cron job would automagically be picked up by the check.
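
One way to pull the commands out of crontab -l is sketched below. It assumes entries with the standard five time fields or an @-style shortcut such as @daily; cron_commands is a hypothetical function name, and runs of whitespace inside a command are collapsed to single spaces.

```shell
#!/bin/sh
# cron_commands
# Read crontab text on stdin and print only the command part of each job.
cron_commands() {
    awk '
        /^[ \t]*(#|$)/ { next }                          # comments, blank lines
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next }  # MAILTO=..., PATH=...
        $1 ~ /^@/ { $1 = ""; sub(/^[ \t]+/, ""); print; next }
        NF > 5    { $1 = $2 = $3 = $4 = $5 = ""; sub(/^[ \t]+/, ""); print }
    '
}

# Hypothetical usage:
#   crontab -l | cron_commands
```

The output still contains arguments and redirections, so a real checker would probably trim each line down to the script path before matching it against the process list.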

Take a shot at that. Get back to us if you run into problems.

Last edited by scasey; 07-31-2017 at 07:24 PM.
 
Old 08-02-2017, 10:50 AM   #8
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,910

Rep: Reputation: 1512
One option you have is to do a timeout - a small shell script put in the background for a given amount of time, started from the job with "timeout.sh 3h $$ &"...

Code:
#!/bin/sh
# timeout.sh - sleep for a given time then send an alarm to the specified process
# usage:
#      timeout.sh 3h $$ &
# where:
#      3h -> sleep for three hours
#      $$ -> process to send the alarm to

sleep "$1"
kill -s ALRM "$2"

for instance.
Then in the cron job have
Code:
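#!/bin/sh
# sample startup...
# techteam@example.com is a placeholder recipient
mail_error(){
mail -s "$1" techteam@example.com <<EOF
Timeout error from xyz
EOF
}

# Note: the shell runs the trap only after the current foreground command returns.
trap 'mail_error "$0"' ALRM
timeout.sh 3h $$ &
<the rest of your script>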
Or something like that.
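
One caveat with this pattern is that a shell defers a trap until the foreground command it is waiting on returns, so a job hung in a foreground command would not see the alarm in time. A variation that avoids this, offered only as a sketch, runs the work in the background and waits for it, because wait returns as soon as a trapped signal arrives. The names run_with_alarm and on_timeout are hypothetical, and returning 124 merely mirrors the convention of the timeout command.

```shell
#!/bin/sh
# on_timeout is a hypothetical hook; replace the echo with a mail command.
on_timeout() {
    echo "timeout: $*" >&2
}

# run_with_alarm LIMIT COMMAND [ARGS...]
# Run COMMAND in the background and stop it if it outlives LIMIT
# (any duration that sleep accepts).
run_with_alarm() {
    alarm_limit=$1
    shift
    alarm_fired=0
    trap 'alarm_fired=1' ALRM
    "$@" &
    alarm_job=$!
    # Timer: sleep, then signal the main shell ($$ keeps the main shell's
    # PID inside the subshell). Output is detached so that a leftover
    # sleep holds no pipes open.
    ( sleep "$alarm_limit"; kill -s ALRM $$ ) >/dev/null 2>&1 &
    alarm_timer=$!
    wait "$alarm_job"              # returns early if ALRM arrives
    alarm_status=$?
    kill "$alarm_timer" 2>/dev/null
    wait "$alarm_timer" 2>/dev/null
    trap - ALRM
    if [ "$alarm_fired" -eq 1 ]; then
        on_timeout "$@"
        kill "$alarm_job" 2>/dev/null
        wait "$alarm_job" 2>/dev/null
        return 124
    fi
    return "$alarm_status"
}

# Hypothetical usage inside a cron script:
#   run_with_alarm 3h /usr/local/bin/someslowscript
```

When the job finishes first, the timer subshell is killed; its sleep child may linger until it expires on its own, which is harmless.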
 