LinuxQuestions.org
Old 10-11-2007, 09:54 AM   #1
RaelOM
Member
 
Registered: Dec 2004
Posts: 110

Rep: Reputation: 16
Bash scripting to match ps -o etime


I'm trying to write a script that will scan the process tree for a certain set of processes that have been running longer than 12 hours.

Using the following string prints out the processes I'm interested in.

CODE:
for i in `cat glist `; do ps -e -o user,pid,etime,comm | grep $i| grep -v httpd |grep -v sidd ; done


RESULTS:
edihub01 11559 13:27:54 w10_nextHtml.cg
edihub01 12157 13:24:50 w04_doOB.cgi
pos01 14259 1-02:54:51 ver3_ProcTimeIn


Now I need to check that output and pick out, with grep if possible, the processes that have been running for more than 12 hours.

Obviously all three qualify in this case, which is fine, so I isolate the etime field:

CODE:
for i in `cat glist `; do ps -e -o user,pid,etime,comm | grep $i| grep -v httpd |grep -v sidd | awk '{print $3}'; done


RESULTS:
13:29:35
13:26:31
1-02:56:32


Now I try to grep within that output using the following...

CODE:
for i in `cat glist `; do ps -e -o user,pid,etime,comm | grep $i| grep -v httpd |grep -v sidd | awk '{print $3}' | grep *[12-23]:[0-9][0-9]:[0-9][0-9]; done

RESULTS:
"Nothing"


I'm trying to set up the grep to say: display anything over 12 hours, 00 minutes and 00 seconds.

What am I doing wrong? Why isn't this working like I expect?
 
Old 10-11-2007, 10:14 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

This: grep *[12-23]:[0-9][0-9]:[0-9][0-9]
Should be: grep [12-23]:[0-9][0-9]:[0-9][0-9] (* is removed).

You could also combine some of the statements:
grep -v httpd |grep -v sidd
becomes:
egrep -v "httpd|sidd"

awk '{print $3}' | grep [12-23]:[0-9][0-9]:[0-9][0-9]
Becomes
awk '/[12-23]:[0-9][0-9]:[0-9][0-9]/ {print $3}'

You'll end up with:

for i in `cat glist `; do ps -e -o user,pid,etime,comm | grep $i| egrep -v "httpd|sidd" | awk '/[12-23]:[0-9][0-9]:[0-9][0-9]/ {print $3}'; done

Hope this helps.
 
Old 10-11-2007, 10:20 AM   #3
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
The [] in regular expressions does not mean an integer range; it means a list of characters, e.g. [abc] matches a, b or c. The form [a-c] is shorthand for "all the characters between a and c inclusive", and works with letters and digits. [12-23] therefore matches a single character: 1, 2, or 3 (the range "2-2" is just "2"). The * belongs after a pattern, not before it, and means "0 or more occurrences of the previous pattern".
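A quick way to see this for yourself is to feed grep a few one- and two-character lines and ask for whole-line matches. This is only an illustrative sketch; the function name bracket_demo and the sample values are made up for the demonstration.

```shell
# [12-23] is the character set {1, 2, 3}: it matches ONE character,
# never the two-digit numbers 12 through 23.
bracket_demo() {
    # -x makes grep match whole lines only, so "12" (two characters) fails.
    printf '%s\n' 1 2 3 4 12 | grep -x '[12-23]'
}

bracket_demo
# prints:
# 1
# 2
# 3
```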

You could split the pattern up a little to help the thinking process.
Code:
1[2-9]:[0-9][0-9]:[0-9][0-9]                   # 12-19h runtime
2[0-3]:[0-9][0-9]:[0-9][0-9]                   # 20-23h runtime
[0-9][0-9]*-[0-9][0-9]:[0-9][0-9]:[0-9][0-9]   # 1 day or more runtime
You could provide each of these patterns separately using multiple -e options to grep, enclosing the patterns in quotes. The quotes matter: if you happen to have a file in the working directory whose name matches the pattern you are grepping for, the shell will do the substitution and pass the file name to grep as the pattern, which can cause nasty unexpected behaviour.
Code:
grep -e '1[2-9]:[0-9][0-9]:[0-9][0-9]' -e '2[0-3]:[0-9][0-9]:[0-9][0-9]' -e '[0-9][0-9]*-[0-9][0-9]:[0-9][0-9]:[0-9][0-9]'
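The point about quoting can be reproduced in a throwaway directory. The sketch below is not from the thread: the function name glob_demo and the file name 1:00:00 are invented, and echo stands in for grep so you can see exactly what argument the command would receive.

```shell
# Runs in a subshell (note the parentheses) so the caller's working
# directory is left alone.
glob_demo() (
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    touch '1:00:00'        # a file whose name the pattern happens to match

    # Unquoted: the shell replaces the pattern with the matching file name.
    echo unquoted: [12-23]:[0-9][0-9]:[0-9][0-9]
    # Quoted: the pattern is passed through untouched.
    echo quoted: '[12-23]:[0-9][0-9]:[0-9][0-9]'

    cd / && rm -rf "$tmp"
)

glob_demo
# prints:
# unquoted: 1:00:00
# quoted: [12-23]:[0-9][0-9]:[0-9][0-9]
```

When no file matches, most shells pass the unquoted pattern through unchanged, which is why the problem can stay hidden until a matching file appears.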
You could also consolidate them into a single pattern, using (pattern1|pattern2|pattern3) for the parts which vary, and use egrep, which supports this extended RE type. Personally I prefer using multiple -e options as it is clearer, but if you want to do it all in one, here's how:
Code:
egrep '(1[2-9]|2[0-3]|[0-9][0-9]*-[0-9][0-9]):[0-9][0-9]:[0-9][0-9]'
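To check the combined pattern without waiting for a long-running process, you can run it over canned values. In this sketch the first three etime values come from the first post; the last two (03:15:20 and 45:10) are invented short runtimes added for contrast, and sample_etimes is just a helper name. grep -E is used because it is the standard spelling of egrep.

```shell
# Sample etime values, one per line.
sample_etimes() {
    printf '%s\n' 13:29:35 13:26:31 1-02:56:32 03:15:20 45:10
}

# Same pattern as above, run through grep -E.
sample_etimes | grep -E '(1[2-9]|2[0-3]|[0-9][0-9]*-[0-9][0-9]):[0-9][0-9]:[0-9][0-9]'
# prints:
# 13:29:35
# 13:26:31
# 1-02:56:32
```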
 
Old 10-11-2007, 10:21 AM   #4
Hobbletoe
Member
 
Registered: Sep 2004
Location: Dayton, Oh
Distribution: Linux Mint 17
Posts: 150

Rep: Reputation: 18
I know why it doesn't work, but don't do enough with regular expressions to tell you how to fix it properly. Your problem is that a regular expression does not understand two-digit numbers. You can tell it 12, but it looks literally for a 1 followed by a 2. It doesn't start at 12 and then look for 13, 14, 15, ... it just doesn't work that way. You might want something like ...

Code:
egrep '(1[2-9]|2[0-3])'
That should translate to 12-23, though as I've said, I don't use regular expressions often.
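Since a regular expression cannot compare numbers, a different technique from the one used in this thread is to convert etime to seconds and compare numerically. The following is only a sketch under assumptions: etime_to_seconds is a made-up helper name, and it assumes etime always has the form [[dd-]hh:]mm:ss.

```shell
# Convert a ps etime value ([[dd-]hh:]mm:ss) to a number of seconds.
etime_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0
        rest = $1
        # Split off the optional "dd-" day prefix.
        if (index(rest, "-") > 0) {
            split(rest, d, "-")
            days = d[1]
            rest = d[2]
        }
        # What remains is either hh:mm:ss or mm:ss.
        n = split(rest, t, ":")
        if (n == 3)
            secs = t[1] * 3600 + t[2] * 60 + t[3]
        else
            secs = t[1] * 60 + t[2]
        print days * 86400 + secs
    }'
}

# 12 hours = 43200 seconds; print only the values above that.
for e in 13:27:54 1-02:54:51 45:10; do
    if [ "$(etime_to_seconds "$e")" -gt 43200 ]; then
        echo "$e"
    fi
done
# prints:
# 13:27:54
# 1-02:54:51
```

With this approach the threshold is a single number, so changing "12 hours" to some other limit does not require rewriting a pattern.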
 
Old 10-11-2007, 10:35 AM   #5
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Maybe a little off topic, but if you have a list of process names in glist, you can shorten the ps command like this:
Code:
ps --no-headers -C $i -o etime
--no-headers: do not print the header line
-C: select processes by command name
In the -o output list you can then ask for etime only. That way you don't need the grep -v httpd |grep -v sidd | awk '{print $3}' part.
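Put together with the pattern from post #3, the shortened ps call might be used roughly as follows. Treat this as a sketch: over_12h and scan_glist are hypothetical names, the file glist is the one from the first post, and the ps options are the procps ones shown above. The wrapper is only defined, not run, so the filter can be tried on canned input first.

```shell
# Keep only etime values of 12 hours or more (one etime per line on stdin).
# ps right-justifies the column, so leading spaces are allowed.
over_12h() {
    grep -E '^ *(1[2-9]|2[0-3]|[0-9]+-[0-9][0-9]):[0-9][0-9]:[0-9][0-9]$'
}

# Hypothetical wrapper: one ps call per process name listed in "glist".
scan_glist() {
    while read -r name; do
        ps --no-headers -C "$name" -o etime
    done < glist | over_12h
}

# Try the filter on canned input that mimics ps output.
printf '%s\n' '   13:27:54' ' 1-02:54:51' '   03:15:20' | over_12h
# prints:
#    13:27:54
#  1-02:54:51
```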
 
Old 10-11-2007, 10:36 AM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

@matthewg42: You are absolutely correct !! (dumb I overlooked that......).
 
Old 10-11-2007, 11:53 AM   #7
RaelOM
Member
 
Registered: Dec 2004
Posts: 110

Original Poster
Rep: Reputation: 16
Quote:
Originally Posted by matthewg42 View Post


KICK BUTT! Thanks Matt.

Could you do me a favor now and explain that last code segment to me so I can learn to fish?
 
Old 10-11-2007, 12:25 PM   #8
Hobbletoe
Member
 
Registered: Sep 2004
Location: Dayton, Oh
Distribution: Linux Mint 17
Posts: 150

Rep: Reputation: 18
Quote:
Originally Posted by RaelOM View Post
KICK BUTT! Thanks Matt.

Could you do me a favor now and explain that last code segment to me so I can learn to fish?
He kind of spells it out in the first section of code. The only difference is that instead of three separate patterns, he condenses it down to one. This was done by taking everything that was dissimilar (the days and hours) and putting it in parentheses, and keeping everything that was the same (minutes and seconds) outside the parentheses. The parentheses create a group that the regexp interprets as "one of these" (the alternatives being separated by a bar '|'), followed by the rest of the expression.

Hope that makes it a bit clearer.
 
Old 10-11-2007, 03:43 PM   #9
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
More formally, (pattern1|pattern2|pattern3) matches if pattern1 or pattern2 or pattern3 matches. The | is a logical or; the parentheses group the sub-patterns in the or.
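One way to convince yourself of what the group accepts is to generate every hour from 00 to 23 and see which ones survive. This is an illustrative sketch only; hours_matched is a made-up function name, and the pattern is the hours part of the expression from post #3, anchored to the start of the line.

```shell
# Print 00:00:00 .. 23:00:00 and keep the lines whose hour field
# matches the first OR the second alternative in the group.
hours_matched() {
    h=0
    while [ "$h" -le 23 ]; do
        printf '%02d:00:00\n' "$h"
        h=$((h + 1))
    done | grep -E '^(1[2-9]|2[0-3]):'
}

# Show just the hour field of each surviving line.
hours_matched | cut -d: -f1 | tr '\n' ' '; echo
# prints: 12 13 14 15 16 17 18 19 20 21 22 23
```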
 
  

