LinuxQuestions.org
Old 03-19-2013, 04:50 AM   #1
genderbender
Member
 
Registered: Jan 2005
Location: US
Distribution: Centos, Ubuntu, Solaris, Redhat
Posts: 396

Rep: Reputation: 31
My script uses lots of cpu


Hi guys, I wrote a script which reads through a log and writes its contents to an SQL database. It occasionally runs very slowly and uses lots of CPU, and other times it doesn't populate the database. Can someone offer some assistance? It runs once every 15 minutes via cron.

Code:
#!/bin/bash

LOG=/var/log/tacacs.log
PIDFILE=/var/run/tac2sql.pid
TIME=`cat /var/log/tacacs.log | grep task_id | grep cmd | awk '{print $3}' | uniq`

if [ -e /var/run/tac2sql.pid ]; then
        echo "pid file exists, exiting" >> /var/log/tacacs.log
else
        touch /var/run/tac2sql.pid
for TIME in `echo $TIME`;
do
        month=`grep $TIME $LOG | grep cmd | awk '{print $1}' | tail -n 1`
        day=`grep $TIME $LOG | grep cmd | awk '{print $2}' | tail -n 1`
        destination_ip=`grep $TIME $LOG | grep cmd | awk '{print $4}' | tail -n 1`
        user=`grep $TIME $LOG | grep cmd |  awk '{print $5}' | tail -n 1`
        source_ip=`grep $TIME $LOG | grep cmd | awk '{print $7}' | tail -n 1`
        command=`grep $TIME $LOG | awk 'gsub(/.*cmd=| <cr>.*/,"")' | tail -n 1`
        firewall_test=`echo $command | grep -o "service=shell" | wc -l`
        if [ $firewall_test -eq 1 ]; then
                command=`grep $TIME $LOG | awk 'gsub(/.*cmd=| service.*/,"")' | grep 'service=shell' | sed 's/service=shell.*//'`
        fi
        task_id=`grep $TIME $LOG | grep cmd | awk '{print $9}' | grep -o "[0-9]" | tail -n 1`
        task_number_time=`echo $user$task_id$TIME | tr -d ":" | tail -n 1`
        query1="SELECT COUNT(1) FROM tacacs.tacacs_log WHERE task_number_time = '$task_number_time';"
        RCOUNT=`mysql -u root -p!PASSWORD -s -e "$query1"`
        if [ $RCOUNT -eq 0 ]; then
                query="INSERT INTO tacacs.tacacs_log(month, day, time, username, source_ip, destination_ip, command, task_number_time) VALUES('$month', '$day', '$TIME', '$user', '$source_ip', '$destination_ip', '$command', '$task_number_time');"
                mysql -u root -p!PASSWORD -s -e "$query" &> /dev/null
        fi
done
rm -f /var/run/tac2sql.pid
fi
 
Old 03-19-2013, 05:49 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,804

Rep: Reputation: 7306
just one tip:
it forks a lot of tasks like cat, awk, grep, tail, tr....
instead of:
cat /var/log/tacacs.log | grep task_id | grep cmd | awk '{print $3}' | uniq
you can write:
awk ' /task_id.*cmd/ { print $3; exit } ' /var/log/tacacs.log
(or /cmd.*task_id/ whichever comes first)

you saved 4 new processes.
So you need to optimize all those chains and substitute with only one awk script.
 
Old 03-19-2013, 07:05 AM   #3
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,804

Rep: Reputation: 7306
Also, you loop over that log file twice: inside the for loop you grep the whole log file again several times for every time value. You can simplify this with a single awk (or perl, or ...) script:
# pseudo code
awk '
# next if cmd not found
{
# this will automatically store the last values for every time value.
time = $3
month[time] = $1
day[time] = $2
....
}
END {
# print sql script
}
' # end of awk
# execute one single sql


Probably we can give better advice if you show us the structure of the logfile.
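
A minimal runnable sketch of the pseudo code above, under stated assumptions. The field positions and the two sample lines are taken from the log excerpt posted later in this thread (post #7). The use of INSERT IGNORE assumes a UNIQUE index on task_number_time, which the thread does not confirm; without one, the per-row SELECT COUNT check from the original script would still be needed. Unlike the original script, this keeps the whole task id rather than its last digit, and it only escapes single quotes in the values.

```shell
#!/bin/bash
# Sample input: the two log lines the original poster shows in post #7.
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0
EOF

# One pass over the log. q holds a single quote so the awk program needs none.
sql=$(awk -v q="'" '
function s(v) { gsub(q, q q, v); return q v q }   # SQL-quote v, doubling embedded quotes
/task_id=.*cmd=/ {
    cmd = $0
    sub(/.*cmd=/, "", cmd)                 # drop everything up to the command text
    sub(/ (<cr>|service=).*/, "", cmd)     # drop the trailing <cr> or service=... part
    id = $9; sub(/^task_id=/, "", id)      # whole task id (assumption, see above)
    t = $3;  gsub(/:/, "", t)              # time without colons
    if (!($3 in row)) order[++n] = $3      # remember first-seen order of each time
    # A later line with the same time overwrites the earlier one (last value wins).
    row[$3] = "(" s($1) ", " s($2) ", " s($3) ", " s($5) ", " s($7) ", " s($4) ", " s(cmd) ", " s($5 id t) ")"
}
END {
    for (i = 1; i <= n; i++)
        print "INSERT IGNORE INTO tacacs.tacacs_log(month, day, time, username, source_ip, destination_ip, command, task_number_time) VALUES" row[order[i]] ";"
}' "$log")
rm -f "$log"

printf '%s\n' "$sql"
# The statements could then be loaded with a single client call, for example:
#   mysql tacacs < statements.sql
```

The point of the design is that the log is read once and the mysql client is started once, instead of several greps and two mysql calls per time value.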
 
Old 03-19-2013, 10:50 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
On top of pan64's advice, I would really rethink using the same variable for your loop as has already been set elsewhere, unless of course your intention is to change the value prior to each loop, which in itself sounds fraught with danger.
Code:
for TIME in `echo $TIME`
 
Old 03-19-2013, 01:30 PM   #5
gnashley
Amigo developer
 
Registered: Dec 2003
Location: Germany
Distribution: Slackware
Posts: 4,928

Rep: Reputation: 612
I counted 39 calls to an external program for each loop. Something like this eliminates a bunch of them:
Code:
set -- `grep $TIME $LOG | grep cmd | tail -n 1` 
month=$1
day=$2
destination_ip=$4
user=$5
source_ip=$7
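
A small self-contained sketch of what the set -- line does, assuming one of the sample log lines from post #7 in place of the grep | tail output. set itself prints nothing; it only replaces the positional parameters. The set -f / set +f pair and the task_id line are additions for illustration, not part of the snippet above.

```shell
#!/bin/bash
# One sample line from post #7 stands in for: grep $TIME $LOG | grep cmd | tail -n 1
line='Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0'

set -f          # turn off globbing while the line is split into words
set -- $line    # deliberately unquoted: each word becomes $1, $2, ...
set +f

month=$1
day=$2
time=$3
destination_ip=$4
user=$5
source_ip=$7
task_id=${9#task_id=}   # strip the "task_id=" prefix with a built-in expansion

echo "$month $day $time $user $source_ip -> $destination_ip (task $task_id)"
```

One line is read and split once, so the five separate grep | awk | tail pipelines collapse into a single one.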
 
Old 03-20-2013, 06:59 AM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Here are a few more things you should work to avoid or correct:


1) Don't Read Lines With For

2) Useless Use Of Cat (and grep, etc)

3) $(..) is highly recommended over `..`

4) Scripting With Style
(e.g. Effective use of whitespace and using lowercase names for your variables.)

5) QUOTE ALL OF YOUR VARIABLE EXPANSIONS. You should never leave the quotes off a parameter expansion unless you explicitly want the resulting string to be word-split by the shell and possible globbing patterns expanded. This is a vitally important concept in scripting, so train yourself to do it correctly now. You can learn about the exceptions later.

http://mywiki.wooledge.org/Arguments
http://mywiki.wooledge.org/WordSplitting
http://mywiki.wooledge.org/Quotes

6) When using bash or ksh, it's recommended to use [[..]] for string/file tests, and ((..)) for numerical tests. Avoid using the old [..] test unless you specifically need POSIX-style portability.

http://mywiki.wooledge.org/BashFAQ/031
http://mywiki.wooledge.org/ArithmeticExpression

7) Don't use single, scalar variables when you have lists of things. Always use arrays when you have multiple related values to process.


8) And look here for various ways to replace external commands with shell built-ins:

string manipulations in bash

( In short, use external commands like grep and awk when you need to operate on whole files or large text blocks at once, such as when extracting text strings for later use. But once you have those strings stored in variables, it's almost always more efficient to use built-ins to process them. )
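
A rough sketch of how several of the points above fit together, assuming the two sample log lines from post #7 plus one non-command line of the kind quoted in post #14. It reads the file with while/read, tests with [[..]] and ((..)), collects results in an array, and extracts the command with parameter expansions instead of grep, awk and sed. The variable names are made up for the example.

```shell
#!/bin/bash
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Wed Mar 20 16:37:56 2013 [1234]: connect from 123.123.123.123 [123.123.123.123]
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0
EOF

commands=()   # an array, not a space-separated scalar
count=0

# Read line by line with while/read rather than "for x in `...`".
while IFS= read -r line; do
    [[ $line == *task_id=*cmd=* ]] || continue   # pattern test, no grep needed
    cmd=${line##*cmd=}        # remove everything through "cmd="
    cmd=${cmd%% <cr>*}        # remove a trailing " <cr>..." if present
    cmd=${cmd%% service=*}    # remove a trailing " service=..." if present
    commands+=("$cmd")
    (( ++count ))
done < "$log"
rm -f "$log"

if (( count > 0 )); then
    printf '%s\n' "${commands[@]}"
fi
```

No external command runs inside the loop, so the per-line cost is only shell built-ins.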
 
Old 03-20-2013, 07:14 AM   #7
genderbender
Member
 
Registered: Jan 2005
Location: US
Distribution: Centos, Ubuntu, Solaris, Redhat
Posts: 396

Original Poster
Rep: Reputation: 31
Couple of lines from my log, one from a firewall and one from a script (notice the difference):

Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0

I realised why cpu was overly high though, the logs weren't rotating properly so my script was reading 80mb worth of logs rather than 2mb. I could still do with some help optimizing though. I'll read through some of the comments, although my adaptions have not worked :/... Wrong fields or no fields output for example.

Last edited by genderbender; 03-20-2013 at 07:19 AM.
 
Old 03-20-2013, 07:56 AM   #8
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,804

Rep: Reputation: 7306
we will gladly help you to fix those problems, just show us what you have tried (and what went wrong)
 
Old 03-20-2013, 09:07 AM   #9
genderbender
Member
 
Registered: Jan 2005
Location: US
Distribution: Centos, Ubuntu, Solaris, Redhat
Posts: 396

Original Poster
Rep: Reputation: 31
Only one time is echoed when it should read line by line searching for that specific time:
Quote:
awk ' /task_id.*cmd/ { print $3; exit } ' /var/log/tacacs.log
I've never used set, but the commands after set return nothing...
Quote:
set -- `grep $TIME $LOG | grep cmd | tail -n 1`
Shall I supply some code to go with this? Thanks for your help, by the way.
 
Old 03-20-2013, 09:19 AM   #10
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,804

Rep: Reputation: 7306
yes, that awk will stop at the first line. Probably that is not what you want. You can try this:
awk ' /task_id.*cmd/ { print $3 } ' /var/log/tacacs.log | uniq
or even better you can implement uniq in awk:
awk ' /task_id.*cmd/ { times[$3] } END { for ( key in times ) { print key } } ' /var/log/tacacs.log

set will not print anything; it sets $1, $2 ... $7 for you. So after that line you can use the variables $1, $2 ... as month, day...
 
Old 03-20-2013, 09:30 AM   #11
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,649
Blog Entries: 4

Rep: Reputation: 3934
Good grief... what people can manage to do with a shell-script!

Use a real programming-language, designed for this purpose. There are lots to choose from. Even PHP can be used for scripting.

The first line of your script, the so-called #!shebang, specifies which command-processor should be used to execute it. (Do you, say, know PHP? Then, use that. You can do that, you know ...)

Your script is built in an incomprehensibly inefficient way, launching an instance of the mysql process, with unlimited access to the database, to insert every single line.

No wonder your computer gets mad at you. I'm surprised it hasn't removed itself from the rack and skipped town.
 
Old 03-20-2013, 09:40 AM   #12
genderbender
Member
 
Registered: Jan 2005
Location: US
Distribution: Centos, Ubuntu, Solaris, Redhat
Posts: 396

Original Poster
Rep: Reputation: 31
To quote wikipedia:

"PHP is a server-side scripting language designed for Web development". This isn't web development, just needs to read from a text file and put the contents in a database. Perl was an option, but bash was quicker for me. There were options to write directly to a database which I chose not to do when implementing tacacs, I also chose (possibly unwisely) to use root as my user, but then again there's just one database on the system, not multiple so I don't have much concern. For all your rubbishing you've been zero help, perhaps positivity and ideas is a better idea than going "why not just write it in something else". PHP isn't exactly a good programming language anyway, that being said I wouldn't dream of going to a php forum and going "why not use C++ instead or an OO language you fools". I'd also like to add - this server has a small footprint and doesn't have php installed... I reckon calling php would use more memory than bash (I might be wrong here though).

Last edited by genderbender; 03-20-2013 at 09:45 AM.
 
Old 03-20-2013, 10:13 AM   #13
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,005

Rep: Reputation: 3191
Actually the uniq idea in awk can be a little simpler:
Code:
awk '/task_id.*cmd/ && ! _[$3]++{print $3}' /var/log/tacacs.log
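
To see what the one-liner does, here is a small sketch run against a temporary file instead of /var/log/tacacs.log. The sample reuses the two lines from post #7; the repeated first line and the "connect from" line are added for illustration only. The array is named seen here instead of _ purely for readability.

```shell
#!/bin/bash
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Wed Mar 20 16:37:56 2013 [1234]: connect from 123.123.123.123 [123.123.123.123]
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
EOF

# seen[$3]++ evaluates to 0 the first time a time value appears, so
# !seen[$3]++ is true exactly once per distinct time, even when the
# duplicates are not adjacent (plain uniq only drops adjacent repeats).
times=$(awk '/task_id.*cmd/ && !seen[$3]++ { print $3 }' "$log")
rm -f "$log"

printf '%s\n' "$times"
```

Lines without both task_id and cmd, such as the "connect from" line, never reach the print action.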
 
Old 03-20-2013, 11:40 AM   #14
genderbender
Member
 
Registered: Jan 2005
Location: US
Distribution: Centos, Ubuntu, Solaris, Redhat
Posts: 396

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by grail View Post
Actually the uniq idea in awk can be a little simpler:
Code:
awk '/task_id.*cmd/ && ! _[$3]++{print $3}' /var/log/tacacs.log
I'm getting some quite odd results now, some of the fields do not complete when using this one liner... Any ideas? I think the search for task_id is not working or something and results are returned where nothing has been actioned. E.g. I've got results such as:

Wed Mar 20 16:37:56 2013 [1234]: connect from 123.123.123.123 [123.123.123.123]

Last edited by genderbender; 03-20-2013 at 11:42 AM.
 
Old 03-20-2013, 11:55 AM   #15
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
So far only sundialsvcs has seized upon the correct answer. The original question reads like: 'I am building a house using a butterknife, a large flat rock and a couple of knitting needles, and it seems to be taking forever. How can I speed up the process?'
Using the correct tools for the job is the essential first step in optimization. A compiled application with a properly designed database and database access methods seems to be a much better approach. Even a fast scripting language (singular) like Perl with a good library for DBMS access will probably result in adequate speed and CPU usage without doing any database optimization.

--- rod.
 
  



All times are GMT -5. The time now is 04:12 PM.
