My script uses lots of CPU
Hi guys, I wrote a script which reads through a log and writes the contents to an SQL file. It occasionally runs very slowly and uses a lot of CPU, and at other times doesn't populate the SQL file at all. It runs once every 15 minutes via cron. Can someone offer some assistance?
Code:
#!/bin/bash
Just one tip: it forks a lot of processes, such as cat, awk, grep, tail, tr, and so on. Instead of:
Code:
cat /var/log/tacacs.log | grep task_id | grep cmd | awk '{print $3}' | uniq
you can write:
Code:
awk ' /task_id.*cmd/ { print $3; exit } ' /var/log/tacacs.log
(or /cmd.*task_id/, whichever comes first). That saves four new processes. So you need to optimize all those chains and substitute them with a single awk script.
Also, you made a double loop over that log file, so inside the for loop you grep the log file again several times. You can simplify it with a single awk (or perl, or ...) script:
Code:
# pseudo code
awk '
# next if cmd not found
{
    # this will automatically store the last values for every time value
    time = $3
    month[time] = $1
    day[time]   = $2
    ....
}
END {
    # print sql script
}
'
# end of awk
# then execute one single sql command
We can probably give better advice if you show us the structure of the logfile.
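A minimal, hypothetical sketch of that single-pass idea: the field positions ($1=month, $2=day, $3=time) follow the sample log lines posted later in the thread, and the table name tacacs_log plus the sample data are made up for illustration.

```shell
#!/bin/bash
# Sample lines in the same shape as the tacacs log (made-up values).
cat > /tmp/tacacs_sample.log <<'EOF'
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 service=shell cmd=show env all
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy running-config tftp service=shell
EOF

# One awk pass: remember the last matching line per timestamp, then
# emit one INSERT per timestamp in the END block.
awk -v q="'" '
/task_id/ && /cmd/ {
    t = $3                  # last matching line wins for each time
    month[t] = $1
    day[t]   = $2
}
END {
    for (t in month)
        printf "INSERT INTO tacacs_log (month, day, time) VALUES (%s%s%s, %s%s%s, %s%s%s);\n",
               q, month[t], q, q, day[t], q, q, t, q
}
' /tmp/tacacs_sample.log > /tmp/tacacs.sql

cat /tmp/tacacs.sql
```

The generated file can then be fed to mysql in one invocation (e.g. `mysql -u someuser -p somedb < /tmp/tacacs.sql`) instead of launching mysql once per line.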
On top of pan64's advice, I would really re-think using the same variable for your loop as has already been set elsewhere, unless of course your intention is to change the value prior to each loop, which in itself sounds fraught with danger:
Code:
for TIME in `echo $TIME`
I counted 39 calls to an external program for each loop iteration. Something like this eliminates a bunch of them:
Code:
set -- `grep $TIME $LOG | grep cmd | tail -n 1`
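To illustrate what `set --` does here, a small sketch with a made-up line: it replaces the shell's positional parameters with the words of its arguments, so the whitespace-split fields become $1, $2, $3, and so on.

```shell
#!/bin/bash
# `set --` turns a captured line into positional parameters.
line="Mar 20 09:14:47 1.2.3.4 user"
set -- $line          # deliberately unquoted: we WANT word splitting here
month=$1 day=$2 time=$3
echo "$month $day $time"   # prints: Mar 20 09:14:47
```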
Here are a few more things you should work to avoid or correct:
1) Don't read lines with for.
2) Avoid useless use of cat (and grep, etc.).
3) $(..) is highly recommended over `..`.
4) Script with style (e.g. effective use of whitespace, and lowercase names for your variables).
5) QUOTE ALL OF YOUR VARIABLE EXPANSIONS. You should never leave the quotes off a parameter expansion unless you explicitly want the resulting string to be word-split by the shell and possible globbing patterns expanded. This is a vitally important concept in scripting, so train yourself to do it correctly now. You can learn about the exceptions later.
http://mywiki.wooledge.org/Arguments
http://mywiki.wooledge.org/WordSplitting
http://mywiki.wooledge.org/Quotes
6) When using bash or ksh, it's recommended to use [[..]] for string/file tests and ((..)) for numerical tests. Avoid the old [..] test unless you specifically need POSIX-style portability.
http://mywiki.wooledge.org/BashFAQ/031
http://mywiki.wooledge.org/ArithmeticExpression
7) Don't use single, scalar variables when you have lists of things. Always use arrays when you have multiple related values to process.
8) And look here for various ways to replace external commands with shell built-ins: string manipulations in bash. (In short, use external commands like grep and awk when you need to operate on whole files or large text blocks at once, such as when extracting text strings for later use. But once you have those strings stored in variables, it's almost always more efficient to use built-ins to process them.)
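A quick sketch of points 5 through 7 in action; the variable names and values are purely illustrative.

```shell
#!/bin/bash
# 5) Quote expansions: without the quotes, $msg would be word-split
#    and the * would be expanded as a glob.
msg="two  spaces and a * glob"
echo "$msg"

# 6) [[..]] for string tests, ((..)) for arithmetic tests:
count=3
if [[ $msg == *glob ]] && (( count > 2 )); then
    echo "matched"
fi

# 7) Use an array for a list of values, not one space-separated scalar:
times=("09:14:47" "06:00:25")
for t in "${times[@]}"; do
    echo "time: $t"
done
```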
Couple of lines from my log, one from a firewall and one from a script (notice the difference):
Code:
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0
I realised why CPU usage was so high, though: the logs weren't rotating properly, so my script was reading 80 MB worth of logs rather than 2 MB. I could still do with some help optimizing, though. I'll read through some of the comments, although my adaptations have not worked so far: I get the wrong fields, or no fields, in the output, for example.
We will gladly help you to fix those problems; just show us what you have tried (and what went wrong).
Only one time is echoed, when it should read line by line searching for each specific time:
Yes, that awk will stop at the first line. Probably that is not what you want. You can try this:
Code:
awk ' /task_id.*cmd/ { print $3 } ' /var/log/tacacs.log | uniq
or even better, you can implement uniq in awk:
Code:
awk '
/task_id.*cmd/ { times[$3] }
END {
    for ( key in times )
        print key
}
' /var/log/tacacs.log
set will not return anything, but it sets $1, $2 .... $7 for you. So after that line you can use the variables $1, $2 ... as month, day, and so on.
Good grief... what people can manage to do with a shell-script! :eek:
Use a real programming language, one designed for this purpose. There are lots to choose from; even PHP can be used for scripting. The first line of your script, the so-called #! shebang, specifies which command processor should be used to execute it. (Do you, say, know PHP? Then use that. You can do that, you know...) Your script is built in an incomprehensible, inefficient way, launching an instance of the mysql process, with :eek: unlimited :eek: access to the database, to insert every single line. No wonder your computer gets mad at you. I'm surprised it hasn't removed itself from the rack and skipped town. ;)
To quote wikipedia:
"PHP is a server-side scripting language designed for Web development". This isn't web development, just needs to read from a text file and put the contents in a database. Perl was an option, but bash was quicker for me. There were options to write directly to a database which I chose not to do when implementing tacacs, I also chose (possibly unwisely) to use root as my user, but then again there's just one database on the system, not multiple so I don't have much concern. For all your rubbishing you've been zero help, perhaps positivity and ideas is a better idea than going "why not just write it in something else". PHP isn't exactly a good programming language anyway, that being said I wouldn't dream of going to a php forum and going "why not use C++ instead or an OO language you fools". I'd also like to add - this server has a small footprint and doesn't have php installed... I reckon calling php would use more memory than bash (I might be wrong here though). |
Actually the uniq idea in awk can be a little simpler:
Code:
awk '/task_id.*cmd/ && ! _[$3]++{print $3}' /var/log/tacacs.log
So far only sundialsvcs has seized upon the correct answer. The original question reads like: 'I am building a house using a butterknife, a large flat rock and a couple of knitting needles, and it seems to be taking forever. How can I speed up the process?'
Using the correct tools for the job is the essential first step in optimization. A compiled application with a properly designed database and database access methods seems to be a much better approach. Even a fast scripting language like Perl, with a good library for DBMS access, will probably give adequate speed and CPU usage without any database optimization. --- rod.