ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Hi guys, I wrote a script which reads through a log and writes the contents to an SQL file. It occasionally runs very slowly and uses up lots of CPU, and other times it doesn't populate the SQL file at all. Can someone offer some assistance? It runs once every 15 minutes via cron.
just one tip:
it forks a lot of processes, like cat, awk, grep, tail, tr....
instead of: cat /var/log/tacacs.log | grep task_id | grep cmd | awk '{print $3}' | uniq
you can write: awk ' /task_id.*cmd/ { print $3; exit } ' /var/log/tacacs.log
(or /cmd.*task_id/ whichever comes first)
that saves 4 extra processes.
So you need to optimize all those chains and replace each one with a single awk script.
Also, you made a double loop over that log file: inside the for loop you grep the log file again several times. You can simplify it with a single awk (or perl, or ...) script:
# pseudo code
awk '
!/cmd/ { next }        # skip the line if cmd is not found
{
    # this will automatically store the last values for every time value
    time        = $3
    month[time] = $1
    day[time]   = $2
    ....
}
END {
    # print sql script
}
' /var/log/tacacs.log  # end of awk
# execute one single sql
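To make the pseudo code above concrete, here is a self-contained sketch of the one-pass approach. The sample lines are trimmed versions of the log lines posted later in the thread, and the table name `tacacs_log` and its columns are my invention, not the OP's real schema; point awk at the real /var/log/tacacs.log and adjust the INSERT to taste.

```shell
# Invented sample data standing in for /var/log/tacacs.log:
cat > /tmp/sample.log <<'EOF'
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 service=shell cmd=show env all
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy running-config service=shell
EOF

awk '
!/task_id/ || !/cmd/ { next }   # skip lines missing either keyword
{
    time        = $3            # last value seen for each time wins
    month[time] = $1
    day[time]   = $2
}
END {
    # \047 is a single quote, to avoid fighting the shell quoting
    for (t in month)
        printf "INSERT INTO tacacs_log (month, day, time) VALUES (\047%s\047, \047%s\047, \047%s\047);\n", month[t], day[t], t
}
' /tmp/sample.log > /tmp/tacacs.sql

cat /tmp/tacacs.sql
# then: mysql dbname < /tmp/tacacs.sql   (one mysql call for the whole batch)
```

The whole log is read once, and the database is touched once, instead of forking a pipeline per line.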
We can probably give better advice if you show us the structure of the logfile.
On top of pan64's advice, I would really re-think using the same variable for your loop as has already been set elsewhere, unless of course your intention is to change the value prior to each loop,
which in itself sounds fraught with danger??
4) Scripting With Style
(e.g. Effective use of whitespace and using lowercase names for your variables.)
5) QUOTE ALL OF YOUR VARIABLE EXPANSIONS. You should never leave the quotes off a parameter expansion unless you explicitly want the resulting string to be word-split by the shell and possible globbing patterns expanded. This is a vitally important concept in scripting, so train yourself to do it correctly now. You can learn about the exceptions later.
6) When using bash or ksh, it's recommended to use [[..]] for string/file tests, and ((..)) for numerical tests. Avoid using the old [..] test unless you specifically need POSIX-style portability.
( In short, use external commands like grep and awk when you need to operate on whole files or large text blocks at once, such as when extracting text strings for later use. But once you have those strings stored in variables, it's almost always more efficient to use built-ins to process them. )
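A short illustration of points 5 and 6 above (the filename here is invented). Unquoted expansions word-split on whitespace, while parameter expansion and [[..]]/((..)), being built-ins, cost no extra processes:

```shell
f="my file.txt"

# Quoted expansion: passed as ONE argument. Unquoted "$f" would split
# into two words, "my" and "file.txt" -- usually a bug.
printf '%s\n' "$f"

# Built-in parameter expansion instead of forking echo | cut | sed:
ext="${f##*.}"      # strip everything up to the last dot  -> txt
base="${f%.*}"      # strip the extension                  -> my file

# bash/ksh [[..]] for string tests, ((..)) for numeric tests:
if [[ $ext == txt ]] && (( ${#f} > 5 )); then
    echo "ok: '$base' has a .$ext extension"
fi
```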
A couple of lines from my log, one from a firewall and one from a script (notice the difference):
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0
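Since the key=value fields after "stop" appear in a different order in those two lines, parsing by fixed field position ($8, $9, ...) will break on one format or the other. One way around that, sketched here against the two lines above, is to parse the key=value pairs by name (note: values containing spaces, like the full cmd string, get truncated at the first space with this simple split):

```shell
result=$(awk '
/task_id/ && /cmd/ {
    split("", kv)                         # clear the array for each line
    for (i = 1; i <= NF; i++)
        if (split($i, pair, "=") == 2)    # only fields shaped like key=value
            kv[pair[1]] = pair[2]
    print $1, $2, $3, "task_id=" kv["task_id"], "service=" kv["service"]
}' <<'EOF'
Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4 stop task_id=514638 timezone=gmt service=shell start_time=1363770887 priv-lvl=1 cmd=show env all <cr>
Mar 20 06:00:25 4.3.2.1 user2 22 4.3.2.1 stop task_id=81 cmd=copy /noconfirm running-config tftp://127.0.0.1/4501_ConfigFile.txt service=shell elapsed_time=0
EOF
)
printf '%s\n' "$result"
```

Both lines now yield the same named fields regardless of their order in the log.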
I realised why CPU usage was overly high though: the logs weren't rotating properly, so my script was reading 80 MB worth of logs rather than 2 MB. I could still do with some help optimizing though. I'll read through some of the comments, although my adaptations have not worked so far :/... Wrong fields or no fields output, for example.
yes, that awk will stop at the first matching line. Probably that is not what you want. You can try this: awk ' /task_id.*cmd/ { print $3 } ' /var/log/tacacs.log | uniq
or even better you can implement uniq in awk: awk ' /task_id.*cmd/ { times[$3] } END { for ( key in times ) print key } ' /var/log/tacacs.log
set will not return anything, but it will set $1, $2 .... $7 for you. So after that line you can use the variables $1, $2 ... as month, day...
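What that advice looks like in practice (the log line is shortened for clarity, and the variable names are just illustrative):

```shell
line="Mar 20 09:14:47 1.2.3.4 user tty1 1.2.3.4"

# set -- replaces the positional parameters with the words of $line.
# The expansion is deliberately unquoted: here we WANT word-splitting.
set -- $line

month=$1
day=$2
time=$3

echo "$month $day $time"
```

No external process is forked; the shell's own word-splitting does the field extraction.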
Good grief... what people can manage to do with a shell-script!
Use a real programming-language, designed for this purpose. There are lots to choose from. Even PHP can be used for scripting.
The first line of your script, the so-called #! shebang, specifies which command processor should be used to execute it. (Do you, say, know PHP? Then use that. You can do that, you know ...)
Your script is built in an incomprehensibly inefficient way, launching a new instance of the mysql process, with unlimited access to the database, to insert every single line.
No wonder your computer gets mad at you. I'm surprised it hasn't removed itself from the rack and skipped town.
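Whatever language you settle on, the per-line mysql launch is fixable in the shell too. A sketch of the batching idea (the table name `log_lines`, the input file, and the commented mysql invocation are all my stand-ins, and the quoting here is naive illustration only, not safe against hostile input):

```shell
# Invented input standing in for the parsed log:
printf 'line one\nline two\n' > /tmp/batch_input.log

# Build ALL the statements first, then feed them to mysql once.
{
    echo "BEGIN;"
    while IFS= read -r line; do
        # naive quoting, for illustration only -- real code must escape input
        printf "INSERT INTO log_lines (raw) VALUES ('%s');\n" "$line"
    done < /tmp/batch_input.log
    echo "COMMIT;"
} > /tmp/batch.sql

# mysql -u someuser -p dbname < /tmp/batch.sql   # one process, one transaction
```

One mysql process and one transaction per cron run, instead of one process per log line.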
"PHP is a server-side scripting language designed for Web development". This isn't web development; it just needs to read from a text file and put the contents in a database. Perl was an option, but bash was quicker for me. There were options to write directly to a database, which I chose not to do when implementing tacacs. I also chose (possibly unwisely) to use root as my user, but then again there's just one database on the system, not multiple, so I don't have much concern. For all your rubbishing you've been zero help; positivity and ideas are more useful than "why not just write it in something else". PHP isn't exactly a good programming language anyway. That being said, I wouldn't dream of going to a PHP forum and saying "why not use C++ instead, or an OO language, you fools". I'd also like to add: this server has a small footprint and doesn't have PHP installed... I reckon calling PHP would use more memory than bash (I might be wrong here though).
I'm getting some quite odd results now; some of the fields do not complete when using this one-liner... Any ideas? I think the search for task_id is not working or something, and results are returned where nothing has been actioned. E.g. I've got results such as:
Wed Mar 20 16:37:56 2013 [1234]: connect from 123.123.123.123 [123.123.123.123]
So far only sundialsvcs has seized upon the correct answer. The original question reads like: 'I am building a house using a butterknife, a large flat rock and a couple of knitting needles, and it seems to be taking forever. How can I speed up the process?'
Using the correct tools for the job is the essential first step in optimization. A compiled application with a properly designed database and database-access methods would be a much better approach. Even a single fast scripting language like Perl, with a good library for DBMS access, will probably deliver adequate speed and CPU usage without doing any database optimization.