Read .log format file and get special character from some lines by shell scripting
Hi Dear Users,
I want to write a shell script that reads a .log file and extracts specific fields from it. My log format looks like this:
--471ea136-A--
[11/Jul/2013:06:42:08 --0400] Ud6MAH8AAAEAAAn4YBoAAAAK 192.168.153.128 42977 192.168.153.128 80
--471ea136-B--
GET /inssgtz7ieltdSstbw7e/neQhmsdwu7imdb0etet/eT/hsvegbff/EH/niRAvLwGK_L/osLnBWcHRk5oGMI/tmLJFqSww/sSjS6KRJB.html?Settotzeertnl=%27pn+&8nafitm=74LuKUC5t0J&4ttNe=Anmsyusi6&Mf1g-vYqyx=elTTsw&Euoytxp$
--471ea136-F--
HTTP/1.1 501 Method Not Implemented
Allow: TRACE
Connection: close
Content-Type: text/html; charset=iso-8859-1
--471ea136-H--
Message: Access denied with code 406 (phase 2). Pattern match "(^[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+|[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+$)" at ARGS:Settotzeertnl. [file "/usr/local/apach$e/conf/samane_rules/SpiderLabs-owasp-modsecurity-crs-33612c6/base_rules/modsecurity_crs_41_sql_injection_attacks.conf"] [line "64"] [id "981318"] [rev "2"] [msg "SQL Injection Attack: Commo$ Common Injection Testing Detected"] [data "Matched Data: ' found within ARGS:Settotzeertnl: 'pn "] [severity "CRITICAL"] [ver "OWASP_CRS/2.2.7"] [maturity "9"] [accuracy "8"] [tag "OWASP_CRS/WEB$_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"]Apache-Error: [file "http_filters.c"] [line 262] [level 3] Unknown Transfer-Encoding: wria, referer: http://www.hao8.de/0ehnqceo/segfto2z...f/tmdpe2En.avi
Action: Intercepted (phase 2)
Stopwatch: 1373539328722947 9277 (- - -)
Stopwatch2: 1373539328722947 9277; combined=587, p1=24, p2=549, p3=0, p4=0, p5=13, sr=0, sw=1, l=0, gc=0
Producer: ModSecurity for Apache/2.7.2 (http://www.modsecurity.org/).
Server: Apache/2.2.23 (Unix) mod_ssl/2.2.23 OpenSSL/1.0.0-fips DAV/2 PHP/5.4.12
Engine-Mode: "ENABLED"
--471ea136-Z--
That is just one log entry. For each entry I want to get the data from part B and the accuracy value from part H. My main question is how to tell my shell script to get these data from each entry. As you can see, each entry is delimited only by its A to Z parts, so how do I move on to the next entry each time?
Many thanks
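One way to treat each A to Z block as a separate record is to let awk remember which section it is in and reset its state at every -A- marker. The sketch below assumes the marker lines look exactly like the sample above (a lowercase hex ID followed by one section letter) and that the accuracy value is quoted as in the sample. The function name extract_records is made up for illustration.

```shell
#!/bin/sh
# Sketch: walk an audit log one A..Z entry at a time.
# extract_records is a hypothetical helper name, not part of any tool.
extract_records() {
    awk '
    /^--[0-9a-f]+-[A-Z]--$/ {
        # The section letter sits just before the trailing "--".
        section = substr($0, length($0) - 2, 1)
        if (section == "A") { url = ""; acc = "" }   # new entry: reset state
        if (section == "Z") { print acc "\t" url }   # end of entry: emit it
        next
    }
    # Part B holds the request line; keep the URL after GET.
    section == "B" && $1 == "GET" { url = $2 }
    # Part H holds the Message: line with the [accuracy "N"] field.
    section == "H" && /^Message:/ {
        rest = $0
        if (sub(/.*\[accuracy "/, "", rest)) {
            sub(/[^0-9].*/, "", rest)   # keep only the leading digits
            acc = rest
        }
    }
    ' "$@"
}
```

Calling extract_records Input.log should print one line per entry: the accuracy value, a tab, then the requested URL.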
Hi again dear users. Thank you for your reply. In the for loop you suggested, I want to take the information from part B (the whole string after GET), then search part H and get the "accuracy" field from it. Once I have these values, I want to pass them as input to another program written in C++. This should be done for every log entry that has parts B to H: for each entry I need to extract these data and send them to the C++ program for processing.
You could make it 'faster' by increasing the initial value of i in (i=2;i<=NF;i++),
and it is probably better to use [ and ] as field separators.
That would probably make the fields 'predictable', so there would be no need to loop testing for [accuracy,
but with just one data set I can't tell.
The above should work, though perhaps not as fast as it could.
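A minimal sketch of the bracket-separator idea follows. It still searches for the field by name rather than by position, because the Pattern match text in the sample Message line itself contains [ and ] characters, so the field count may differ between entries. The function name get_accuracy is made up for illustration.

```shell
#!/bin/sh
# Sketch: split Message: lines on "[" and "]" and pick the accuracy field.
# get_accuracy is a hypothetical helper name.
get_accuracy() {
    awk -F'[][]' '
    /^Message:/ {
        for (i = 1; i <= NF; i++) {
            # A matching field looks like: accuracy "8"
            if (index($i, "accuracy \"") == 1) {
                split($i, parts, "\"")   # parts[2] is the text between the quotes
                print parts[2]
                break
            }
        }
    }' "$@"
}
```

It reads the files given as arguments, or stdin when none are given, and prints one accuracy value per Message: line that has one.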
Hi again dear users, and thank you very much for your help. You suggested code like the following for getting the accuracy value and the GET data.
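awk '{
    if ($1 == "GET")
    {
        # use an explicit format: the URL may contain % characters
        printf "%s ", $2
    }
    # only scan the Message: line for the accuracy field
    if ($1 == "Message:")
    {
        for (i=2;i<=NF;i++)
            if ($i == "[accuracy")
            {
                i++;gsub(/[",\]]/,"",$i);
                printf "%s\n", $i;
                break
            }
    }
}' Input.log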
Thank you for this suggestion. My question now is how to pass these data to another program, written in C++, for further processing. My C++ code takes its input like this:
set input(1,10)
set input(2,5)
set input(3,6)
I want to pass the accuracy value to the first statement, set input(1,10): instead of 10 it should be the accuracy value (for example 8). For the second and third statements, I look at the GET part and, based on that information, assign one value to the second input and one to the third. For example, for a given GET string I would pass 9 to the second input and 0 to the third. But I do not know how to send these values to the C++ program in this way. How would that be possible?
Thank you very much for your kind help.
Regards,
Samaneh berenjian
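If the receiving side can take text in the set input(n,value) form shown above, one possible glue step is a small shell filter that turns each line of three numbers into those three statements. This is only a sketch: whether the C++ program accepts such text on stdin is an assumption, and emit_inputs is a made-up name.

```shell
#!/bin/sh
# Sketch: convert lines of "accuracy second third" into set input(...) statements.
# emit_inputs is a hypothetical helper name.
emit_inputs() {
    while read -r acc second third; do
        printf 'set input(1,%s)\nset input(2,%s)\nset input(3,%s)\n' \
            "$acc" "$second" "$third"
    done
}
```

For example, echo '8 9 0' | emit_inputs prints the three statements with 8, 9 and 0 in place of 10, 5 and 6.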
Hi again dear users,
I send these three numbers to a C++ program, which processes them to produce a result (it is a fuzzy logic program). Should I write a program that receives the data extracted by the shell script you wrote?
I think that to identify each log I should write a for loop that picks up the accuracy field, but I do not know exactly how to do that.
Anyway, thank you for your kind help.
If there is anything else you think would be useful for writing this program, I would be grateful for your help.
Thanks a lot,
regards
I determine the other two values myself, based on the information I get from the GET part. I take the GET part and use an if-else statement: if, for example, it is "/inssgtz7ieltdSstbw7e/neQhmsdwu7imdb0etet/eT/hsvegbff/EH/niRAvLwGK_L/osLnBWcHRk5oGMI/tmLJFqSww/sSjS6KRJB.html?Settotzeertnl=%27pn+&8nafitm=74LuKUC5t0J&4ttNe=Anmsyusi6&Mf1g-vYqyx=elTTsw&Euoytxp$", then assign 9 to the second input and 0 to the third input.
Sincerely yours
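The if-else described above could be sketched in plain shell with a case statement. The rules here are placeholders only: the thread never states how a URL maps to the second and third values, so the %27 test (a URL-encoded single quote, which does appear in the sample request) and the default of "0 0" are assumptions, as is the name score_url.

```shell
#!/bin/sh
# Sketch: map a request URL to the second and third input values.
# score_url and both rules are hypothetical; replace them with the real tests.
score_url() {
    case $1 in
        *%27*) echo "9 0" ;;   # assumed rule: URL contains an encoded single quote
        *)     echo "0 0" ;;   # assumed default when no rule matches
    esac
}
```

More patterns can be added as extra case branches, one per kind of request that should get its own pair of values.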
each line having two 'fields'
You can further process that to run your tests on field one...
Hmm, let's change that:
Code:
awk 'BEGIN{Third=0}{
    if ($1 == "GET")
    {
        Get=$2
        # awk variables are case-sensitive, so test Get (not GET)
        if ( Get != "some kind of test , this is just example" )
            Get=9
    }
    # only scan the Message: line for the accuracy field
    if ($1 == "Message:")
    {
        for (i=2;i<=NF;i++)
            if ($i == "[accuracy")
            {
                i++;gsub(/[^[:digit:]]/,"",$i);
                Acc=$i;
                printf "%d %d %d\n",Acc,Get,Third;
                break
            }
    }
}' Input.log > OutPutForCProgInput
The fields are reordered, so accuracy is first and Get is second.
I included a very dumb test that sets Get to 9. Since the third value is always 0, I first just added it to the print, but it is better to assign it in BEGIN (changed in the code now).
So now you just need to get your C++ program to use the OutPutForCProgInput file as input
and if you can get it to use stdin,,
[schneidz@hyper ~]$ egrep -A 1 "(-B--|-H--)" samasara.txt | sed s/^.*accuracy/""/ | sed s/\"\].*$/""/
--471ea136-B--
GET /inssgtz7ieltdSstbw7e/neQhmsdwu7imdb0etet/eT/hsvegbff/EH/niRAvLwGK_L/osLnBWcHRk5oGMI/tmLJFqSww/sSjS6KRJB.html?Settotzeertnl=%27pn+&8nafitm=74LuKUC5t0J&4ttNe=Anmsyusi6&Mf1g-vYqyx=elTTsw&Euoytxp$
--
--471ea136-H--
"8
[schneidz@hyper ~]$ egrep -A 1 "(-B--|-H--)" samasara.txt | sed s/^.*accuracy/""/ | sed s/\"\].*$/""/ | /whatever/floats/your/boat.cxx
Hi again dear users,
I just used this command:
egrep -A 1 "(-B--|-H--)" samasara.txt | sed s/^.*accuracy/""/ | sed s/\"\].*$/""/ > sama.log
and my log file now looks like this (3 log entries):
--a58e7514-B--
GET /6ni.mdb?nozjrinYr=iqr&je=su+HYka%26ttAin+ewgety+object&rrofatiosmdereo=71&@gKna=ihw%3B&rrd6t4aeaet=9 &bodyrhPy=rN7eis7k&c3snNns=e4V4qrS%40%408r&1fhmdroyc=y3V%40hSUm&dhDt5al3tts=743978 HTTP/1.0
--
--a58e7514-H--
"8
--
--a58e7514-B--
GET /nZ1f7@k1Et/tZ/p76wc4jsJwd6hXQY/ulTHiisjxea/aaioitpoqmdsjrcettn/hqKkaspJpN.EFWHKzkF/dFqqKKrktH1/n1pttozE3a41/kn/s3aCv2mlLqJw/i133rAKdeV@H0/IE.js?oeas=eu&3cnsycnsa8=%29ttsd&oo6vT=6tajt&li=0reiK$
--
--a58e7514-H--
"8
--
--a58e7514-B--
GET /i_O6L@tzWZbo8mZ_n0I/muleXmWiahoftehi.css?gMqou6Mdaa=d%27ois%5Dh%3Ec+&GWwindow.openg8Y=219&4XAR_=e&T1v._Vh5jtelnetZr=An7% 259%3A%5DHpatvhl HTTP/1.0
--
--a58e7514-H--
"8
After that I tried to use a program like this:
awk 'BEGIN{Third=0}{
    if ($1 == "GET")
    {
        Get=$2
        # awk variables are case-sensitive, so test Get (not GET)
        if ( Get != "some kind of test , this is just example" )
            Get=9
    }
    # only scan the Message: line for the accuracy field
    if ($1 == "Message:")
    {
        for (i=2;i<=NF;i++)
            if ($i == "[accuracy")
            {
                i++;gsub(/[^[:digit:]]/,"",$i);
                Acc=$i;
                printf "%d %d %d\n",Acc,Get,Third;
                break
            }
    }
}' sama.log > mycplusplus program
Now the problem is: how can I turn this into a program that checks every log entry and does the above for each one? Also, in my C++ program, should I run the script with system(./myscript) to pass the data to the variables in my code?
Thanks,
waiting for your reply
OK, next
If this is your C++ program, why not have it read and process the log files itself?
It will probably be faster than getting awk to do it.
I can probably learn how to do it in C++, but you have a head start...
Third
You only need the one awk, no need for grep and sed.
Code:
awk 'BEGIN{Third=0}{
    if ($1 == "GET")
    {
        Get=$2
        # awk variables are case-sensitive, so test Get (not GET)
        if ( Get != "some kind of test , this is just example" )
            Get=9
    }
    # only scan the Message: line for the accuracy field
    if ($1 == "Message:")
    {
        for (i=2;i<=NF;i++)
            if ($i == "[accuracy")
            {
                i++;gsub(/[^[:digit:]]/,"",$i);
                Acc=$i;
                printf "%d %d %d\n",Acc,Get,Third;
                break
            }
    }
}' /path/to/yourlogs/*.log | FuzzyLogicProg # assumes it will read stdin taking 3 args from each line
A better way of doing it (should also be faster):
Save as ParseLog.awk (use a name that makes sense to you)
Code:
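#!/usr/bin/awk -f
BEGIN{Third=0}{
    if ($1 == "GET") {
        # awk variables are case-sensitive, so test Get (not GET)
        Get=$2; if ( Get != "some kind of test , this is just example" )
            Get=9
    }
    if ($1 == "Message:") {
        # drop everything up to and including [accuracy, then keep the digits of the new first field
        sub(/^Message:.+\[accuracy/,"",$0);
        gsub(/[^[:digit:]]/,"",$1);
        Acc=$1;
        printf "%d %d %d\n",Acc,Get,Third;
    }
}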
Make it executable (chmod +x ParseLog.awk)
and run it:
Code:
./ParseLog.awk /path/to/logs/*.log
Your sample data results in
Code:
8 9 0
how you get that into your program is up to you ( from stdin would be nice )
But please remember, you have only given ONE data set...
You have not defined how you test the 'Get' to end up with 9
or where the third value (0) comes from
So as it stands, the current awk scripts I have posted are useless to you..
UNLESS *YOU* modify them to suit.
And will it matter if you process the same logs over and over?
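If reprocessing does matter, one simple approach is to remember which log files have already been handled. The sketch below keeps a plain list of processed paths in a state file. It assumes logs are complete files that never grow after being processed, which may not hold for a live audit log, and the name process_new_logs and the echo placeholder are made up for illustration.

```shell
#!/bin/sh
# Sketch: handle each *.log file only once by recording its path in a state file.
# process_new_logs is a hypothetical helper name.
process_new_logs() {
    logdir=$1    # directory that holds the .log files
    state=$2     # text file listing paths already processed
    touch "$state"
    for f in "$logdir"/*.log; do
        [ -e "$f" ] || continue               # no matching files at all
        if ! grep -qxF -- "$f" "$state"; then
            echo "$f"                         # placeholder for the real parsing step
            echo "$f" >> "$state"
        fi
    done
}
```

In real use the echo would be replaced by the parsing command, for example ./ParseLog.awk "$f", so that each new file is parsed once and skipped on later runs.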