LinuxQuestions.org
Old 10-08-2019, 08:30 AM   #1
pedropt
Member
 
Registered: Aug 2014
Distribution: Devuan
Posts: 225

Rep: Reputation: Disabled
Search text if some part sequence exists


Hi guys, I don't even know how to word the topic to match what I need.

Here it is: I am writing a definition file for the error messages that appear in my web server logs.

Example of def.conf:

Quote:
ThinkPHP_RCE /module/action/param1 ${@die(md5(HelloThinkPHP))}
In the web server log this line can appear in many forms, but that specific sequence is always there. By this I mean:

Quote:
111.111.111.111 "GET /index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}"
111.111.111.111 "GET /something//module/action/param1 ${@die(md5(HelloThinkPHP))}

Now, when my script searches the definition file, it will use the variable I have from the log, which is:
"/index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}"

and I want the script to identify that this line belongs to "/module/action/param1 ${@die(md5(HelloThinkPHP))}", and then I will retrieve with awk the variable $1, which is ThinkPHP_RCE.

How can I do this?
 
Old 10-09-2019, 03:23 AM   #2
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,160
Blog Entries: 3

Rep: Reputation: 2061
Can you go into a little more detail and give one or two more examples? It sounds like you want to read patterns in from one file and search for them in a second. If that is the case, you might have to escalate to perl to avoid lots of loops in AWK.
 
Old 10-09-2019, 08:58 AM   #3
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 5,272

Rep: Reputation: 1918
I am going to be caned for this, but given
Code:
bash-5.0$ cat def.conf
aa foraa
bb forbb
cc forcc
and
Code:
bash-5.0$ cat def.log
111.111.111.111 "yyforcc"
111.111.111.111 "yyforaa"
111.111.111.111 "yyforbb"
111.111.111.111 "yyforxx"
222.222.222.222 "yyforaa"
333.333.333.333 "yyforbb"
then
Code:
bash-5.0$ awk 'FILENAME=="def.conf" {a[i]=$1;b[i]=$2;i++}; FILENAME!="def.conf" {for(i in b) {if(match($2,b[i])>0) {print a[i]; break} else {if(i==length(b)-1) {print "No match"}}}}' def.conf def.log
cc
aa
bb
No match
aa
bb
This reads the def.conf file into two arrays, then processes the def.log file. The use of length() is a gawk extension.

Last edited by allend; 10-09-2019 at 09:03 AM.
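A note on the one-liner above: it decides when to print "No match" by comparing the loop index with length(b)-1, and that only works if awk happens to visit the array keys in the order they were stored. `for (i in b)` makes no such promise. Below is a rough sketch of the same idea with an explicit `found` flag and a numeric loop instead. It assumes the def.conf and def.log shown above; the scratch directory and the `label_lines` function name are made up for the example.

```shell
#!/bin/sh
# Recreate the sample files from the post in a scratch directory.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/def.conf" <<'EOF'
aa foraa
bb forbb
cc forcc
EOF
cat > "$dir/def.log" <<'EOF'
111.111.111.111 "yyforcc"
111.111.111.111 "yyforaa"
111.111.111.111 "yyforbb"
111.111.111.111 "yyforxx"
222.222.222.222 "yyforaa"
333.333.333.333 "yyforbb"
EOF

# label_lines DEF LOG (hypothetical helper): for each log line print the
# label of the first definition whose pattern matches field 2, or "No match".
label_lines() {
  awk '
    NR == FNR { label[++n] = $1; pat[n] = $2; next }  # first file: definitions
    {
      found = 0
      for (k = 1; k <= n; k++)          # numeric loop, so the order is fixed
        if ($2 ~ pat[k]) { print label[k]; found = 1; break }
      if (!found) print "No match"
    }
  ' "$1" "$2"
}

label_lines "$dir/def.conf" "$dir/def.log"
```

This version does not need length() on an array either, so it should not depend on that extension.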
 
Old 10-09-2019, 10:44 AM   #4
pedropt
Member
 
Registered: Aug 2014
Distribution: Devuan
Posts: 225

Original Poster
Rep: Reputation: Disabled
Thanks, both of you. allend is almost there. The problem is that I cannot rely on the last 2 characters found, because that is not enough and a lot of false positives will appear.
One of the difficulties here is that I have more text in the variable than in the file that will provide the output I want.

If I have a file with definitions like:

1 rttrh/456430/ewrewr/88000
2 3907/weewrerw/2332/ertet

and I have the script search the definition file above with this variable:

blalbalb/rttrh/456430/ewrewr/88000

then I am stuck, because nothing will be found.
Another alternative would be the inverse, which means taking the definitions file line by line and searching for each line in the log. This will work because the search string will be short.

If I search for:
rttrh/456430/ewrewr/88000

in

blalbalb/rttrh/456430/ewrewr/88000

then I will get a positive result, but doing it line by line will waste a lot of resources and time.

Now, one thing that would do the job is removing the text up to the first slash and then searching; if nothing is found, remove the text up to the next forward slash and search again.
This will work, but I will end up doing a lot of searches with no result, which will add to the script's run time.

Last edited by pedropt; 10-09-2019 at 10:51 AM.
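The "strip up to the next slash and retry" idea from the post above need not be slow if the definitions are first loaded into an awk associative array, because each retry is then a single array lookup rather than another pass over the file. A minimal sketch, assuming a definition only counts when it equals the tail of the variable starting right after a slash (or the whole variable). The file name defs.txt and the helper `lookup_suffix` are invented for illustration.

```shell
#!/bin/sh
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/defs.txt" <<'EOF'
1 rttrh/456430/ewrewr/88000
2 3907/weewrerw/2332/ertet
EOF

# lookup_suffix DEFS STRING (hypothetical helper): print the label whose
# pattern equals STRING, or STRING with leading "segment/" pieces removed.
# Prints "No match" otherwise.
# Caveat: awk -v interprets backslash escapes in the value passed in.
lookup_suffix() {
  awk -v s="$2" '
    { label[$2] = $1 }                 # pattern -> label
    END {
      while (1) {
        if (s in label) { print label[s]; exit }
        p = index(s, "/")
        if (p == 0) break              # no slash left, nothing more to drop
        s = substr(s, p + 1)           # drop text up to and including the slash
      }
      print "No match"
    }
  ' "$1"
}

lookup_suffix "$dir/defs.txt" 'blalbalb/rttrh/456430/ewrewr/88000'
```

The number of lookups per request is at most the number of slashes plus one, regardless of how many definitions the file holds.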
 
Old 10-09-2019, 11:15 AM   #5
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 5,272

Rep: Reputation: 1918
Quote:
the problem is that I cannot rely on the last 2 characters found
Que?
Quote:
If I have a file with definitions like:
1 rttrh/456430/ewrewr/88000
2 3907/weewrerw/2332/ertet
This moves the goal posts from your original post.
Quote:
and I have the script search the definition file above with this variable:
blalbalb/rttrh/456430/ewrewr/88000
Excuse me, but where was this in your original post?

Quote:
How can I do this?
At this forum, you get ideas, not complete solutions.
 
1 members found this post helpful.
Old 10-09-2019, 01:13 PM   #6
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=14, FreeBSD_12{.0|.1}
Posts: 5,221
Blog Entries: 11

Rep: Reputation: 3171
Quote:
Originally Posted by pedropt
Thanks, both of you. allend is almost there. The problem is that I cannot rely on the last 2 characters found, because that is not enough and a lot of false positives will appear.
What is missing for us is: Precisely how much of the matched string can you rely on?

Can you provide a clear example of a single match pattern from the def file, along with a few lines which should match and a few which should not? I have tried to see that from your examples already given, but without success.
 
1 members found this post helpful.
Old 10-09-2019, 03:11 PM   #7
pedropt
Member
 
Registered: Aug 2014
Distribution: Devuan
Posts: 225

Original Poster
Rep: Reputation: Disabled
allend, look, I did not move away from my original post, I just gave another example.

Quote:
Now, when my script searches the definition file, it will use the variable I have from the log, which is:
"/index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}"
Quote:
In the web server log this line can appear in many forms, but that specific sequence is always there. By this I mean:

Quote:
111.111.111.111 "GET /index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}"
The definition file, which is where it will search, will be:
Quote:
ThinkPHP_RCE /module/action/param1 ${@die(md5(HelloThinkPHP))}
which is exactly the same as:

Quote:
If I have a file with definitions like:

1 rttrh/456430/ewrewr/88000
2 3907/weewrerw/2332/ertet

and I have the script search the definition file above with this variable:

blalbalb/rttrh/456430/ewrewr/88000

then I am stuck, because nothing will be found.
The definitions file is simply a file where I will store all the strings to compare against.
The IP address in the first post was just an example. Of course I will not send the IP address to grep; I will send only what I need to search for.

-------------------------------------------------------------------

astrogeek, you are right, I thought about that before I made my last post.

From this example:
Quote:
/index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}
I can leave out "/index.php" because that file name changes, so I will rely only on
Quote:
/module/action/param1 ${@die(md5(HelloThinkPHP))}
Now, what I need is not how to turn the 1st line into the second line by removing everything up to the 2nd forward slash.
What I need is the fastest way to look through a big file for that combination.
I usually use grep, but for large files it might be worth using something a little faster.

However, I have an issue here: every line is different, so this cannot be applied as one single rule.
I have lines in the log like this:
/HNAP1

which is an information disclosure against D-Link routers (I believe). In this case I cannot remove the text up to the 1st forward slash.
Thinking about it a bit more, what I really need is to see whether the beginning of the variable is a file or a directory.

directory = /something
file = /something.php/OTHERSTUFF

If it is a file, I will remove everything up to the 2nd forward slash and use the rest; otherwise I will use the complete variable.

Basically, what I need is a quick way to search.

Last edited by pedropt; 10-09-2019 at 03:54 PM.
 
Old 10-09-2019, 04:34 PM   #8
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=14, FreeBSD_12{.0|.1}
Posts: 5,221
Blog Entries: 11

Rep: Reputation: 3171
That still does not specify how much you can rely on very precisely.

Question: Can you rely on there always being a string matching /module/action/param1 in every line you want to search for?

Quote:
Originally Posted by pedropt
However, I have an issue here: every line is different, so this cannot be applied as one single rule.
I have lines in the log like this:
/HNAP1
But you have not said what you want to do in these cases. Ignore the line? Search for the line? What?

UPDATE: Think in terms of your thread title:

Quote:
Search text if some part sequence exists
Define for us exactly the part sequence which exists and is used to trigger the search.

Last edited by astrogeek; 10-09-2019 at 04:45 PM. Reason: Updated
 
Old 10-09-2019, 06:07 PM   #9
pedropt
Member
 
Registered: Aug 2014
Distribution: Devuan
Posts: 225

Original Poster
Rep: Reputation: Disabled
Thanks for the reply, astrogeek, and thanks to everyone else here trying to help and to understand what the heck I need.

Well, first let me post some real log examples of the exploit attempts that anyone gets on their servers.

Let's call it Server.log
Quote:
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /wp-content/plugins/portabl e-phpmyadmin/wp-pma-mod/index.php
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /HNAP1/
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /prov/aastra.cfg
xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /f4bb336d/admin.php
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /cmdd.php HTTP/1.1"
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /App/?content=die(md5(HelloThinkPHP))
xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /editBlackAndWhiteList
xxx.xxx.xxx.xxx - [08/Oct +0100] "GET /0015650000000.cfg
Now, the definition file contains a sequence that may or may not be equal to what I have in the log. This is because attackers normally use automated scripts that run through a list of potential directories.
For me this means that I don't need to write every line in the definitions file. I just need to write one line that I know they will use for sure; this way I can identify the technique used.

In the quote above there are multiple exploits they have tried, but before it starts digging through the definitions file for what they were after, my script must first identify what kind of request was made to the server.

From the quote above, this is what the script must search for:

Line 1 = /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/
Line 2 = /HNAP1/
Line 3 = /prov/aastra.cfg
Line 4 = /f4bb336d/
Line 5 = Ignore
Line 6 = /module/action/param1/${@die(md5(HelloThinkPHP))}
Line 7 = /App/?content=die(md5(HelloThinkPHP))
Line 8 = /editBlackAndWhiteList
Line 9 = /0015650000000.cfg

How it should work in code:

If the last part of the variable is a file and it is a .php file, then remove that file name and search.
If it has no file at the beginning or end, then search (Line 2).
If a file name follows a directory but it is not a .php file, then search without removing anything.
If it starts with a file name other than a .php file, then search for the whole thing.

In summary:

- a) Detect whether "anything.php" exists at the beginning or at the end of the variable and remove it.
- b) If a) applies, then do the removal and search.
- c) If a) does not apply, then search as is.

Now, what matters most in the code is a fast search.

Last edited by pedropt; 10-09-2019 at 06:09 PM.
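For what it is worth, the "anything.php" rule in the post above can be written as two substitutions in awk. This is only a sketch under some assumptions: the request path has already been cut out of the log line, ".php" is matched in lower case only, and a path that shrinks to a bare "/" is reported as "Ignore" (a guess at how Line 5 might be handled, not something the post states). The helper name `normalize_path` is made up.

```shell
#!/bin/sh
# normalize_path PATH (hypothetical helper): apply the two ".php" rules
# to a single request path and print the string to search for.
normalize_path() {
  printf '%s\n' "$1" | awk '
    function normalize(p) {
      sub("^/[^/]*[.]php/", "/", p)   # leading  "/name.php/" becomes "/"
      sub("/[^/]*[.]php$", "/", p)    # trailing "/name.php"  becomes "/"
      return p
    }
    {
      p = normalize($0)
      print (p == "/" ? "Ignore" : p) # assumption: nothing left to search for
    }
  '
}

normalize_path '/f4bb336d/admin.php'
normalize_path '/index.php/module/action/param1/${@die(md5(HelloThinkPHP))}'
normalize_path '/cmdd.php'
```

Paths without a .php file at either end, such as /HNAP1/ or /prov/aastra.cfg, pass through unchanged.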
 
Old 10-09-2019, 06:58 PM   #10
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=14, FreeBSD_12{.0|.1}
Posts: 5,221
Blog Entries: 11

Rep: Reputation: 3171
Looks a lot like you are reinventing modsecurity...

Your line 5 case seems at odds with your rule "anything.php = remove and search" as stated. How would the script know to ignore it?

What do you want to get as the final output? The lines from the log or simply a count of the matching lines?

How do you intend to use this? Near real time as lines are added to the logs? Once per day/week to extract stats? For reporting purposes or blocking purposes?

There is a lot of relevant info we do not have.

At the very least I think that you have the problem, as stated so far, backwards - instead of searching the logs, mangling the lines then searching the definitions for a match with the mangle, simply search the logs for matches to the second part of the definitions one definition at a time, replace matches, skip others.

That said, I don't think your problem is yet well enough defined as indicated by the line 5 mismatch, and I would suggest looking at a rule set for modsecurity to see what is actually involved in matching common exploits by regular expression.

Last edited by astrogeek; 10-09-2019 at 07:36 PM.
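One possible way to try the "search the logs for the definitions" direction suggested above, without mangling any log lines, is fixed-string grep with a pattern file, which tests all definitions in a single pass. A sketch only: it assumes each definition line is "label, one space, pattern", and the label HNAP and the file names are invented (ThinkPHP_RCE and the log lines are taken from earlier posts).

```shell
#!/bin/sh
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/server.conf" <<'EOF'
HNAP /HNAP1/
ThinkPHP_RCE /module/action/param1/${@die(md5(HelloThinkPHP))}
EOF
cat > "$dir/server.log" <<'EOF'
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /HNAP1/
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /cmdd.php HTTP/1.1"
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
EOF

# Keep only the pattern column (everything after the first space).
cut -d' ' -f2- "$dir/server.conf" > "$dir/patterns.txt"

# -F: patterns are literal strings, so "$", "(" and "?" need no escaping.
# -f: read one pattern per line; a log line is printed if any pattern occurs in it.
grep -F -f "$dir/patterns.txt" "$dir/server.log"

# -v inverts the test: log lines that no definition covers yet.
grep -F -v -f "$dir/patterns.txt" "$dir/server.log"
```

This only says which lines match, not which label matched, so it would serve as a fast pre-filter in front of a labelling step rather than a replacement for one. An empty line in the pattern file would match every log line, so the definitions file should not contain blank lines.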
 
Old 10-10-2019, 06:46 AM   #11
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware-current
Posts: 5,272

Rep: Reputation: 1918
Given server.log (taking out the space between the l and e in portable in what was posted)
Quote:
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /HNAP1/
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /prov/aastra.cfg
xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /f4bb336d/admin.php
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /cmdd.php HTTP/1.1"
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /App/?content=die(md5(HelloThinkPHP))
xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /editBlackAndWhiteList
xxx.xxx.xxx.xxx - [08/Oct +0100] "GET /0015650000000.cfg
and server.conf (escaping the characters used in creating regular expressions)
Quote:
aa /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/
bb /HNAP1/
cc /prov/aastra.cfg
dd /f4bb336d/
ee /module/action/param1/\${@die\(md5\(HelloThinkPHP\)\)}
ff /App/\?content=die\(md5\(HelloThinkPHP\)\)
gg /editBlackAndWhiteList
hh /0015650000000.cfg
and server.awk
Code:
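# First file (server.conf): store labels in a[] and patterns in b[].
FILENAME=="server.conf" {a[i]=$1;b[i]=$2;i++};
# Second file (server.log): test the request path in field 6 against each pattern.
FILENAME!="server.conf" {
  for(i in b) {
    if(match($6,b[i])>0) {
      print a[i] " Found " b[i] " in " $0;
      break}   # stop at the first matching definition
  };
}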
then
Code:
bash-5.0$ awk -f server.awk server.conf server.log
aa Found /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/ in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php
bb Found /HNAP1/ in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /HNAP1/
cc Found /prov/aastra.cfg in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /prov/aastra.cfg
dd Found /f4bb336d/ in xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /f4bb336d/admin.php
ee Found /module/action/param1/\${@die\(md5\(HelloThinkPHP\)\)} in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
ff Found /App/\?content=die\(md5\(HelloThinkPHP\)\) in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /App/?content=die(md5(HelloThinkPHP))
gg Found /editBlackAndWhiteList in xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /editBlackAndWhiteList
hh Found /0015650000000.cfg in xxx.xxx.xxx.xxx - [08/Oct +0100] "GET /0015650000000.cfg

Last edited by allend; 10-10-2019 at 06:49 AM.
 
1 members found this post helpful.
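The backslashes in server.conf above are needed because match() treats the second column as a regular expression. Note also that an unescaped "." (as in /prov/aastra.cfg) matches any character. If the definitions are always meant literally, awk's index() could be used instead, and the definitions can then be stored exactly as they appear in the log. A hedged sketch along those lines follows; the function name `label_requests` and the last log line (aastraXcfg) are invented, the latter only to show that the dot is no longer a wildcard.

```shell
#!/bin/sh
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/server.conf" <<'EOF'
cc /prov/aastra.cfg
ee /module/action/param1/${@die(md5(HelloThinkPHP))}
ff /App/?content=die(md5(HelloThinkPHP))
EOF
cat > "$dir/server.log" <<'EOF'
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /prov/aastra.cfg
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /App/?content=die(md5(HelloThinkPHP))
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /prov/aastraXcfg
EOF

# label_requests DEFS LOG (hypothetical helper): same report as server.awk,
# but the pattern is compared as a plain substring of field 6.
label_requests() {
  awk '
    NR == FNR { label[++n] = $1; pat[n] = $2; next }  # definitions file
    {
      for (k = 1; k <= n; k++)
        if (index($6, pat[k]) > 0) {                  # literal substring test
          print label[k] " Found " pat[k] " in " $0
          break
        }
    }
  ' "$1" "$2"
}

label_requests "$dir/server.conf" "$dir/server.log"
```

The trade-off is that the definitions can no longer use regular expression features such as alternation, so this only fits if every definition is a fixed string.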
Old 10-11-2019, 11:32 AM   #12
pedropt
Member
 
Registered: Aug 2014
Distribution: Devuan
Posts: 225

Original Poster
Rep: Reputation: Disabled
Nice code, allend. I will probably have to adjust it and remove the loop.


The comparison will not be made directly against the server log. Before a line reaches your code, I will reduce

Quote:
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php
so that the variable holds only:
Quote:
/wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php
This will be done IP by IP. That is, I will first choose the IP, and then your code will identify what that IP was doing on the server. After your code I will add some more code so that, if nothing was found in the definitions file (server.conf), it asks me to add a new line to the definitions file for future detection.
Great code indeed, I was not expecting it to be so simple.
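The per-IP workflow described above could be sketched roughly as follows: filter the log by one address, label each request, and mark requests that no definition covers so they can be reviewed and added later. The addresses (taken from the 192.0.2.0/24 documentation range), the helper name `report_ip` and the "UNKNOWN" marker are all assumptions for the example. The interactive "ask me to add a new line" step is left out.

```shell
#!/bin/sh
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/server.conf" <<'EOF'
bb /HNAP1/
dd /f4bb336d/
EOF
cat > "$dir/server.log" <<'EOF'
192.0.2.10 - [09/Oct +0100] "GET /HNAP1/
192.0.2.20 - [09/Oct +0100] "POST /f4bb336d/admin.php
192.0.2.10 - [09/Oct +0100] "POST /editBlackAndWhiteList
EOF

# report_ip IP DEFS LOG (hypothetical helper): for each request from IP
# print "label path", or "UNKNOWN path" when no definition matches.
report_ip() {
  awk -v ip="$1" '
    NR == FNR { label[++n] = $1; pat[n] = $2; next }  # definitions file
    $1 != ip  { next }                                # other addresses: skip
    {
      hit = "UNKNOWN"
      for (k = 1; k <= n; k++)
        if (index($6, pat[k]) > 0) { hit = label[k]; break }
      print hit, $6
    }
  ' "$2" "$3"
}

report_ip 192.0.2.10 "$dir/server.conf" "$dir/server.log"
```

Lines marked UNKNOWN are the candidates for new entries in server.conf.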
 
  

