This is a different flow
it checks every log line and reports the exploit + the IP
no cats echos seds awks
just bash ( and printf )
Code:
#!/bin/bash
unset Patterns
declare -A Patterns
while read -r -a RE
do
Patterns[${RE[0]}]+="${RE[1]}|"
done < definition.conf
# definition.conf "exploit_type regular_expression"
# NOTE they are real regular_expressions not strings !
# Patterns is an associative array, indexed by exploit_type
# regular_expression are appended, with "or" |
# the trailing | is stripped later ${Patterns[$i]%|}
# One day I'll think of a better way :D
# LogFormat "%h\t%l\t%u\t%t\t\"%r\"\t%>s\t%b" common
# should also fix the awful default date format ;)
while IFS=$'\t' read -r -a Log_line
# [0]="IP" [1]="RFC 1413 aka junk" [2]="userid" [3]="date/time" [4]="request line"
# [5]="server status code" [6]="object size"
do
# ${!Patterns[@]} produces a list of the indexes (indices?)
for exploit in "${!Patterns[@]}"
do
[[ ${Log_line[4]} =~ ${Patterns[${exploit}]%|} ]] \
&& printf "Exploit %s detected via IP %s\n" \
"${exploit}" \
"${Log_line[0]}" \
&& break # we break as no need to carry on checking
done
done < serverlog.tmp
I hope that inspires you
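For anyone who wants to try the idea without a real log, here is a minimal self-contained sketch of the same flow. The sample definitions, the sample log line and the function name detect are made up for illustration; it assumes the tab-separated LogFormat described in the comments above.

```shell
#!/bin/bash
# Hypothetical sample data, for illustration only.
definitions=$'thinkphp /thinkphp/\nthinkphp invokefunction\nphpmyadmin /phpmyadmin/'
logline=$'203.0.113.7\t-\t-\t[21/Oct/2019:02:50:00 +0000]\t"GET /phpmyadmin/index.php HTTP/1.1"\t404\t196'

declare -A Patterns
while read -r name re
do
    # join several expressions for the same exploit type with "|"
    Patterns[$name]+="${Patterns[$name]:+|}$re"
done <<< "$definitions"

# detect: split one tab-separated log line and test its request field
# against every exploit type; stop at the first match.
detect () {
    local -a Log_line
    local exploit
    IFS=$'\t' read -r -a Log_line <<< "$1"
    for exploit in "${!Patterns[@]}"
    do
        if [[ ${Log_line[4]} =~ ${Patterns[$exploit]} ]]
        then
            printf 'Exploit %s detected via IP %s\n' "$exploit" "${Log_line[0]}"
            return 0
        fi
    done
    return 1
}

detect "$logline"
```

The function returns non-zero when nothing matches, so a caller can also use it as a plain test.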
edit
and this will likely blow your mind
Code:
done < <( tail -F serverlog.tmp )
yes, it will now run forever........
edit2
cleaner appending of the REs
Code:
unset Patterns
declare -A Patterns
while read -r -a RE
do
# prefix "|" only when this exploit_type already holds a pattern
Patterns[${RE[0]}]+="${Patterns[${RE[0]}]:+|}${RE[1]}"
done < definition.conf
# definition.conf "exploit_type regular_expression"
# Patterns is an associative array, indexed by exploit_type
# regular_expression are appended, prefixed | ( RE or )
<snip>
# no need to remove trailing | , we don't have that anymore
#[[ ${Log_line[4]} =~ ${Patterns[${exploit}]%|} ]]
[[ ${Log_line[4]} =~ ${Patterns[${exploit}]} ]] \
I haven't bothered to look what this code does, but want to take the opportunity to demonstrate some capabilities of the shell that you don't find in books:
Code:
grep -h "404" $loghttp $loghttps |
cut -d ':' -f2- | awk '{print$1}' |
if [[ -z "$shd" ]]
then
sort -u
else
sort -u | sed -e "/$shd/d"
fi > iptemp2.tmp
if, while, until, for and case all start code blocks whose stdin, stdout and stderr you can redirect. The only special thing is that a pipe | into a code block forces it into a subshell.
You can also write an explicit code block in braces { }, where the syntax requires the closing } to come after a newline or a semicolon.
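A small sketch of those three points, using a temporary file and made-up variable names (out, count, piped) purely for illustration:

```shell
#!/bin/bash
out=$(mktemp)

# One redirection covers every command inside the braces.
{
    echo "first"
    echo "second"
} > "$out"

# A loop can take its stdin from a file in the same way.
count=0
while read -r line
do
    count=$((count + 1))
done < "$out"

# Piping into a loop runs the loop in a subshell (bash default, without
# "shopt -s lastpipe"), so the parent's variable is not updated.
piped=0
printf '%s\n' a b c | while read -r line
do
    piped=$((piped + 1))
done

echo "count=$count piped=$piped"
rm -f "$out"
```

The redirected loop updates count in the current shell, while the piped loop only changes its own copy of piped.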
they are filtering for 404 , cutting and awking the cut, presumably for the IP
Code:
awk '/404/{sub(/:.*/,"",$2);print $2}'
the field number might be incorrect, I really haven't bothered looking at what that sloppy code is doing
but it is completely redundant anyway
I can quickly hack in a 404 filter
Code:
while IFS=$'\t' read -r -a Log_line
# [0]="IP" [1]="RFC 1413 aka junk" [2]="userid" [3]="date/time" [4]="request line"
# [5]="server status code" [6]="object size"
do
[[ ${Log_line[5]} != 404 ]] && continue
for exploit in ${!Patterns[@]}
do
[[ ${Log_line[4]} =~ ${Patterns[${exploit}]%|} ]] \
&& printf "Exploit %s detected via IP %s\n" \
${exploit} \
${Log_line[0]} \
&& break # we break as no need to carry on checking
done
done < serverlog.tmp
or have a case
Code:
case ${Log_line[5]} in
404) dummy=yes;;
*) continue;;
esac
Last edited by Firerat; 10-21-2019 at 02:50 AM.
Reason: ${Log_line[5]} not ${Log_line[4]} from my copypasta
This is probably easier to understand
I have removed one of the arrays in favour of vars
However, an array would make more sense if you have different logs with differing numbers of fields
Code:
#!/bin/bash
unset Patterns
declare -A Patterns
while read -r -a RE
do
# prefix "|" only when this exploit_type already holds a pattern
Patterns[${RE[0]}]+="${Patterns[${RE[0]}]:+|}${RE[1]}"
done < definition.conf
# definition.conf "exploit_type regular_expression"
# Patterns is an associative array, indexed by exploit_type
# regular_expression are appended, prefixed | ( RE or )
ExploitCheck () {
for exploit in ${!Patterns[@]}
do
[[ ${Request} =~ ${Patterns[$exploit]} ]] \
&& printf "Exploit %s detected via IP %s\n" \
${exploit} \
${IP} \
&& break
done
}
# LogFormat "%h\t%l\t%u\t%t\t\"%r\"\t%>s\t%b" common
# should also fix the awful default date format ;)
while IFS=$'\t' read -r IP RFC1413 UserID Time Request StatusCode ObjectSize
do
case ${StatusCode} in
404) ExploitCheck
;;
esac
done < serverlog.tmp
#done < <( tail -F serverlog.tmp )
I haven't bothered to look what this code does, but want to take the opportunity to demonstrate some capabilities of the shell that you don't find in books:
grep -h "404" $loghttp $loghttps |
cut -d ':' -f2- | awk '{print$1}' |
if [[ -z "$shd" ]]
then
sort -u
else
sort -u | sed -e "/$shd/d"
fi > iptemp2.tmp
It does not work properly, even though it should: instead of grabbing an IP it grabs the date?!
Anyway, don't bother with this one, I like it the way it was originally.
Basically you both changed the code from your own perspective, so that right now some of the code doesn't work or does something different.
Well, I figured out one thing.
From the original code
Code:
intdet () {
IP=$1
while read LOG
do
for P in "${Patterns[@]}"
do
[[ ${LOG} =~ "${P#* }" ]] && intr=${P% *}
done
done < ipres
return $?
}
Patterns=()
while read foo
do
Patterns+=("${foo}")
done < /def/definition.conf
while read IP
do
intdet $varsel
done <<< ${varsel}
I only need
Code:
intdet () {
while read LOG
do
for P in "${Patterns[@]}"
do
[[ ${LOG} =~ "${P#* }" ]] && intr=${P% *}
done
done < ipres
return $?
}
Patterns=()
while read foo
do
Patterns+=("${foo}")
done < /def/definition.conf
intdet
Anyway, please don't bother anymore. The original thread question was solved; this is just an expansion of it that should never have happened.
Thank you all
By the way, I fixed the issue: I just added an extra line of gibberish to definition.conf, and now the read command reads line 25. Somehow it does not read the last line, so I always have to add something on the last line that does not match anything at all.
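A likely explanation, assuming definition.conf has no newline after its last line: read still stores that text, but it returns non-zero at end of file, so the loop body is skipped for it. The sketch below reproduces this with a temporary file (the file contents and the array names plain and safe are invented for the demonstration) and shows a common guard that avoids the need for a dummy last line.

```shell
#!/bin/bash
conf=$(mktemp)
printf 'first pattern1\nlast pattern2' > "$conf"   # no trailing newline

# Plain loop: the unterminated last line is dropped.
plain=()
while read -r foo
do
    plain+=("$foo")
done < "$conf"

# Guarded loop: also process whatever read stored before hitting EOF.
safe=()
while read -r foo || [[ -n $foo ]]
do
    safe+=("$foo")
done < "$conf"

echo "plain loop: ${#plain[@]} line(s), guarded loop: ${#safe[@]} line(s)"
rm -f "$conf"
```

Making sure the editor writes a final newline to the file has the same effect.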
This is probably easier to understand
I have removed one of the arrays in favour of vars
However, an array would make more sense if you have different logs with differing numbers of fields
as a way of explaining ( justifying ) why I favour array over list of vars
It makes checking easier
Code:
#!/bin/bash
list_one () {
cat << EOF
zero one two three four
zero one two three four
EOF
}
list_two () {
cat << EOF
zero one two three
zero one two three four
zero one two three four five
EOF
}
Vars () {
echo Vars $1
LIST=$1
while read zero one two three four
do
printf "%s_%s_%s_%s_%s\n" zero=$zero one=$one two=$two three=$three four=$four
done < <( $LIST )
echo
}
Array () {
echo Array $1
LIST=$1
while read -a list
do
printf "_%s_%s_%s_%s_%s" zero=${list[0]} one=${list[1]} two=${list[2]} three=${list[3]} four=${list[4]}
[[ ${#list[@]} != 5 ]] && printf "\t%s\n" "Warning fields incorrect" || printf "\n"
done < <( $LIST )
echo
}
Vars list_one
Vars list_two
Array list_one
Array list_two
you can of course check each $zero $one ... $four is not empty
but simply counting number of array elements is easier
and if you are routinely looking at two logs, one with 5 fields and one with 7 you can count elements and act accordingly ( you should also confirm the extra fields are what is expected and not corruption of the input data )
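As a rough sketch of "count elements and act accordingly": the function name classify and the meaning given to the 5-field and 7-field layouts are assumptions for illustration, not taken from any real log format in this thread.

```shell
#!/bin/bash
# classify is a hypothetical helper; the 5- and 7-field layouts are assumed.
classify () {
    local -a fields
    read -r -a fields <<< "$1"
    case ${#fields[@]} in
        5) echo "short format, last=${fields[4]}" ;;
        7) echo "long format, status=${fields[5]}" ;;
        *) echo "Warning fields incorrect (${#fields[@]})" ;;
    esac
}

classify "zero one two three four"
classify "a b c d e 404 g"
classify "zero one two three"
```

A single case on ${#fields[@]} replaces a chain of "is this variable empty" tests, and the default branch catches corrupted input.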
I will probably run some tests on all the code you posted here, but for now your first approach (as I explained in my last post) is working like a charm. I have to admit, you did a very good job on it.
Always needing a last line in the definitions file that does not influence the matches, just to make it work, is not a problem: if I need to add a new definition, I remove the last line with sed and then append the new definition followed by the extra line.
Everything is fine. The more you simplify the code or try different approaches, the harder it is for me to understand later. I don't write bash every day; sometimes months pass before I write something in bash, which means I will forget your more complex code and how it works.
If I keep it simple and use your first approach, then when I open this script to reuse some code in another script I will understand how it is done.
One thing you can be sure of: I am grateful for your work on this thread. I will save all the code you wrote to a file, and when I get some time I will try your different approaches in debug mode so I can see what the code is doing.
By the way, I had to make some adjustments to this script and find an alternative for cases where this routine does not find a match even though one exists.
The new routine counts the number of characters in the line to search, and then uses only a fragment of it to look in the patterns.
If the new routine does not find anything, the code you presented here runs instead.
I created a new script just for this job instead of adding the code to my main script.
Here is the code now (I am still adjusting it to my needs and to each case that comes up).
I had to make this adjustment because it was not capturing some complicated lines with special characters in them.
Code:
#!/bin/bash
#file to check patterns
comp=$1
#Definition File
defile=$2
idptrn () {
while read LOG
do
for P in "${Patterns[@]}"
do
[[ ${LOG} =~ ${P#* } ]] && echo ${P% *}
#echo "$LOG -> $P"
done
done < "$comp"
#echo "executed $0 $comp $defile"
return $?
}
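# NOTE: the posted version had a stray "done" and called splt from inside
# itself. The nwrtn loop below (read each line of $comp into rdnmb, set cnt
# to its length) is an assumed reconstruction, not the poster's exact code.
splt () {
if [[ "$cnt" -gt 15 && "$cnt" -lt 30 ]]
then
gtpt=$(echo "$rdnmb" | cut -c1-8)
a1=$(grep "$gtpt" "$defile" | awk '{print $1}')
if [[ -n "$a1" ]]
then
echo "$a1"
exit 0
fi
elif [[ "$cnt" -gt 31 ]]
then
gtpt=$(echo "$rdnmb" | cut -c1-15)
a1=$(grep "$gtpt" "$defile" | awk '{print $1}')
if [[ -n "$a1" ]]
then
echo "$a1"
exit 0
fi
fi
}
nwrtn () {
while read -r rdnmb
do
cnt=${#rdnmb}
splt
done < "$comp"
}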
help () {
clear
echo "Pattern ID 1.0"
echo "---------------------------"
echo "example : $0 file_to_search file_with_patterns"
echo "example : $0 /mylog /definition.conf"
echo "----------------------------"
exit 0
}
if [[ -z "$comp" ]]
then
help
elif [[ ! -f "$comp" ]]
then
echo "File to compare not found in : $comp"
exit 0
elif [[ -z "$defile" ]]
then
help
elif [[ ! -f "$defile" ]]
then
echo "Definition file not found in : $defile"
exit 0
fi
nwrtn
Patterns=()
while read foo
do
Patterns+=("${foo}")
done < "$defile"
idptrn
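The character count and the fragment extraction described in that post can also be done with parameter expansion alone, without echo and cut. This is only a sketch: fragment is a hypothetical helper, and the thresholds are copied from the script above (16 to 29 characters gives the first 8, more than 31 gives the first 15).

```shell
#!/bin/bash
# fragment is a hypothetical helper mirroring the thresholds in the post.
fragment () {
    local line=$1
    local cnt=${#line}              # length of the line in characters
    if (( cnt > 15 && cnt < 30 ))
    then
        printf '%s\n' "${line:0:8}"     # same as cut -c1-8
    elif (( cnt > 31 ))
    then
        printf '%s\n' "${line:0:15}"    # same as cut -c1-15
    fi
}

fragment "GET /phpmyadmin/index"
fragment "GET /thinkphp/html/public/index.php HTTP/1.1"
```

Lines of 15 characters or fewer, and of exactly 30 or 31, produce no fragment, which matches the gaps in the original conditions.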
Wow, I think this problem was made more difficult than it needed to be. Assuming the requirements are the same as stated in the OP, then the following should do what you need.
Code:
#!/usr/bin/env bash
set -eo pipefail
IFS=$'\n\t'
definitionFile='definitions.conf'
serverFile='server.log'
# find all ip addresses in the server log into an array.
mapfile -t ips < <(grep -Po '\d{1,3}(?:\.\d{1,3}){3}' "$serverFile" | sort -u)
# generate a simple menu using nl to number the lines.
nl -n ln <<< "${ips[*]}"
# ask the user to input a number from 1 to $NUMBER_OF_IPS
read -p "Enter a number from 1 to ${#ips[@]}: " -r choice
# do some input validation.
if [[ $choice -lt 1 ]] || [[ $choice -gt "${#ips[@]}" ]]; then
echo "Invalid choice"
exit
fi
# select the ip address that corresponds with choice.
ip="${ips[$((choice - 1))]}"
# harness some grep magic to locate lines with the ip address and pipe that to a second grep
# which will try to match the signature field in the definitions file, e.g. /thinkphp/html/public/index.php.
# the -f flag tells grep to search in a file for patterns. the file in this case is a redirection from
# the cut command.
# the -o flag tells grep to only print the matching portion of the string, which is the signature.
# '|| true' keeps set -e from aborting the script when grep finds nothing.
matched="$(grep -F "$ip" "$serverFile" | grep -o -f <(cut -d' ' -f2- "$definitionFile") || true)"
# if we found a match for the chosen ip address AND an exploit signature, find the key for the signature
# by locating it in the definitions file.
if [[ -n "$matched" ]]; then
echo "Intrusion detected from $ip was : $(grep -Fm1 "$matched" "$definitionFile" | cut -d' ' -f1)"
else
echo 'No results were found'
fi
P.S. If you're going to be writing Bash scripts often, please devote some time to studying best practices. It will really help you.
P.P.S. Don't forget to use shellcheck to help you catch mistakes in your scripts.
Last edited by individual; 01-14-2020 at 08:25 AM.