search file and extract lines matching array and within date range
Linux - General: This Linux forum is for general Linux questions and discussion. If it is Linux-related and doesn't seem to fit in any other forum, then this is the place.
Some improvement proposals:
1. grep takes newline-separated patterns, so a single variable holding one date per line works as a multi-pattern match.
2. It's more efficient to redirect the whole loop; it also gives you the choice between done > filename and done >> filename.
3. < filename ... is more efficient than cat filename | ...
4. cd once, and make sure it was successful!
5. Quotes around "${var[@]}" and "$@" keep the elements intact while preventing further word splitting and glob expansion.
Code:
last7days=$(
date +%Y-%m-%d -d "7 day ago"
date +%Y-%m-%d -d "6 day ago"
date +%Y-%m-%d -d "5 day ago"
date +%Y-%m-%d -d "4 day ago"
date +%Y-%m-%d -d "3 day ago"
date +%Y-%m-%d -d "2 day ago"
date +%Y-%m-%d -d "1 day ago"
date "+%Y-%m-%d"
)
...
cd "$folder" || exit
for val in "${StringArray[@]}"
do
    for filename in ./*_"$timestamp".xml
    do
        ### Look for each element of StringArray in the file; if found,
        ### a second grep keeps only lines stamped within the last 7 days
        ### (last7days is a newline-separated list of patterns).
        ## 7-day capture
        < "$filename" grep -A1 "$val" | grep -B2 "$last7days"
    done
done >> "$outfile"
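As an aside, the eight near-identical date commands above can also be generated with a loop; a minimal sketch, assuming GNU date (for the -d option):

```shell
#!/bin/bash
# Same newline-separated list of the last 8 days, oldest first,
# built with a loop instead of eight hand-written commands.
last7days=$(
    for i in 7 6 5 4 3 2 1 0; do
        date +%Y-%m-%d -d "$i day ago"
    done
)
printf '%s\n' "$last7days"
```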
Just for fun, a solution that uses gawk so that the 'mktime' function can be used to handle time stamps with greater flexibility. Also it saves multiple reads of the log files for each user.
Code:
#!/bin/bash
## Date setup for feeding into the process
start_date=$(date +%Y-%m-%d -d "7 day ago")
start_time="00:00:00"
end_date=$(date "+%Y-%m-%d")
end_time="23:59:59"
prevday=$(date -d yesterday '+%Y%m%d')
## remove the previous day's file - housekeeping to reduce file buildup
#rm /tmp/svnaudit/svnaudit_"$prevday".txt
timestamp=$(date +%Y%m%d)
#### create file/dir variables
#folder=/mnt/midtier_logs/report/Audit
#outfile=/tmp/svnaudit/svnaudit_$timestamp.txt
outfile="outfile.txt"
#emailfile=/tmp/svnaudit/svnaudit_emailme.txt
#rm "$emailfile"
#cd "$folder" || exit
for filename in ./*_"$timestamp".xml; do
    gawk -v start_date="$start_date" -v start_time="$start_time" \
         -v end_date="$end_date" -v end_time="$end_time" '
    BEGIN {
        FIELDWIDTHS = "6:10 1:8"
        split(start_date, d, "-")
        split(start_time, t, ":")
        st = mktime(d[1]" "d[2]" "d[3]" "t[1]" "t[2]" "t[3])
        split(end_date, d, "-")
        split(end_time, t, ":")
        et = mktime(d[1]" "d[2]" "d[3]" "t[1]" "t[2]" "t[3])
    }
    /revision=/ { rev = $0 }
    /user1|user12|user30|dev1|dev5|dev25|dev15|dev4/ { aut = $0 }
    /[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ {
        split($1, d, "-")
        split($2, t, ":")
        ct = mktime(d[1]" "d[2]" "d[3]" "t[1]" "t[2]" "t[3])
        if (ct > st && ct < et) {
            print rev
            print aut
            print
        }
    }' "$filename" >> "$outfile"
done
Notes:
1. I have used the FIELDWIDTHS variable to make it easy to isolate the date and time, as that line appears to be consistently formatted in the example data.
2. I have commented out the local file handling.
Also it saves multiple reads of the log files for each user.
I used to worry about such things, but for most (sane) sized files, this is no longer an issue - if the entire file remains RAM resident in the page-cache, no (extra) disk I/O ensues.
Again, use efficient redirection (especially if you want to reduce I/O operations!)
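To make that concrete, here is a minimal sketch contrasting the two redirection styles; the pattern and file names are hypothetical:

```shell
#!/bin/bash
# Hypothetical pattern, just to show the redirection shapes.
val="needle"

# Less efficient: the output file is re-opened on every iteration
#   for f in ./*.xml; do grep -A1 "$val" "$f" >> matches.txt; done

# More efficient: the shell opens the output file exactly once
for f in ./*.xml; do
    grep -A1 "$val" "$f"
done > matches.txt    # use >> instead if you need to append to an existing file
```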