LinuxQuestions.org


nextStep 12-17-2018 05:50 AM

sed command for pattern search
 
Hi All,

I have a requirement to search for a particular pattern from the log files every 4 hours. I have set up a cron which runs the script every 4 hours.

Log file pattern as below.

2018-12-17 01:53:47,390 [pool-3-thread-1] INFO [traceId=7c87f067fca636df,spanId=7c87f067fca636df] c.b.s.m.a.update.EmailProcessor - =========== Is email body HTML?

2018-12-17 04:47:21,838 [ActiveMQ Task-1] INFO [traceId=,spanId=] o.a.a.t.failover.FailoverTransport - Successfully connected to tcp://xx.xx.xx.xx:31313

After some research, I found the sed command below.

Code:

sed -e "1,/^$(date -d -4hour +'%Y-%m-%d %H')/d"  /home/nextStep/Logcheck.txt
But the issue is that it outputs all the text from the last 4 hours. My requirement is to find out whether the pattern below exists in the last 4 hours.
c.b.s.m.a.update.EmailProcessor

How do I add the pattern to the above command?

Thanks for the help.

pan64 12-17-2018 06:00 AM

Looks like homework to me.
You may try sed -n and add a second sed expression to look for the specified text.

l0f4r0 12-17-2018 06:40 AM

Some remarks:
  • you do not need the -e switch if you only have one sed script
  • I'm not sure why you use the d instruction...
  • OK, I see you are using a regex as the end of a line range. Do you know that the range ends at the first line matching your regex (maybe not what you want if there are multiple log entries in the last 4-hour span), and that if nothing matches, the range runs to the end of the file, so every line is deleted?
  • do you expect a "yes"/"no" output? If you want that, maybe awk would be more appropriate...
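
To illustrate the awk route from the last point, here is a minimal sketch rather than a definitive implementation. It assumes every log entry starts with a sortable "YYYY-MM-DD HH:MM:SS" timestamp, as in the sample lines, and compares that text with a cutoff instead of relying on a sed range. The function name found_since and the temporary sample file are made up for the example.

```shell
#!/bin/sh
# Sketch only: found_since is a hypothetical helper, not something from the thread.
# Prints "yes" if PATTERN occurs on a line stamped at or after CUTOFF, else "no".
# $1 = cutoff ("YYYY-MM-DD HH:MM:SS"), $2 = fixed-string pattern, $3 = log file.
found_since() {
    awk -v cutoff="$1" -v pat="$2" '
        # Only consider lines that begin with a YYYY-MM-DD timestamp.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / &&
        substr($0, 1, 19) >= cutoff &&
        index($0, pat) > 0 { found = 1; exit }
        END { print (found ? "yes" : "no") }
    ' "$3"
}

# Demo on a throwaway copy of the two sample lines from the first post.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
2018-12-17 01:53:47,390 [pool-3-thread-1] INFO [traceId=7c87f067fca636df,spanId=7c87f067fca636df] c.b.s.m.a.update.EmailProcessor - =========== Is email body HTML?
2018-12-17 04:47:21,838 [ActiveMQ Task-1] INFO [traceId=,spanId=] o.a.a.t.failover.FailoverTransport - Successfully connected to tcp://xx.xx.xx.xx:31313
EOF

found_since '2018-12-17 01:00:00' 'c.b.s.m.a.update.EmailProcessor' "$log"
found_since '2018-12-17 02:00:00' 'c.b.s.m.a.update.EmailProcessor' "$log"
```

On this sample the two calls print yes and then no. In the cron script the cutoff could come from GNU date, for example "$(date -d '-4 hours' +'%Y-%m-%d %H:%M:%S')"; index() is used so that the dots in the pattern are taken literally.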

BW-userx 12-17-2018 07:11 AM

Quote:

Originally Posted by nextStep (Post 5938139)
Hi All,

I have a requirement to search for a particular pattern from the log files every 4 hours. I have set up a cron which runs the script every 4 hours.

Log file pattern as below.

2018-12-17 01:53:47,390 [pool-3-thread-1] INFO [traceId=7c87f067fca636df,spanId=7c87f067fca636df] c.b.s.m.a.update.EmailProcessor - =========== Is email body HTML?

2018-12-17 04:47:21,838 [ActiveMQ Task-1] INFO [traceId=,spanId=] o.a.a.t.failover.FailoverTransport - Successfully connected to tcp://xx.xx.xx.xx:31313

After some research, I found the sed command below.

Code:

sed -e "1,/^$(date -d -4hour +'%Y-%m-%d %H')/d"  /home/nextStep/Logcheck.txt
But the issue is that it outputs all the text from the last 4 hours. My requirement is to find out whether the pattern below exists in the last 4 hours.
c.b.s.m.a.update.EmailProcessor

How do I add the pattern to the above command?

Thanks for the help.

Do you have to use sed?
Other programs can search inside files for a match to 'c.b.s.m.a.update.EmailProcessor': if it is found, log true; if not, log false.

nextStep 12-17-2018 08:07 AM

Quote:

Originally Posted by pan64 (Post 5938142)
Looks like homework to me.
You may try sed -n and add a second sed expression to look for the specified text.

Hi

I tried the approach below, but could not get the expected result.

Code:

sed -e "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d" | sed -n "c.b.s.m.a.update.EmailProcessor" /home/nextStep/Logcheck.txt

BW-userx 12-17-2018 08:25 AM

let's review.
Quote:

Originally Posted by OP
I have a requirement to search for a particular pattern from the log files every 4 hours. I have set up a cron which runs the script every 4 hours.

My requirement is to find if the below pattern exists in the last 4 hours.
c.b.s.m.a.update.EmailProcessor

1. Set up a cron job to fire every 4 hours.
2. The job searches the log files for the entry "c.b.s.m.a.update.EmailProcessor".
3. If found ?? If not found ??

Using sed to look for a date within the file seems redundant to me. The job runs every 4 hours, and entries are (maybe) added to the log files between runs.

So the question is what state the log files are in each time they are searched, every 4 hours.

Plus your sed statement is malformed.

grep's exit status tells you whether a match was found.
sed's does not (it exits 0 even when nothing matched), from what I've read on it thus far.

grep -q

grep return code
Code:

From the EXIT STATUS section of the grep man page:
Normally the exit status is 0 if a line is selected, 1 if
no lines were selected, and 2 if an error occurred.
So an exit code of 1 means that grep matched nothing in the input.
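
The three exit statuses described above can be turned into a yes/no answer roughly as follows. This is only a sketch: report_match is a hypothetical name, and the sample file is throwaway test data rather than the real log.

```shell
#!/bin/sh
# Sketch: report_match is a hypothetical helper. It maps grep's documented
# exit statuses (0 = match, 1 = no match, 2 = error) to a message.
# $1 = fixed-string pattern, $2 = file to search.
report_match() {
    rc=0
    # -q: print nothing; -s: hide "no such file" errors; -F: fixed string.
    grep -qsF -- "$1" "$2" || rc=$?
    case $rc in
        0) echo "found" ;;
        1) echo "not found" ;;
        *) echo "error" ;;
    esac
}

sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
echo 'INFO c.b.s.m.a.update.EmailProcessor - Is email body HTML?' > "$sample"

report_match 'c.b.s.m.a.update.EmailProcessor' "$sample"      # pattern present
report_match 'o.a.a.t.failover.FailoverTransport' "$sample"   # pattern absent
report_match 'anything' "$sample.missing"                     # file does not exist
```

Separating "not found" from "error" matters in a cron job, where a missing or unreadable log would otherwise look the same as a quiet four hours.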


l0f4r0 12-17-2018 08:38 AM

Quote:

Originally Posted by nextStep (Post 5938187)
Hi

I tried the approach below, but could not get the expected result.

Code:

sed -e "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d" | sed -n "c.b.s.m.a.update.EmailProcessor" /home/nextStep/Logcheck.txt

Indeed, neither your first command nor the second one can work as is...
If I were you, I would read the sed man page (simply type "man sed" in your terminal) and/or do some additional searches on the internet.

scasey 12-17-2018 10:29 AM

As has been asked, why sed? sed stands for stream editor. Your OP says
Quote:

I have a requirement to search for a particular pattern
(emphasis added).
IMO, sed is not the appropriate tool. I'd use grep, i.e.
Code:

grep '.*c.b.s.m.a.update.EmailProcessor.*' /home/nextStep/Logcheck.txt
Yes, that will show you the same lines every time it runs...not just the lines added in the last four hours. You'll need to analyze the result and parse it some more to tune the result.

l0f4r0 12-17-2018 11:17 AM

Quote:

Originally Posted by scasey (Post 5938233)
IMO, sed is not the appropriate tool. I'd use grep, i.e.
Code:

grep '.*c.b.s.m.a.update.EmailProcessor.*' /home/nextStep/Logcheck.txt

No need to use .* as boundaries for your grep pattern
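
A related detail, offered as a hedged aside: in a regular expression each "." in the pattern matches any character, so grep -F (fixed string) is probably closer to what is wanted here. The test line in this sketch is made up purely to show a false positive.

```shell
#!/bin/sh
# Sketch: the line below is invented to show a false positive; it is not from the log.
line='cXbXsXmXaXupdateXEmailProcessor'
pattern='c.b.s.m.a.update.EmailProcessor'

# As a regular expression, each "." matches any single character.
if printf '%s\n' "$line" | grep -q -- "$pattern"; then
    regex_hit=yes
else
    regex_hit=no
fi

# With -F the pattern is a fixed string, so only literal dots match.
if printf '%s\n' "$line" | grep -qF -- "$pattern"; then
    fixed_hit=yes
else
    fixed_hit=no
fi

echo "regex: $regex_hit, fixed string: $fixed_hit"
```

This prints "regex: yes, fixed string: no". With real class names the difference rarely bites, but -F also avoids having to escape anything.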

scasey 12-17-2018 11:31 AM

Quote:

Originally Posted by l0f4r0 (Post 5938241)
No need to use .* as boundaries for your grep pattern

I knew that...my bad...don't know what I was thinking. Thanks for pointing it out...better for others.;)

syg00 12-17-2018 05:01 PM

C'mon folks, the OP is trying, let's help. And *YES* the command posted originally works exactly as it's supposed to.
Quote:

Originally Posted by nextStep (Post 5938187)
I tried the approach below, but could not get the expected result.

Code:

sed -e "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d" | sed -n "c.b.s.m.a.update.EmailProcessor" /home/nextStep/Logcheck.txt

Close - you need the input file on the first sed, and by using "-n" in the second sed you are suppressing all printing, so you have to explicitly print the lines you want. Try it like this.
Code:

sed -e "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d" /home/nextStep/Logcheck.txt | sed -n "/c.b.s.m.a.update.EmailProcessor/p"
As suggested above, it is possibly better to only call sed once
Code:

sed -e "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d"  -ne "/c.b.s.m.a.update.EmailProcessor/p" /home/nextStep/Logcheck.txt

BW-userx 12-17-2018 05:31 PM

Code:


grep -q "c.b.s.m.a.update.EmailProcessor" " /home/nextStep/Logcheck.txt"  && echo "yes" || echo "no"

This works off grep's return code (0 or 1).

or

Code:

grep -q "c.b.s.m.a.update.EmailProcessor" "$1"  && \
echo "yes found  $(grep -o "c.b.s.m.a.update.EmailProcessor" "$1" | wc -l ) times" || \
echo "no"

results.
Code:

$ ./findpatterengrep testfile
yes found  3 times

The pattern was placed within a bunch of text 3 times, and the script reported finding it 3 times. One could then compare the count on each run to see how many new matches have been added.
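
The "compare the count on each run" idea could look roughly like this. It is a sketch under assumptions: new_matches and the state file are hypothetical, the log is assumed never to be rotated or truncated between runs, and it counts matching lines in the whole file rather than looking at timestamps.

```shell
#!/bin/sh
# Sketch: new_matches and the state file are hypothetical, not from the thread.
# Prints how many matching lines were added since the previous call.
# $1 = fixed-string pattern, $2 = log file, $3 = file that remembers the last count.
new_matches() {
    # grep -c exits 1 when the count is 0, so do not treat that as fatal.
    now=$(grep -cF -- "$1" "$2") || true
    now=${now:-0}
    prev=0
    if [ -f "$3" ]; then
        prev=$(cat "$3")
    fi
    echo "$now" > "$3"
    echo $((now - ${prev:-0}))
}

# Demo with throwaway files and made-up log lines.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
echo 'c.b.s.m.a.update.EmailProcessor first entry' > "$dir/log"
new_matches 'c.b.s.m.a.update.EmailProcessor' "$dir/log" "$dir/count"   # first run

echo 'c.b.s.m.a.update.EmailProcessor second entry' >> "$dir/log"
echo 'an unrelated line' >> "$dir/log"
new_matches 'c.b.s.m.a.update.EmailProcessor' "$dir/log" "$dir/count"   # second run
```

A result greater than 0 means the pattern showed up since the last run. If the log is rotated the difference can go negative, so a real script would need to handle that case.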

pan64 12-18-2018 02:40 AM

Quote:

Originally Posted by BW-userx (Post 5938363)
Code:


grep -q "c.b.s.m.a.update.EmailProcessor" " /home/nextStep/Logcheck.txt"  && echo "yes" || echo "no"


There is a space before /home in the filename, which makes this command useless.
Quote:

Originally Posted by BW-userx (Post 5938363)
Code:

grep -q "c.b.s.m.a.update.EmailProcessor" "$1"  && \
echo "yes found  $(grep -o "c.b.s.m.a.update.EmailProcessor" "$1" | wc -l ) times" || \
echo "no"


Use grep -c instead of grep | wc.
And do not repeat the same grep. That just wastes resources and time.
grep alone cannot handle the 4-hour requirement; sed can do the search for you.
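
One thing to keep in mind when swapping the two, shown as a small sketch on made-up data: grep -c counts matching lines, while grep -o | wc -l counts individual occurrences, so the numbers differ when a line contains the pattern more than once. The second grep below is there only to show that difference; the count from grep -c alone is enough for a yes/no answer.

```shell
#!/bin/sh
# Sketch on made-up test data: the pattern appears twice on the first line.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' \
    'c.b.s.m.a.update.EmailProcessor and again c.b.s.m.a.update.EmailProcessor' \
    'an unrelated line' > "$sample"

pattern='c.b.s.m.a.update.EmailProcessor'

# grep -c counts matching LINES (it exits 1 when the count is 0, hence "|| true").
lines=$(grep -cF -- "$pattern" "$sample") || true

# grep -o prints each match on its own line, so wc -l counts OCCURRENCES.
occurrences=$(grep -oF -- "$pattern" "$sample" | wc -l)

if [ "$lines" -gt 0 ]; then
    echo "yes: $lines matching line(s), $occurrences occurrence(s)"
else
    echo "no"
fi
```

Here grep -c reports 1 line and grep -o | wc -l reports 2 occurrences. For log entries, where the class name normally appears once per line, the two agree.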

pan64 12-18-2018 02:42 AM

Quote:

Originally Posted by syg00 (Post 5938350)
As suggested above, it is possibly better to only call sed once
Code:

sed -e "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d"  -ne "/c.b.s.m.a.update.EmailProcessor/p" /home/nextStep/Logcheck.txt

and this probably works as well:
Code:

sed -n "1,/^$(date -d -1hour +'%Y-%m-%d %H')/d;/c.b.s.m.a.update.EmailProcessor/p" /home/nextStep/Logcheck.txt
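
Since the original question is whether the pattern exists at all, the sed range can also be piped into grep -q to get a plain yes/no. This is a sketch, not the definitive answer: in_window is a hypothetical wrapper, the demo log lines are abbreviated or invented, and the caveat raised earlier still applies (if no line starts with the given hour stamp, sed deletes everything and the answer is "no").

```shell
#!/bin/sh
# Sketch: in_window is a hypothetical wrapper around the sed range from the thread.
# $1 = hour stamp such as "2018-12-17 01", $2 = fixed-string pattern, $3 = log file.
in_window() {
    # Delete everything up to and including the first line starting with the stamp,
    # then let grep's exit status say whether the pattern is in what remains.
    if sed "1,/^$1/d" "$3" | grep -qF -- "$2"; then
        echo "yes"
    else
        echo "no"
    fi
}

# Demo log: the first two lines are invented, the last two are shortened
# versions of the sample lines from the first post.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
2018-12-17 00:10:00,000 [main] INFO made-up line before the window
2018-12-17 01:05:00,000 [main] INFO made-up first line of hour 01
2018-12-17 01:53:47,390 [pool-3-thread-1] INFO c.b.s.m.a.update.EmailProcessor - =========== Is email body HTML?
2018-12-17 04:47:21,838 [ActiveMQ Task-1] INFO o.a.a.t.failover.FailoverTransport - Successfully connected
EOF

in_window '2018-12-17 01' 'c.b.s.m.a.update.EmailProcessor' "$log"
in_window '2018-12-17 04' 'c.b.s.m.a.update.EmailProcessor' "$log"
```

The first call prints yes and the second prints no. In the cron script the stamp would come from the same GNU date call used in the thread, "$(date -d -4hour +'%Y-%m-%d %H')". Note that the first line carrying that stamp is itself deleted, so a match on exactly that line is missed.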

BW-userx 12-18-2018 07:29 AM

Quote:

Originally Posted by pan64 (Post 5938489)
There is a space before /home in the filename, which makes this command useless.

This was not posted for copy-paste efficiency. I just tossed it in here after pasting it from my test code, which is where the space came from. It was "$1", to take the file name from the command line.

That is neither here nor there: if the OP cannot figure out that a command copied from a forum answer fails because of a stray space, then it is a good learning exercise in testing any code obtained from somewhere else before putting it to use.
Quote:

Originally Posted by pan64 (Post 5938489)
Use grep -c instead of grep | wc.
And do not repeat the same grep. That just wastes resources and time.
grep alone cannot handle the 4-hour requirement; sed can do the search for you.

Point taken, as one little file is not a realistic test.


All times are GMT -5.