LinuxQuestions.org


1ankit1 11-13-2010 04:04 PM

grep multiple values in single pass through log file.
 
Help needed.
I have a huge binary log file.
There are, let's say, 4 IDs that I want to find in the log file.
I know that those 4 IDs will be present in the log file, and I also know the order in which they appear.
I want to find the 1st ID in the log, then the 2nd ID, then the 3rd ID, and so on.

Simple/inefficient solution: loop through the IDs and grep the log file for each one. The problem with this solution is that for each ID, grep searches from the beginning of the file.

Better/efficient solution: since I know the order in which the IDs appear in the log file, loop through the IDs, grep for the 1st ID, then continue from that point to grep for the 2nd ID, and so on. This way I can find all the IDs in one pass.
Is this solution possible?

I have 500000+ values to find in the log files, and I need an efficient solution. Thanks in advance.

Dark_Helmet 11-13-2010 04:24 PM

You should look into regular expressions. Something like this might get what you want:
Code:

egrep "1234|5678|9012|3456" binary_log_file
  • egrep launches grep with support for extended regular expressions (the same as grep -E)
  • "1234|5678|9012|3456" means to match any line that contains one or more of 1234, 5678, 9012, and 3456.
  • binary_log_file is, of course, the file you want to grep against
Replace the four-digit numbers I used with the IDs you're looking for.

Keep in mind, this will cause grep to display all matching lines in whatever order they occur in the file.
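With 500000+ IDs, one long alternation on the command line is likely to become impractical. A possible alternative, sketched below, is to keep the IDs in a file and pass it with -f, adding -F so the IDs are treated as fixed strings rather than regular expressions. The file names and sample IDs here are placeholders, not anything from the original post, and matches still come out in file order. For a real binary log, GNU grep may also need -a to print matching lines instead of a "Binary file matches" notice.

```shell
# Hypothetical sample data; the file names and IDs are placeholders.
workdir=$(mktemp -d)
printf 'start 1234 a\nnoise line\nmid 5678 b\nend 9012 c\n' > "$workdir/sample.log"
printf '1234\n9012\n' > "$workdir/ids.txt"

# -F: patterns are fixed strings, not regular expressions
# -f: read the patterns from a file, one per line
matches=$(grep -F -f "$workdir/ids.txt" "$workdir/sample.log")
printf '%s\n' "$matches"
```

This still reads the log once, no matter how many IDs are in the pattern file.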

EDIT: My mistake, you do not need the escape character if the grep string is in double quotes
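If the IDs really do appear in a known order, as the original post says, the one-pass idea could be sketched with awk: load the IDs in order, look only for the next expected ID on each line, and stop reading once the last one is found. This is only a sketch under assumptions: the file names and sample data are made up, the IDs are matched as plain substrings, and not every awk implementation copes well with binary input containing NUL bytes.

```shell
# Hypothetical sample data; the file names and IDs are placeholders.
workdir=$(mktemp -d)
printf 'x 1234\nnoise\ny 5678\nz 9012\nlate 1234\n' > "$workdir/sample.log"
printf '1234\n5678\n9012\n' > "$workdir/ids.txt"   # IDs in the order they appear

ordered=$(awk '
    NR == FNR { ids[++n] = $0; next }   # first file: load the IDs in order
    {
        # second file: look only for the next expected ID
        while (i < n && index($0, ids[i + 1])) {
            i++
            print ids[i] ": line " FNR
        }
    }
    i == n { exit }                     # every ID found: stop reading
' "$workdir/ids.txt" "$workdir/sample.log")
printf '%s\n' "$ordered"
```

Each line is checked against a single ID at a time, so the cost per line does not grow with the number of IDs, and the rest of the file is skipped after the last match.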

BenCollver 11-13-2010 04:45 PM

gzip -dc log.gz | grep -e 1234 -e 5678 -e 90ab -e cdef >results.txt
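For anyone who wants to try that pipeline on something small first, here is a sketch with a made-up compressed sample (the temporary file and its contents are placeholders). gzip -dc decompresses to standard output, and each -e adds one pattern, so a line is printed if it contains any of them.

```shell
# Hypothetical sample data; the file name and contents are placeholders.
workdir=$(mktemp -d)
printf 'a 1234\nb none\nc 90ab\n' | gzip -c > "$workdir/log.gz"

# Decompress to stdout and search the stream in a single pass.
found=$(gzip -dc "$workdir/log.gz" | grep -e 1234 -e 5678 -e 90ab -e cdef)
printf '%s\n' "$found"
```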


All times are GMT -5.