grep file
I have a large plain text file and would like to extract part of the text from it. The conditions are:
1) extract the lines that contain the text "2011", and also 2) extract the lines that contain both "warning" & "jobs". Can anyone advise what I can do? Thanks. |
This should work
cat somebigfile.txt |grep -ie 2011 -ie warning -ie jobs
HTH |
Thanks for the reply.
But I would also like to exclude the lines that do not have "warning" & "jobs". What can I do? Thanks. |
cat somebigfile.txt |grep -ie 2011 -iv warning -iv jobs
but you probably need to filter it twice (first output to a tmp file, then cat the tmp file and grep it again) to get your results. Sorry, it has been a long day and I can't wait to go off. |
@ust: It usually helps to post an actual example of your text, along with an example of the desired output.
And remember to please use [code][/code] tags around your code and data, to preserve formatting and to improve readability.
@aazkan: A few comments on this:
Code:
cat somebigfile.txt |grep -ie 2011 -iv warning -iv jobs
2) While it's not necessary as long as we're only searching for single simple words, you really should get into the habit of always quoting the expressions. Without quotes, any shell-reserved characters in them will be interpreted, and it will be broken up into separate arguments on any whitespace, probably breaking the command. I personally think it also helps readability, as the expression being searched for is clearly differentiated from the other arguments around it. Read these three links for a better understanding of how the shell handles arguments and whitespace:
http://mywiki.wooledge.org/Arguments
http://mywiki.wooledge.org/WordSplitting
http://mywiki.wooledge.org/Quotes
3) When using multiple expressions at once in grep, you need to prefix each and every one of them with the -e option. Also, the -i and -v options only need to be given once, as they unfortunately always apply globally. It's thus impossible in grep to simultaneously print lines containing one pattern and exclude lines containing another. You'd have to chain two grep commands together, or use a different tool such as sed to do that.
Code:
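A sketch of the two commands that the explanation below walks through, reconstructed from that explanation rather than copied from the post. The temporary sample file and its contents are made up for illustration, and the exact spelling of the options is an assumption. The last two commands drop the inversion (-v and !), which is what the explanation suggests if the goal is lines containing "2011" plus either "warning" or "jobs".

```shell
# Hypothetical sample data in a temporary file (not from the thread).
sample=$(mktemp)
printf '%s\n' \
    '2011 start' \
    '2011 warning: disk full' \
    '2011 jobs queued' \
    'warning without the year' > "$sample"

# Lines containing "2011" that do NOT also contain "warning" or "jobs".
# -i: ignore case, -v: invert the match, one -e per pattern; patterns are quoted.
grep_excluded=$(grep -e "2011" "$sample" | grep -i -v -e "warning" -e "jobs")

# The same filter in sed: -r enables extended regex, -n turns off default printing.
sed_excluded=$(sed -rn -e '/2011/{ /(warning|jobs)/!p; }' "$sample")

# Without the inversion: lines containing "2011" AND either "warning" or "jobs".
grep_included=$(grep -e "2011" "$sample" | grep -i -e "warning" -e "jobs")
sed_included=$(sed -rn -e '/2011/{ /(warning|jobs)/p; }' "$sample")

printf 'excluded (grep): %s\n' "$grep_excluded"
printf 'excluded (sed):  %s\n' "$sed_excluded"
printf 'included (grep):\n%s\n' "$grep_included"
printf 'included (sed):\n%s\n' "$sed_included"

rm -f "$sample"
```

Note that the second grep ignores case while this sed expression does not; with all-lowercase input, as here, both print the same lines.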
So in the above, the first grep command simply outputs lines that contain the string "2011", and pipes them into a second grep for further filtering. There, the -i option indicates case-insensitive matching, and the -v option inverts the output. The two -e expressions are thus the strings that we want to exclude.
So note that this apparently does NOT do what the OP asked (although we could use some clarification on this, as I mentioned above). This prints only the lines containing "2011" that don't also have warning or jobs in them. If I'm reading correctly though, I believe what he wants are lines that contain both 2011 and either warning or jobs. In which case, just remove the -v flag from the second grep.
The sed command does exactly the same thing as the two grep commands. First I used -r to enable extended regex (explained below), and -n to turn off printing by default. The -e option is similar to grep's.
/../ : indicates a regex pattern to match, in this case lines containing "2011".
{..} : groups the subsequent commands that operate on the first match.
/../ : again matches lines, this time from the results of the previous match.
(..|..) : means match either A or B. This is the part that needs the -r option. Although in GNU sed you can instead backslash-escape the bracketing characters for the same effect ("\(..\|..\)"), I think it's cleaner just to enable extended regex.
! : inverts the condition of the match, similar to grep's -v option.
p : finally, the p command prints the resulting matches.
So again, just remove the ! to only get lines that contain both patterns.
@ust: I recommend that you read the man and info pages for grep and sed. It's generally a good idea to take the time to learn how the tools you're using really work.
Here are some useful sed resources for you too:
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/grabbag/
http://sed.sourceforge.net/sedfaq.html
http://sed.sourceforge.net/sed1line.txt
You should also learn at least the basics of regular expressions:
http://mywiki.wooledge.org/RegularExpression
http://www.grymoire.com/Unix/Regular.html |
The example:
The file content is
2011 2011 aaa warning jobs bbb warning ccc jobs ddd
and the output should be as below. Can anyone advise? Thanks.
2011 2011 aaa warning jobs |
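One way to get the output in this example, assuming the flattened sample stands for six lines ("2011", "2011 aaa", "warning jobs", "bbb warning", "ccc jobs", "ddd") and that the goal is every line containing "2011" plus every line containing both "warning" and "jobs". This is only a sketch under those assumptions, not something posted in the thread; the temporary file stands in for the real one.

```shell
# Sample input with the line breaks assumed as described above.
sample=$(mktemp)
printf '%s\n' '2011' '2011 aaa' 'warning jobs' 'bbb warning' 'ccc jobs' 'ddd' > "$sample"

# grep prints a line if it matches ANY of the -e patterns, so "both words"
# is spelled out as the two possible orders of "warning" and "jobs".
grep_out=$(grep -e '2011' -e 'warning.*jobs' -e 'jobs.*warning' "$sample")

# awk: the same condition written directly as boolean logic.
awk_out=$(awk '/2011/ || (/warning/ && /jobs/)' "$sample")

printf '%s\n' "$grep_out"
rm -f "$sample"
```

Add -i to the grep command if case should be ignored.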
Hi David,
Thanks for the pointers, I appreciate your input. I'll work on my future replies to questions. Regards. |