Deleting lines from a file with a specific pattern using AWK
Hi,
I have a file which contains millions of records. It contains 12 columns separated by "||" (the delimiter). The first two fields contain the first name and last name of a person. My requirement is to delete all records from this file for which the first two fields do not contain any letters. For example, I have the records below in the file:
gaurav||gandhi||123||456||789
#a%bcd||123abc||89|90||91
12345||@@@||89||123||234
***||!!!!||98||76||90
The last two lines should be removed from this file, since the first two fields do not contain any letters for these two records. Please help me out on this.
Hi and welcome to LinuxQuestions! If the other fields do not contain alphabetic characters, as in your example, you can simply do:
Code:
awk '/[a-zA-Z]/' file
Code:
sed '/[a-zA-Z]/!d' file
Code:
awk -F"|" '$1 ~ /[a-zA-Z]/ && $3 ~ /[a-zA-Z]/' file
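For anyone following along, here is a minimal sketch of where the whole-line match and the field-based match differ. The file name records.txt and the fifth record (999||!!!||abc||1||2) are made up for illustration and are not from the original post.

```shell
# Build a scratch copy of the sample data (hypothetical file name).
tmpdir=$(mktemp -d)
cat > "$tmpdir/records.txt" <<'EOF'
gaurav||gandhi||123||456||789
#a%bcd||123abc||89|90||91
12345||@@@||89||123||234
***||!!!!||98||76||90
999||!!!||abc||1||2
EOF

# Whole-line match: keeps any line that has a letter anywhere,
# so the made-up fifth record survives because of "abc" in a later field.
whole=$(awk '/[a-zA-Z]/' "$tmpdir/records.txt")

# Field match with a single pipe as separator: $1 is the first name,
# $2 is the empty string between the two pipes, $3 is the last name.
fields=$(awk -F"|" '$1 ~ /[a-zA-Z]/ && $3 ~ /[a-zA-Z]/' "$tmpdir/records.txt")

printf 'whole-line match:\n%s\n\nfield match:\n%s\n' "$whole" "$fields"
rm -r "$tmpdir"
```

The whole-line version is only safe when the remaining columns never contain letters. Otherwise the field-based test is the one that matches the stated requirement.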
Slight adjustment to colucix's last entry, as the delimiter is two pipes (and in case you weren't aware, you will need to redirect to a new file):
Code:
awk -F"||" '$1 ~ /[a-zA-Z]/ && $3 ~ /[a-zA-Z]/' file > new_file
Does that work? And if it does, wouldn't that be $2?
Code:
awk -F"[|][|]" '$1 ~ /[a-zA-Z]/ && $2 ~ /[a-zA-Z]/' file > new_file
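Since awk does not edit the file in place, one way to finish the job is to write the filtered output to a new file and then move it over the original. This is only a sketch: the scratch directory and the names file and new_file are placeholders, and the data is the four sample records from the first post.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file" <<'EOF'
gaurav||gandhi||123||456||789
#a%bcd||123abc||89|90||91
12345||@@@||89||123||234
***||!!!!||98||76||90
EOF

# Keep only records whose first AND second field contain a letter.
# The original is replaced only if awk exits successfully.
if awk -F'[|][|]' '$1 ~ /[a-zA-Z]/ && $2 ~ /[a-zA-Z]/' "$tmpdir/file" > "$tmpdir/new_file"
then
    mv "$tmpdir/new_file" "$tmpdir/file"
fi

kept=$(cat "$tmpdir/file")
printf '%s\n' "$kept"
rm -r "$tmpdir"
```

On a file with millions of records it is probably worth keeping a backup of the original until the result has been checked.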
Actually I used a single pipe as delimiter and $3 to match the second field ($2 was the null string between the first two pipes).
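A quick way to see why $3 is the second name when the separator is a single pipe is to print the first few fields of one sample record. This is just an illustrative sketch using the first record from the thread:

```shell
# One sample record from the thread, split on a single pipe.
line='gaurav||gandhi||123||456||789'

# Inside the awk string literal, "$1=" etc. are plain text labels.
split=$(printf '%s\n' "$line" |
    awk -F"|" '{ printf "NF=%d $1=[%s] $2=[%s] $3=[%s]\n", NF, $1, $2, $3 }')

printf '%s\n' "$split"
# prints: NF=9 $1=[gaurav] $2=[] $3=[gandhi]
```

Every "||" produces an empty field between the two pipes, which is why the five visible columns turn into nine fields.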
My comment was directed at @grail's post, not yours, @colucix.
I'll be more specific in future ... ;)
Mine too. :)
For the sake of the OP, if he ever pops up again: the field separator in awk can be either a single character or a regular expression. Specifying two or more characters has the side effect of setting FS to the last one specified. In the second example posted by grail, the presence of two character lists [...] forces awk to interpret it as a regular expression, so that you can actually use two consecutive pipes as the field separator. Cheers!
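To illustrate the regular-expression separator, here is a small sketch that splits one sample record on two consecutive pipes. As far as I know, POSIX treats any FS longer than one character as an extended regular expression, so how a bare "||" behaves may differ between awk implementations. The bracketed form avoids that ambiguity.

```shell
# One sample record from the thread, split on exactly two pipes.
line='gaurav||gandhi||123||456||789'

# [|] is a character list holding a literal pipe; two of them in a row
# match the two-character delimiter.
regex_split=$(printf '%s\n' "$line" |
    awk -F'[|][|]' '{ printf "NF=%d $1=[%s] $2=[%s]\n", NF, $1, $2 }')

printf '%s\n' "$regex_split"
# prints: NF=5 $1=[gaurav] $2=[gandhi]
```

With this separator the field numbers line up with the visible columns, so the last name is $2 rather than $3.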
yes ... yes ... shoot me down .. lol
@colucix - thanks for the explanation :)
O.k., let's continue the education (mine).
Why is "[|][|]" considered a regex (in this context) but "[||]" isn't? "[||]+" works. (Remember, I'm still coming to terms with awk.)
Code:
[|][|]
The same if you use something like
Code:
[|&;][|&;]
which matches any of these two-character pairs:
Code:
|| |& |; && &| &; ;; ;| ;&
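The difference between these separators can be checked by counting fields. In the sketch below, count_fields is a hypothetical helper and the mixed-delimiter input string is made up for the demonstration. A character list such as [||] matches one character, so listing the pipe twice changes nothing.

```shell
line='gaurav||gandhi||123||456||789'

# Hypothetical helper: print how many fields awk sees for a given FS.
count_fields() {
    printf '%s\n' "$line" | awk -F"$1" '{ print NF }'
}

single=$(count_fields '[||]')    # one pipe per match: 9 fields
plus=$(count_fields '[||]+')     # a run of pipes per match: 5 fields
pair=$(count_fields '[|][|]')    # exactly two pipes per match: 5 fields

# Two lists of three characters each match any of the nine pairs.
mixed=$(printf 'a||b|&c;;d&|e\n' |
    awk -F'[|&;][|&;]' '{ print $1 "," $2 "," $3 "," $4 "," $5 }')

printf '%s %s %s %s\n' "$single" "$plus" "$pair" "$mixed"
# prints: 9 5 5 a,b,c,d,e
```

Note that [||]+ would also swallow three or more pipes in a row, whereas [|][|] matches exactly two.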
Thanks a lot guys.... my problem is solved now :)