LinuxQuestions.org


ammu 12-08-2011 06:32 AM

sort the files and delete the duplicate lines that contain a pattern
 
Hi,

I need to sort a file and delete the duplicate lines that contain a particular pattern.

For example, the sort output is:

hi
hi
hi
how r u
how r u
how r u

I'm searching for "hi". I need to delete only the duplicates of that line, not the duplicates of "how r u".

The output needs to be:

hi
how r u
how r u
how r u

thanks...

klearview 12-08-2011 07:39 AM

See if this works for you:

Quote:

sort your_file_name | grep -C 1 -v 'hi'

ammu 12-08-2011 09:57 PM

sorry

It's not working. I need to delete all the duplicate lines that contain my search pattern.

thanks...

David the H. 12-09-2011 12:54 AM

How exactly is it "not working"? What output do you get?
But yeah, I don't think it works as promised.


This awk command should work. It sets a flag after the first match so that subsequent lines with the same pattern won't print:
Code:

awk '( no && $0 ~ pat ) { next } ; $0 ~ pat { no=1 } ; { print }' "pat=hi" <( sort file.txt )
The first argument after the awk program is a variable assignment that sets the pattern you want to match ("pat=hi"). It's a good idea to quote the whole thing so that the shell doesn't interpret any special characters in the pattern. The second argument is a process substitution (a feature of bash, ksh and zsh, not of plain sh) that supplies the pre-sorted file.

Edit: I just realized...the input pattern is treated as a regular expression, so the above will match any line containing that substring. To force exact line matches only, use "pat=^hi$" (or another regex that targets only what you want).
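
To make the difference between the two patterns concrete, here is a small sketch of the same idea. The function name dedupe_pattern and the sample data are made up for illustration, and it uses a plain pipe with awk's -v option instead of process substitution so that it should also run in a plain POSIX sh. Treat it as one possible way to wrap the command, not the only one:

```shell
#!/bin/sh
# dedupe_pattern PATTERN FILE   (hypothetical helper, not a standard tool)
# Sorts FILE, then prints every line except the second and later lines
# that match the regular expression PATTERN.
dedupe_pattern() {
    LC_ALL=C sort "$2" | awk -v pat="$1" '
        $0 ~ pat {            # this line matches the pattern
            if (seen) next    # a match was already printed: skip this one
            seen = 1          # remember that the first match was printed
        }
        { print }             # print every line that was not skipped
    '
}

# Sample data, invented for this example
sample=$(mktemp)
printf '%s\n' 'how r u' 'hi' 'hi there' 'how r u' 'hi' 'hi' > "$sample"

echo "--- pat=^hi\$ (exact line match) ---"
dedupe_pattern '^hi$' "$sample"
# hi
# hi there
# how r u
# how r u

echo "--- pat=hi (substring match) ---"
dedupe_pattern 'hi' "$sample"
# hi
# how r u
# how r u
```

Note that with the plain "hi" pattern the line "hi there" disappears as well, because it also contains the substring and comes after the first match.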


Finally, please use [code][/code] tags around your code and data, to preserve formatting and to improve readability.

ammu 12-09-2011 01:11 AM

thank you
 
Hi,

Thank you so much, it's working.
Thanks once again.


All times are GMT -5.