LinuxQuestions.org

Shobhna 12-03-2008 11:25 PM

How to compare two lines and delete the duplicate line from a file?
 
Hi Friends,

I have a file with contents as given below.

1111|9999|||1|WHI1|Name1||0|0||
1111|9999|1111CS|9999|2|WHI1|Name1||0|0||
1111|55555|||1|MER|Name2||0|0||
1111|55555|22222|55555|2|MER|Name2||0|0||

I want to compare the first two fields (separated by the "|" symbol) of each line and delete the entire line if it duplicates an earlier one.

Example:

1111|9999|||1|WHI1|Name1||0|0||
1111|9999|1111CS|9999|2|WHI1|Name1||0|0||

Both of these lines start with "1111|9999", so the second line (the duplicate) should be removed from the file.

The search should continue till the end of the file to remove the duplicates.

Can you please help me with this?

Thanks in advance.

Have a great day!!!

Tinkster 12-04-2008 01:53 AM

Hi,

And welcome to LQ!

You're positive that the second line with the same
matching field is always the duplicate that needs
to be removed?
Code:

sort -t\| -k 1,2 -u dupes.txt
1111|55555|||1|MER|Name2||0|0||
1111|9999|||1|WHI1|Name1||0|0||
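
One caveat with sort -u: it reorders the file, and POSIX does not say which of two lines with equal keys is kept. If the original order matters, an awk filter is a common alternative. A minimal sketch, assuming the key is exactly the first two fields; the temporary sample file and the function name dedupe_first_two are made up for illustration:

```shell
# Sample data copied from the first post, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1111|9999|||1|WHI1|Name1||0|0||
1111|9999|1111CS|9999|2|WHI1|Name1||0|0||
1111|55555|||1|MER|Name2||0|0||
1111|55555|22222|55555|2|MER|Name2||0|0||
EOF

# Print a line only the first time its (field 1, field 2) pair is seen.
# seen[key]++ evaluates to 0 on the first occurrence, so !0 is true and
# awk runs its default action (print); later occurrences are skipped.
dedupe_first_two() {
    awk -F'|' '!seen[$1 FS $2]++' "$1"
}

dedupe_first_two "$sample"
```

Unlike sort, this keeps the surviving lines in the order they appear in the input, at the cost of holding one array entry per distinct key in memory.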


Shobhna 12-04-2008 02:14 AM

Yes, the second line will always be the duplicate if it matches.
Thanks a lot for your help :)

Shobhna 12-04-2008 04:03 AM

Hi,

From the above example file, is it possible to export the fields separated by "|" to an Excel file?

Can you please let me know?

Thanks..

Tinkster 12-04-2008 04:09 AM

Well, not strictly speaking an Excel file, but it's trivial
to convert it to CSV, which Excel will most happily open
directly:
Code:

sort -t\| -k 1,2 -u dupes.txt | sed -e 's/|/","/g' -e 's/^/"/' -e 's/$/"/' > no_dupes.csv

*if* this is what you mean by "exporting to an Excel file".
If it's not - please explain in more detail what you're
trying to achieve.
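
The sed one-liner is fine for the sample data, which contains no double quotes. If a field ever did contain one, the usual CSV convention is to double it inside the quoted field. A rough sketch of that, assuming the same "|" delimiter; the function name pipe_to_csv is made up for illustration:

```shell
# Convert "|"-separated input (files or stdin) to fully quoted CSV.
pipe_to_csv() {
    awk -F'|' '{
        out = ""
        for (i = 1; i <= NF; i++) {
            field = $i
            gsub(/"/, "\"\"", field)                      # double embedded quotes
            out = out (i > 1 ? "," : "") "\"" field "\""   # quote and join with commas
        }
        print out
    }' "$@"
}

printf '%s\n' '1111|9999|||1|WHI1|Name1||0|0||' | pipe_to_csv
```

It can replace the sed stage of the pipeline, for example: sort -t\| -k 1,2 -u dupes.txt | pipe_to_csv > no_dupes.csv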

Shobhna 12-04-2008 04:23 AM

Thanks a lot..
I'll try this.

skob 12-04-2008 05:07 AM

In Excel, try File -> Import; there you can set the delimiter character to '|' or whatever you need.

Shobhna 12-04-2008 08:40 AM

Hi,

Can somebody help me with this:

How to add a new field to end of each line in a file based on a condition?

For Example,

If the 6th field is "MER", append "NEW1" and "Mercury" to the end of the line. If it's "WHI1", append "NEW2" and "White". The appended values are separated by "|", as shown below.

Original File:
1111|55555|||1|MER|Name2||0|0||
1111|9999|||1|WHI1|Name1||0|0||

New File:
1111|55555|||1|MER|Name2||0|0||NEW1|Mercury
1111|9999|||1|WHI1|Name1||0|0||NEW2|White

Thank you.

Tinkster 12-04-2008 12:13 PM

Code:

awk -F\| '$6 == "MER" {print $0 "NEW1|Mercury"} $6 == "WHI1" {print $0 "NEW2|White"}' dupes.txt
1111|9999|||1|WHI1|Name1||0|0||NEW2|White
1111|55555|||1|MER|Name2||0|0||NEW1|Mercury
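
Note that a one-liner of this shape drops any line whose 6th field matches neither pattern. If more values turn up later, a lookup table may be easier to maintain. A sketch under those assumptions, with the requested suffixes taken from the "New File" example; the function name append_labels is made up for illustration:

```shell
# Append a suffix chosen by the exact value of field 6.
append_labels() {
    awk -F'|' '
        BEGIN {
            # Suffix for each known value of field 6 (from the requested output).
            suffix["MER"]  = "NEW1|Mercury"
            suffix["WHI1"] = "NEW2|White"
        }
        {
            line = $0
            # Exact match only; lines with any other value pass through unchanged.
            if ($6 in suffix) line = line suffix[$6]
            print line
        }
    ' "$@"
}

printf '%s\n' '1111|55555|||1|MER|Name2||0|0||' \
              '1111|9999|||1|WHI1|Name1||0|0||' | append_labels
```

Because each input line already ends in "||", the suffix is appended without a leading "|".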

Cheers,
Tink

Shobhna 12-04-2008 10:53 PM

Thank you :-)

Tinkster 12-05-2008 02:08 PM

welcome ... ;}


All times are GMT -5.