LinuxQuestions.org


cliffyao 03-08-2010 02:23 PM

how to compare these two files using perl or shell scripting
 
Hi, All

I want to compare the following two tab-delimited .txt files (both are subsets of the original files) by comparing Columns 3 and 4 simultaneously. Comparing C3 is easy because both C3s are just numbers. But how do I compare the C4s?

Basically, "G,G" in File1 equals G in File2, "C,C" equals C, "A,A" equals A, and "T,T" equals T.

In File2, A/T in Column 4 equals either "A,T" or "T,A" in Column 4 of File1; C/T equals either "C,T" or "T,C"; and so on.

Does any Perl expert know how to do this?

Thanks!

File1:

C1 C2 C3 C4
ih509 rs1234546 7244750 "G,G"
ih499 rs604968 7244911 "C,C"
ih508 - 7244977 "A,A"
ih517 rs285967 7245044 "C,C"
ih505 - 7793139 "G,T"
ih519 rs5847502 7794686 "C,G"
ih520 rs4050682 7794874 "C,T"
ih481 rs126634 7794946 "A,G"
ih513 - 7795116 "C,G"
ih266 rs2270268 8632236 "G,A"
ih265 rs6817637 8632320 "G,G"
ih264 rs2842324 8632610 "A,A"
ih164 rs62345 8632995 "T,C"
ih163 rs4385041 8633106 "C,G"
ih162 rs8495729 8633134 "T,C"
ih165 rs888994784 8633307 "G,G"
ih161 rs403948 8633413 "T,T"
ih33 rs16738274 8633642 "T,C"
ih32 - 8633756 "A,C"

File2:
C1 C2 C3 C4 C5 C6 C7 C8
chr8 7461149 a T 90 90 59 21
chr8 7462712 C T 144 144 57 39
chr8 7463576 * */+TT 124 124 43 22
chr8 7464291 T C 102 102 56 25
chr8 7464461 T G 93 93 55 22
chr8 7464620 T C 102 102 57 25
chr8 7465269 A A/T 126 126 55 22
chr8 7465939 G A/G 145 145 57 20
chr8 7467063 c C/T 49 49 51 22
chr8 7467203 g C/G 44 44 34 22

rweaver 03-08-2010 03:06 PM

You would basically just split each line on the tab and compare the fields with an if. Since you know the formats your data can come in, it's just a series of ifs:

Code:

# $var1 is C4 from File2; $var2 is C4 from File1 with its quotes stripped
if (($var1 eq "G") && ($var2 eq "G,G")) { print "match\n" }
Edit: there are other ways, but this is the KISS method; anyone can program it regardless of skill level.
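
One of those other ways, as a rough sketch rather than a definitive answer: instead of writing one if per combination, both notations can be reduced to a single canonical form and compared as plain strings. The sub names canonical_genotype and same_genotype are made up for this illustration. The sketch assumes the quote characters are really present in File1's C4 and that letter case should be ignored.

```perl
use strict;
use warnings;

# Reduce a genotype string from either file to one canonical key:
#   File1 style: "G,G" -> G     "T,C" -> C/T   (quotes are part of the field)
#   File2 style: G     -> G     C/T   -> C/T
sub canonical_genotype {
    my ($raw) = @_;
    (my $s = uc $raw) =~ s/"//g;    # assumption: drop quotes, ignore case
    my %seen;
    # Split on comma or slash and keep each allele once.
    my @unique = grep { !$seen{$_}++ } split /[,\/]/, $s;
    return join '/', sort @unique;  # sorted, so "T,C" and "C,T" agree
}

# True when a File1 C4 value and a File2 C4 value describe the same genotype.
sub same_genotype {
    my ($file1_c4, $file2_c4) = @_;
    return canonical_genotype($file1_c4) eq canonical_genotype($file2_c4);
}

print same_genotype('"T,C"', 'C/T') ? "match\n" : "differ\n";   # match
print same_genotype('"G,G"', 'G')   ? "match\n" : "differ\n";   # match
print same_genotype('"A,G"', 'C/G') ? "match\n" : "differ\n";   # differ
```

Sorting the alleles is what makes "T,C" and "C,T" land on the same key, so the order in File1 no longer matters.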

Sergei Steshenko 03-08-2010 03:23 PM

perldoc -f split
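
For illustration, a minimal sketch of what split does with one line from each file. The sub names parse_file1_line and parse_file2_line are hypothetical. It assumes the real files are tab-delimited as described, and that in File2 the position is the second field and the genotype the fourth, as in the posted sample; adjust the indexes if your full files differ.

```perl
use strict;
use warnings;

# File1 line -> (position, genotype); the quotes around C4 are removed.
sub parse_file1_line {
    my ($line) = @_;
    chomp $line;
    my @f = split /\t/, $line;          # C1, C2, C3, C4
    (my $genotype = $f[3]) =~ s/"//g;   # "G,T" -> G,T
    return ($f[2], $genotype);
}

# File2 line -> (position, genotype). Assumption: the position is the
# second field and the genotype the fourth, as in the posted sample.
sub parse_file2_line {
    my ($line) = @_;
    chomp $line;
    my @f = split /\t/, $line;
    return ($f[1], $f[3]);
}

my ($pos1, $gt1) = parse_file1_line(qq{ih505\t-\t7793139\t"G,T"\n});
my ($pos2, $gt2) = parse_file2_line("chr8\t7465269\tA\tA/T\t126\t126\t55\t22\n");
print "$pos1 $gt1\n";   # prints: 7793139 G,T
print "$pos2 $gt2\n";   # prints: 7465269 A/T
```

Once each line is reduced to a position and a genotype, the positions can serve as hash keys for looking up the matching row in the other file.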

