How to compare these two files using Perl or shell scripting
Hi all,
I want to compare the following two tab-delimited .txt files (both are subsets of the original files) by comparing Columns 3 and 4 simultaneously. Comparing C3 is easy because both C3s are just numbers. But how do I compare the C4s?
Basically, "G,G" in File1 equals G in File2, "C,C" equals C, "A,A" equals A, and "T,T" equals T.
Going the other way, A/T in Column 4 of File2 equals either "A,T" or "T,A" in Column 4 of File1, C/T equals either "C,T" or "T,C", and so on.
Does any Perl expert know how to do this?
Thanks!
File1:
C1 C2 C3 C4
ih509 rs1234546 7244750 "G,G"
ih499 rs604968 7244911 "C,C"
ih508 - 7244977 "A,A"
ih517 rs285967 7245044 "C,C"
ih505 - 7793139 "G,T"
ih519 rs5847502 7794686 "C,G"
ih520 rs4050682 7794874 "C,T"
ih481 rs126634 7794946 "A,G"
ih513 - 7795116 "C,G"
ih266 rs2270268 8632236 "G,A"
ih265 rs6817637 8632320 "G,G"
ih264 rs2842324 8632610 "A,A"
ih164 rs62345 8632995 "T,C"
ih163 rs4385041 8633106 "C,G"
ih162 rs8495729 8633134 "T,C"
ih165 rs888994784 8633307 "G,G"
ih161 rs403948 8633413 "T,T"
ih33 rs16738274 8633642 "T,C"
ih32 - 8633756 "A,C"
File2:
C1 C2 C3 C4 C5 C6 C7 C8
chr8 7461149 a T 90 90 59 21
chr8 7462712 C T 144 144 57 39
chr8 7463576 * */+TT 124 124 43 22
chr8 7464291 T C 102 102 56 25
chr8 7464461 T G 93 93 55 22
chr8 7464620 T C 102 102 57 25
chr8 7465269 A A/T 126 126 55 22
chr8 7465939 G A/G 145 145 57 20
chr8 7467063 c C/T 49 49 51 22
chr8 7467203 g C/G 44 44 34 22
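
To make the matching rule above concrete, here is a minimal sketch of one way it could work. It is written in Python rather than Perl, but the same idea carries over to a Perl hash keyed on position. Everything here is an assumption drawn from the sample rows: the position is taken from column 3 of File1 and column 2 of File2 (where it appears in the rows shown), the genotype from column 4 of both, and rows are matched on position only because File1 has no chromosome column. The function names (key_file1, key_file2, compare) are made up for illustration. Genotypes that are not plain A/C/G/T calls, such as */+TT, are reported as "skipped" rather than guessed at.

```python
import sys

BASES = set("ACGT")


def _key(alleles):
    """Return the two alleles as a sorted 2-letter string, or None if unusable."""
    if len(alleles) != 2 or not all(a in BASES for a in alleles):
        return None
    return "".join(sorted(alleles))


def key_file1(genotype):
    """File1 style: '"T,C"' -> 'CT', '"G,G"' -> 'GG'."""
    return _key(genotype.strip().strip('"').upper().split(","))


def key_file2(genotype):
    """File2 style: 'C/T' -> 'CT', 'G' -> 'GG' (a single base is treated as homozygous)."""
    alleles = genotype.strip().upper().split("/")
    if len(alleles) == 1:
        alleles = alleles * 2
    return _key(alleles)


def compare(file1_lines, file2_lines, pos1=2, gt1=3, pos2=1, gt2=3):
    """Yield (position, File1 genotype, File2 genotype, verdict) for shared positions.

    Column indices are 0-based and are assumptions based on the sample rows;
    change them if the real files are laid out differently.
    """
    # Index File2 by position; header lines fail the isdigit() check and are ignored.
    file2_calls = {}
    for line in file2_lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) > max(pos2, gt2) and fields[pos2].isdigit():
            file2_calls[fields[pos2]] = fields[gt2]

    for line in file1_lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) <= max(pos1, gt1) or not fields[pos1].isdigit():
            continue  # header or incomplete line
        pos, g1 = fields[pos1], fields[gt1]
        if pos not in file2_calls:
            continue  # position absent from File2
        g2 = file2_calls[pos]
        k1, k2 = key_file1(g1), key_file2(g2)
        if k1 is None or k2 is None:
            verdict = "skipped"
        else:
            verdict = "match" if k1 == k2 else "mismatch"
        yield pos, g1, g2, verdict


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        for row in compare(f1, f2):
            print("\t".join(row))
```

Saved under a hypothetical name such as compare_genotypes.py, it would be run as `python compare_genotypes.py File1.txt File2.txt`. Sorting the two alleles before comparing is what makes "T,C" and "C,T" both equal C/T. Note that none of the positions in the two subsets shown above overlap, so this sample would produce no output.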