Compare two fields on consecutive rows and print the two rows
Hi, I have been trying really hard to get this done. I have an input (I have removed extra lines) that looks like the following:
LC 1 C:G 24 -0.42 5.56 37.77 -0.19 0.54 3.48
LC 2 G:C 23 1.52 -5.21 35.68 0.65 0.54 3.43
LC 3 C:G 22 2.71 7.44 31.80 -0.01 0.95 3.44
LC 4 G:C 21 0.23 6.00 33.00 -0.23 0.10 3.40
LC 5 A:T 20 -1.35 2.32 37.16 -0.08 -0.16 3.13
LC 6 A:T 19 -1.04 2.17 34.61 -0.03 -0.61 3.32
LC 7 T:A 18 0.18 0.17 36.30 0.01 -0.28 3.26
LC 8 T:A 17 1.51 0.77 37.68 0.01 0.04 3.36
LC 9 C:G 16 -1.84 -0.91 30.29 0.30 1.16 3.45
LC 10 G:C 15 -1.56 -5.19 38.79 -0.93 0.85 3.52
LC 11 C:G 14 2.00 2.15 38.46 0.40 0.61 3.45
LC 12 G:C 13

From these 12 rows, I just want to extract the pairs of consecutive rows that contain C:G in the first row and G:C in the second row. Four such pairs can be seen in the lines printed above (1 and 2, 3 and 4, 9 and 10, and 11 and 12). Thus, my output should look like:

LC 1 C:G 24 -0.42 5.56 37.77 -0.19 0.54 3.48
LC 2 G:C 23 1.52 -5.21 35.68 0.65 0.54 3.43
LC 3 C:G 22 2.71 7.44 31.80 -0.01 0.95 3.44
LC 4 G:C 21 0.23 6.00 33.00 -0.23 0.10 3.40
LC 9 C:G 16 -1.84 -0.91 30.29 0.30 1.16 3.45
LC 10 G:C 15 -1.56 -5.19 38.79 -0.93 0.85 3.52
LC 11 C:G 14 2.00 2.15 38.46 0.40 0.61 3.45
LC 12 G:C 13

I have many such input files, and it is difficult to extract the desired rows manually.
You can try awk:
Code:
$3 ~ "C:G" { |
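The script above is cut off in this copy of the thread, so what follows is not the original answer. It is only a minimal sketch of one way to do the same job with awk, assuming the base pair is always in the third whitespace-separated field. The function name extract_pairs and the variables prev and prev_is_cg are made up for illustration.

```shell
# Hypothetical helper (not from the thread): print every C:G row that is
# immediately followed by a G:C row, together with that G:C row.
# Assumes field 3 holds the base pair. Reads the named files, or stdin if none.
extract_pairs() {
    awk '
        FNR == 1 { prev_is_cg = 0 }                      # new file: forget the previous row
        prev_is_cg && $3 == "G:C" { print prev; print }  # C:G row followed by a G:C row
        { prev_is_cg = ($3 == "C:G"); prev = $0 }        # remember this row for the next one
    ' "$@"
}
```

It could be called as extract_pairs input.txt > pairs.txt, where both file names are placeholders. Because the state is reset on the first line of each file, several input files can be passed in one call without a pair being matched across a file boundary.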
Thanks a lot!!
I had been trying something similar, but it did not work. This script does what I need.
aditi_borkar:
What kind of input is this? If you mean a text file containing those lines, you can easily extract the "G:C" or "C:G" lines from it and save them to a file. Use the terminal:
$ grep 'C:G' ./input.txt > ./myGCfile
$ cat myGCfile
Now you have them in one file. Try to explain your problem in more detail so that others in the forum understand what you need; they may be able to help you better than I have here. Hope this helps. Good luck.