Compare a field across consecutive rows and print both rows when they match
Hi, I have been struggling to get this done. My input (with extra lines removed) looks like the following:
LC 1 C:G 24 -0.42 5.56 37.77 -0.19 0.54 3.48
LC 2 G:C 23 1.52 -5.21 35.68 0.65 0.54 3.43
LC 3 C:G 22 2.71 7.44 31.80 -0.01 0.95 3.44
LC 4 G:C 21 0.23 6.00 33.00 -0.23 0.10 3.40
LC 5 A:T 20 -1.35 2.32 37.16 -0.08 -0.16 3.13
LC 6 A:T 19 -1.04 2.17 34.61 -0.03 -0.61 3.32
LC 7 T:A 18 0.18 0.17 36.30 0.01 -0.28 3.26
LC 8 T:A 17 1.51 0.77 37.68 0.01 0.04 3.36
LC 9 C:G 16 -1.84 -0.91 30.29 0.30 1.16 3.45
LC 10 G:C 15 -1.56 -5.19 38.79 -0.93 0.85 3.52
LC 11 C:G 14 2.00 2.15 38.46 0.40 0.61 3.45
LC 12 G:C 13
From these 12 rows, I want to extract every pair of consecutive rows in which the third field is C:G in the first row and G:C in the second row. There are 4 such pairs in the lines above (1 and 2, 3 and 4, 9 and 10, and 11 and 12). Thus, my output should look like:
LC 1 C:G 24 -0.42 5.56 37.77 -0.19 0.54 3.48
LC 2 G:C 23 1.52 -5.21 35.68 0.65 0.54 3.43
LC 3 C:G 22 2.71 7.44 31.80 -0.01 0.95 3.44
LC 4 G:C 21 0.23 6.00 33.00 -0.23 0.10 3.40
LC 9 C:G 16 -1.84 -0.91 30.29 0.30 1.16 3.45
LC 10 G:C 15 -1.56 -5.19 38.79 -0.93 0.85 3.52
LC 11 C:G 14 2.00 2.15 38.46 0.40 0.61 3.45
LC 12 G:C 13
I have many such input files and it is difficult to manually extract the desired rows.