LinuxQuestions.org


aditi_borkar 04-09-2009 06:06 AM

Compare two fields on consecutive rows and print the two rows
 
Hi, I have been struggling to get this done. My input (with extra lines removed) looks like this:

LC 1 C:G 24 -0.42 5.56 37.77 -0.19 0.54 3.48
LC 2 G:C 23 1.52 -5.21 35.68 0.65 0.54 3.43
LC 3 C:G 22 2.71 7.44 31.80 -0.01 0.95 3.44
LC 4 G:C 21 0.23 6.00 33.00 -0.23 0.10 3.40
LC 5 A:T 20 -1.35 2.32 37.16 -0.08 -0.16 3.13
LC 6 A:T 19 -1.04 2.17 34.61 -0.03 -0.61 3.32
LC 7 T:A 18 0.18 0.17 36.30 0.01 -0.28 3.26
LC 8 T:A 17 1.51 0.77 37.68 0.01 0.04 3.36
LC 9 C:G 16 -1.84 -0.91 30.29 0.30 1.16 3.45
LC 10 G:C 15 -1.56 -5.19 38.79 -0.93 0.85 3.52
LC 11 C:G 14 2.00 2.15 38.46 0.40 0.61 3.45
LC 12 G:C 13

From these 12 rows, I want to extract every pair of consecutive rows that has C:G in the third field of the first row and G:C in the third field of the second row. Four such pairs appear in the lines above (1 and 2, 3 and 4, 9 and 10, and 11 and 12). Thus, my output should look like:

LC 1 C:G 24 -0.42 5.56 37.77 -0.19 0.54 3.48
LC 2 G:C 23 1.52 -5.21 35.68 0.65 0.54 3.43
LC 3 C:G 22 2.71 7.44 31.80 -0.01 0.95 3.44
LC 4 G:C 21 0.23 6.00 33.00 -0.23 0.10 3.40
LC 9 C:G 16 -1.84 -0.91 30.29 0.30 1.16 3.45
LC 10 G:C 15 -1.56 -5.19 38.79 -0.93 0.85 3.52
LC 11 C:G 14 2.00 2.15 38.46 0.40 0.61 3.45
LC 12 G:C 13

I have many such input files, so extracting the desired rows by hand is impractical.

colucix 04-09-2009 06:27 AM

You can try awk:
Code:

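$3 == "C:G" {
  first = $0                 # remember the C:G row
  if ((getline) <= 0) next   # no following row, so nothing to pair with
  if ($3 == "G:C") {         # the following row completes the pair
    print first
    print $0
  }
}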
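
One thing to watch for with getline: the row it reads is consumed, so if a C:G row is followed by another C:G row, that second row is never tested as the start of a pair. This does not happen in the sample above, but in case it does in other files, here is a rough sketch of a variant that remembers the previous row instead of reading ahead. The function name extract_pairs is only an illustration, not something from the original script.

```shell
#!/bin/sh
# Sketch: print each C:G row that is immediately followed by a G:C row,
# together with that G:C row. Every row is examined, so a run such as
# C:G, C:G, G:C still yields the last two rows.
# extract_pairs is a hypothetical helper name.
extract_pairs() {
  awk '
    FNR == 1 { prev = ""; prev3 = "" }                    # do not pair rows across files
    $3 == "G:C" && prev3 == "C:G" { print prev; print }   # pair completed
    { prev = $0; prev3 = $3 }                             # remember this row for the next one
  ' "$@"
}
```

With no arguments the function reads standard input; with file names it reads those files, so for many input files you could call it once per file in a loop and redirect each result to its own output file.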


aditi_borkar 04-09-2009 06:37 AM

Thanks a lot!

I had been trying something similar, but it did not work. This script does exactly what I need.

malekmustaq 04-09-2009 06:49 AM

aditi_borkar:

What is the type of input?

If you mean a text file containing those lines, you can extract the lines that match a pattern such as "C:G" and save them to a file. Use the terminal:

$ grep 'C:G' ./input.txt >./myGCfile
$ cat myGCfile

That puts all the matching lines into one file.

Try to explain your problem in more detail so that others in the forum understand what you need and can help you better than I have here.

Hope this helps. Good luck.

