LinuxQuestions.org
Old 10-26-2012, 12:23 PM   #1
upendra_35
LQ Newbie
 
Registered: Oct 2012
Posts: 21

Rep: Reputation: Disabled
filter table


Can someone tell me how to filter the table below the way I want?

Here is my table:

PHP Code:
BR_598    comp284262_c0_seq1
BR_644    TCONS_00025984
BR_644    TCONS_00025984
BR_644    TCONS_00007333
BR_644    TCONS_00007334
BR_644    TCONS_00007334
BR_734    TCONS_00073491
BR_756    comp262969_c0_seq6
BR_756    comp262969_c0_seq6
BR_771    comp265886_c0_seq4
BR_771    comp265886_c0_seq4
BR_771    TCONS_00062419
BR_771    TCONS_00062419
BR_771    TCONS_00062419
BR_931    TCONS_00052085
BR_976    TCONS_00022581
BR_993    comp237630_c0_seq4
BR_1032    TCONS_00032494
BR_1032    TCONS_00032494
BR_1032    TCONS_00032494
BR_1032    TCONS_00032494
BR_1032    TCONS_00032496
BR_1032    TCONS_00032496
BR_1108    TCONS_00068443
BR_1109    TCONS_00068443
BR_1110    TCONS_00053482
BR_1110    TCONS_00053482
BR_1110    TCONS_00053481
BR_1110    TCONS_00053481
BR_1110    TCONS_00060345
BR_1110    TCONS_00026301
BR_1146    TCONS_00026075
BR_1146    TCONS_00074006
BR_1163    comp274327_c0_seq1
BR_1163    comp274327_c0_seq1 
So I only want those genes in column 1 that have hits in two different databases. For example, from the above table all I want is BR_771, because it hit both databases:

PHP Code:
BR_771    comp265886_c0_seq4
BR_771    TCONS_00062419 
Thanks
 
Old 10-26-2012, 03:03 PM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,354
Blog Entries: 55

Rep: Reputation: 3541
Quote:
Originally Posted by upendra_35 View Post
(..) i want (..) I only want (..) all i want
What you want is 'man grep' wrt 'grep BR_771 /path/to/file' or 'someoutput | grep BR_771'.
 
Old 10-26-2012, 03:16 PM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
You don't specify the language; anyway, here is an awk solution:
Code:
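awk '{
  # keep the last TCONS hit and the last non-TCONS hit for each gene (column 1)
  $2 ~ /TCONS/ ? (_[$1] = $2) : (__[$1] = $2)
}
END {
  # print only the genes that have a hit in both arrays
  for ( i in _ )
    if ( i in __ ) {
      print i, _[i]
      print i, __[i]
    }
}' file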
Please note that _ and __ are simply array names (you can choose a and b or anything else you like). It is not clear, though, whether a gene can match more than two databases and whether you want to print all the matches in that case. The suggested code works only for two database names, as in your example, and for each gene it keeps only the last hit seen per database. Hope this helps.
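If a gene can have several distinct hits in the same database (BR_1110 in the sample has more than one TCONS ID), a variant along these lines might help. It is only a sketch under two assumptions taken from the example: any ID containing TCONS belongs to one database and every other ID belongs to the second one. The function name both_db_hits and the array names are made up for illustration.

```shell
#!/bin/sh
# Sketch: print every distinct (gene, hit) pair for genes that hit both databases.
# Assumption: IDs containing "TCONS" are one database, all other IDs are the second.
both_db_hits() {
  awk '
    !seen[$1, $2]++ {                  # first time this gene/hit pair shows up
      order[++n] = $1 SUBSEP $2        # remember the pairs in input order
      if ($2 ~ /TCONS/) tcons[$1] = 1  # gene has at least one TCONS hit
      else other[$1] = 1               # gene has at least one non-TCONS hit
    }
    END {
      for (k = 1; k <= n; k++) {
        split(order[k], f, SUBSEP)     # f[1] = gene, f[2] = hit
        if ((f[1] in tcons) && (f[1] in other))
          print f[1], f[2]
      }
    }
  ' "$@"
}
```

Once the function is defined, run it as both_db_hits file (or pipe data into it). Unlike the two-array version, it prints the rows in input order and does not drop earlier hits for a gene.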
 
Old 10-26-2012, 03:21 PM   #4
Heraton
Member
 
Registered: Apr 2011
Location: Germany
Distribution: Mint 10, openSuSE
Posts: 58

Rep: Reputation: 3
have a look at uniq too

Hello!

To get rid of all those duplicates you might want to try something like this:
Code:
uniq databasefile
This will make your work less painful.
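One caveat: uniq only collapses identical lines that are next to each other. That is fine for the sample table, where the repeats are adjacent, but if the same row can reappear further down, something like the following sketch may be safer. The function name dedupe_rows is made up for illustration.

```shell
#!/bin/sh
# Sketch: drop repeated rows even when they are not adjacent,
# keeping the first occurrence and the original row order.
dedupe_rows() {
  awk '!seen[$0]++' "$@"   # print a line only the first time it is seen
}
```

Run it as dedupe_rows databasefile. If the row order does not matter, sort -u databasefile gives the same set of rows.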

Regards, Heraton

edit: Well, too late once again...

Last edited by Heraton; 10-26-2012 at 03:23 PM. Reason: too late
 
Old 10-26-2012, 06:46 PM   #5
upendra_35
LQ Newbie
 
Registered: Oct 2012
Posts: 21

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by colucix View Post
You don't specify the language, anyway here is an awk solution: (..)
Hi colucix, thanks for the script... There is a typo in your script, but apart from that everything was perfect. Here is the modified script:

PHP Code:
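#! /bin/sh
# first argument: the table to filter
file=$1

# the awk program must start on the same line as the awk command
awk '{
  $2 ~ /TCONS/ ? _[$1] = $2 : __[$1] = $2
}
END {
  for ( i in _ )
    if ( i in __ ) {
      print i, _[i]
      print i, __[i]
    }
}' "$file"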
PHP Code:
Usage: sh union.awk test_awk
Sorry, I am not too familiar with awk, so let me know if I made any mistakes in there (it worked OK, though).
Thanks again man!
 
  

