LinuxQuestions.org
Old 11-28-2012, 06:10 PM   #1
upendra_35
LQ Newbie
 
Registered: Oct 2012
Posts: 21

Rep: Reputation: Disabled
how to select the table based on regular expression


I have a big table (see below). I want to filter it so that only the rows whose third column ends in ".1" remain in the final table.

I tried to use grep ".1$", but somehow the second row gets included as well. Can someone help me with this?

Thanks
Upendra

PHP Code:
AT1G53670       gene:2024816    AT1G53670.1     located in      chloroplast stroma      GO:0009570
AT1G53670       gene:4515100791 AT1G53670.2     has     peptide-methionine (S)-S-oxide reductase activity       GO:0008113
AT1G53670       gene:2024816    AT1G53670.1     has protein modification of type        N-terminal protein myristoylation       GO:0006499
 
Old 11-28-2012, 10:17 PM   #2
towheedm
Member
 
Registered: Sep 2011
Location: Trinidad & Tobago
Distribution: Debian Jessie
Posts: 592

Rep: Reputation: 119
Firstly, you should use code tags and not PHP tags.

I'm not quite certain whether you would like to show the entire line that contains .1 in the third field or just the contents of the third field that contains .1

In either case, your regex ".1$" matches any line that ends (the $) with any character (the .) followed by a 1. Since the . is a regex metacharacter, it must be escaped if you want it to match a literal dot. So it's strange that grep would return any lines with that regex, since none of the sample lines ends in a 1.
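To make the difference concrete, here is a small sketch with three made-up lines (they are not from the table above) showing what the unescaped and the escaped dot each match:

```shell
# Hypothetical sample lines, not taken from the poster's table.
sample=$(printf '%s\n' 'id.1' 'id21' 'id.2')

# Unescaped dot: any character followed by a 1 at the end of the line.
unescaped=$(printf '%s\n' "$sample" | grep '.1$')

# Escaped dot: only a literal ".1" at the end of the line.
escaped=$(printf '%s\n' "$sample" | grep '\.1$')

printf 'unescaped:\n%s\n' "$unescaped"
printf 'escaped:\n%s\n' "$escaped"
```

The unescaped pattern also picks up "id21", because the 2 counts as "any character".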

To list any line that contains a .1 use:
Code:
grep "\.1" < /path/to/file
If you need to return just the third field, you will need to use sed or awk. Even a simple cut can work (assuming your field delimiter is a tab):
Code:
grep "\.1" < /path/to/table | cut -f3
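Note that "\.1" matches a .1 anywhere on the line, not only in the third field. If that matters, one possible sketch is to anchor the pattern on the tabs. This assumes the table really is tab-separated; the sample rows and the temporary file below are made up for illustration:

```shell
# Hypothetical tab-separated sample; the real table path is not known here.
table=$(mktemp)
printf 'AT1G53670\tgene:2024816\tAT1G53670.1\tlocated in\n' > "$table"
printf 'AT1G53670\tgene:4515100791\tAT1G53670.2\thas\n' >> "$table"
printf 'AT1G53670\tgene:20248.1\tAT1G53670.3\thas\n' >> "$table"   # ".1" only in field 2

# A literal tab character to build the pattern with.
tab=$(printf '\t')

# Whole lines whose third tab-separated field ends in ".1":
# skip two fields, then require ".1" right before the next tab or end of line.
matched=$(grep -E "^[^${tab}]*${tab}[^${tab}]*${tab}[^${tab}]*\.1(${tab}|\$)" "$table")

# Only the third field of those lines.
third=$(printf '%s\n' "$matched" | cut -f3)

printf '%s\n' "$matched"
printf '%s\n' "$third"
rm -f "$table"
```

The third sample row would be returned by a plain grep "\.1", but not by the anchored pattern.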
Hope it helps.
 
Old 11-28-2012, 10:34 PM   #3
upendra_35
LQ Newbie
 
Registered: Oct 2012
Posts: 21

Original Poster
Rep: Reputation: Disabled
Thanks.

I didn't realize that ".1$" looks for .1 at the end of the line, so that might be the reason I was getting the 2nd row with my grep pattern. Anyway, after a few minutes I figured out what I would like to use:
Code:
.1$/b/
which filters the table based on third column.

Thanks anyway for your help
 
Old 11-30-2012, 09:19 AM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
You should use grep when it can do the job, because it's lighter, but in general awk is the tool to use when working with columnized data.

Code:
awk '$3 ~ /[.]1$/ { print }'
If field 3 matches the given /regex/ pattern, then print the whole line.

Notice also, BTW, that in regex "." means "any character", and so it needs to be either escaped or bracketed to make it literal.

And actually, since the default action on a positive match is to print the line, the "{ print }" part can be left off in this particular case.
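As a rough sketch of both forms, here is the same test with the action left off, and again printing only field 3. The two rows are adapted from the first post, the tab separator is an assumption about the real file, and the temporary file is just for illustration:

```shell
# Sample rows adapted from the first post, assumed to be tab-separated.
table=$(mktemp)
printf 'AT1G53670\tgene:2024816\tAT1G53670.1\tlocated in\tchloroplast stroma\tGO:0009570\n' > "$table"
printf 'AT1G53670\tgene:4515100791\tAT1G53670.2\thas\tpeptide-methionine (S)-S-oxide reductase activity\tGO:0008113\n' >> "$table"

# -F'\t' splits fields on tabs only; with no action block, awk prints the matching line.
full=$(awk -F'\t' '$3 ~ /[.]1$/' "$table")

# With an explicit action, print just the third field of matching lines.
ids=$(awk -F'\t' '$3 ~ /[.]1$/ { print $3 }' "$table")

printf '%s\n' "$full"
printf '%s\n' "$ids"
rm -f "$table"
```

Setting -F'\t' is optional for this particular test, since the first two columns contain no spaces, but it keeps later fields such as "located in" from being split in two.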

Here are a few useful awk references:
http://www.grymoire.com/Unix/Awk.html
http://www.gnu.org/software/gawk/man...ode/index.html
http://www.pement.org/awk/awk1line.txt
http://www.catonmat.net/blog/awk-one...ined-part-one/
 