LinuxQuestions.org
Old 01-28-2013, 04:15 PM   #1
asrshell
LQ Newbie
 
Registered: Jan 2013
Distribution: ubuntu
Posts: 2

Rep: Reputation: Disabled
need a column starting from a specific pattern


Hi! Please have a look if anyone can help me (I'm a new bash learner). I have a file, let's call it 'test.txt', which contains the following (more precisely, it's an alignment file):

A-1 AAAAAAAKGAAKAAAAAAAAAAAAAAAA
A-2 ELEEEEEEEEEEEEEEEEESWEEEEEEEE
A-3 JJJLJJJJJJJJJJJJJJJWWJJJJJJJ

A-4 LLLHLIDDFRRRLLLLLLLLLLLGHLLLLLL
A-5 UUUGUUARRRHUUUUUUUUUUUJJUU
A-6 GFGFJYHFRRRGFRDCDAGGF.........

A-7 BBBBBBBBBBBAWBBBBBBBABBBSBBB
A-8 XXXXXFGXDXXXSXXXXXXXXXXXXXXX
A-9 ZZZDZZZZZZZZZZZZZZHZZZHZZZGZ

A-10 DDDDDDHDDDDIDDDIRRRDDDDDDKDDDD
A-11 QQQQHFQQQQQIQQQIRRRQQQQQQKQQQQ
A-12 IIIWWWWWWDDDIIIIIIIRRRIIIIIIKIIIILLL

Now, how can I get the 'n'th column counting from a specific pattern such as 'RRR' (as in the text file above), i.e. the 'n'th column before or after this pattern?

thanks in advance

Last edited by asrshell; 01-28-2013 at 04:24 PM. Reason: i didn't find the text as i typed
 
Old 01-28-2013, 04:52 PM   #2
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,790

Rep: Reputation: 653
Code:
egrep -o 'RRR[A-Z]{5}[A-Z]' test.txt | egrep -o '[A-Z]$'
The '5' is the number of columns between 'RRR' and the column you want
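The same pipeline can be wrapped so that the column number becomes a parameter. This is only a minimal sketch: the function name `nth_after` and the temporary sample file are made up for illustration, it uses `grep -E` (the current spelling of `egrep`), and it assumes the sequence consists of uppercase letters only, as in the sample.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): print the n-th character
# after each match of a pattern, assuming uppercase-only sequences.
nth_after() {
    local pattern=$1 n=$2 file=$3
    # Match the pattern followed by n letters, then keep only the last letter.
    grep -Eo "${pattern}[A-Z]{${n}}" "$file" | grep -Eo '[A-Z]$'
}

# Two lines copied from the sample in the first post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
A-4 LLLHLIDDFRRRLLLLLLLLLLLGHLLLLLL
A-5 UUUGUUARRRHUUUUUUUUUUUJJUU
EOF

# 6th column after RRR, i.e. the same as 'RRR[A-Z]{5}[A-Z]' above.
nth_after RRR 6 "$sample"
```

With n set to 6 this matches the same text as the command above ('RRR', five letters, then the wanted letter), so it should print L and U for the two sample lines.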
 
Old 01-28-2013, 09:24 PM   #3
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,800
Blog Entries: 4

Rep: Reputation: 286
In a more generalized way, an awk one-liner could do the job:
Code:
awk 'BEGIN{FS=" "}; /<search_pattern>/ {print $<column>}' test.txt
So let's say you want to print the 1st column of all lines containing the pattern "RRR"; then do:
Code:
awk 'BEGIN{FS=" "}; /RRR/ {print $1}' test.txt
Output:
Code:
A-4
A-5
A-6
A-10
A-11
A-12
 
Old 01-29-2013, 12:08 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
I'm curious about what the OP considers to be a column? (as is evident from the 2 very different solutions so far)
 
Old 01-29-2013, 10:41 AM   #5
asrshell
LQ Newbie
 
Registered: Jan 2013
Distribution: ubuntu
Posts: 2

Original Poster
Rep: Reputation: Disabled
print 'n'th column (before or after) starting from a pattern

Thanks to both of you, kbp and shivaa, for answering.

shivaa: your code isn't producing what I want. Please consider each character as a column (anyway, your code helped me solve some of my other problems).

kbp: cheers!! Your code works, but only 'after' the given pattern (i.e. it produces the desired column on the 'right' side of the pattern).


1. How can it work 'before' the pattern as well (i.e. on the 'left' side of the pattern)?
2. Is it possible to print the output with the line header (in that example, A-4, A-5, A-6 etc. are the line headers)?


(additional query to all)
Let's say I have a file called 'test.txt' with the contents below, where A-1, A-2, A-3 etc. are line headers. (There is always the same amount of space after each line header, and there is a blank line between paragraphs.)

A-1 AARAAAAAARAAAAAAAARAAAAAAZA
A-2 ARAAAAAAAKAAAAAAARARAAYAAAA
A-3 AARARAAYAKAARAAAAAAAAAAAAAA

A-1 ZAAAAAAAARRRAAAAAAAAAAAAAAA
A-2 AAYAAAAAARRRAAAAAAAAAAAAAAA
A-3 YAAAAAAAARRRAAAAAAAAAAAAAAA

A-1 AAZAAAAAAKAARARAAAQAAAAAARA
A-2 AAYRARARAKAAAAAAAAAAAAAAAQA
A-3 ARYAARAAAKAAAAAAAAQAAAAAARA

A-1 AAAZAAAAARRRAAAAAAAAAAAAAAA
A-2 AAYAAAAAARRRAAAAAAAAAAAAAAA
A-3 AAAAAAYAARRRAAAAAAAAAAAAAAA

3. Is it possible to get the 'n'th column, counted from a pattern (before or after it), in lines/paragraphs where the pattern is absent? Consider the example above.

How can I print the 'n'th column from paragraph 1 or 3, counting from the pattern 'RRR' (which is present in the 2nd and 4th paragraphs)? For example, the 18th column (ignoring line header and white space) counting to the left of the 2nd 'RRR' pattern, or counting to the right of the 1st 'RRR' pattern?

It would be nice if the output were printed with the corresponding line header. So, briefly, I would be happy to get output like this:

For first case
A-1 Q
A-2 A
A-3 Q

for 2nd case
A-1 Z
A-2 Y
A-3 Y
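
One possible reading of question 3 is that the paragraphs are consecutive blocks of the same alignment, so every A-1 line continues the previous A-1 line. Under that assumption (which the thread does not confirm), a sketch could join the blocks per header and then count from the k-th 'RRR' in the joined sequence. The function name `nth_across_blocks` and its arguments are hypothetical, and the pattern is treated as a fixed string, not a regex.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): join all blocks that share a
# header, then print the n-th character after (n > 0) or before (n < 0)
# the k-th occurrence of a fixed-string pattern.
nth_across_blocks() {
    local pattern=$1 k=$2 n=$3 file=$4
    awk -v pat="$pattern" -v k="$k" -v n="$n" '
        NF >= 2 {
            if (!($1 in seq)) order[++count] = $1   # remember first-seen order
            seq[$1] = seq[$1] $2                    # append this block
        }
        END {
            for (i = 1; i <= count; i++) {
                s = seq[order[i]]
                from = 1; start = 0; found = 0
                while (found < k) {                 # locate the k-th match
                    rel = index(substr(s, from), pat)
                    if (rel == 0) break
                    start = from + rel - 1
                    from = start + length(pat)
                    found++
                }
                if (found < k) continue             # fewer than k matches
                pos = (n > 0) ? start + length(pat) + n - 1 : start + n
                if (pos >= 1 && pos <= length(s))
                    print order[i], substr(s, pos, 1)
            }
        }' "$file"
}

# The four paragraphs from the post above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
A-1 AARAAAAAARAAAAAAAARAAAAAAZA
A-2 ARAAAAAAAKAAAAAAARARAAYAAAA
A-3 AARARAAYAKAARAAAAAAAAAAAAAA

A-1 ZAAAAAAAARRRAAAAAAAAAAAAAAA
A-2 AAYAAAAAARRRAAAAAAAAAAAAAAA
A-3 YAAAAAAAARRRAAAAAAAAAAAAAAA

A-1 AAZAAAAAAKAARARAAAQAAAAAARA
A-2 AAYRARARAKAAAAAAAAAAAAAAAQA
A-3 ARYAARAAAKAAAAAAAAQAAAAAARA

A-1 AAAZAAAAARRRAAAAAAAAAAAAAAA
A-2 AAYAAAAAARRRAAAAAAAAAAAAAAA
A-3 AAAAAAYAARRRAAAAAAAAAAAAAAA
EOF

nth_across_blocks RRR 2 -18 "$sample"   # 18th column left of the 2nd RRR
nth_across_blocks RRR 1 18 "$sample"    # 18th column right of the 1st RRR
```

On this sample the two calls are meant to reproduce the two desired outputs listed above. If the paragraphs are actually independent rather than continuations of each other, this approach does not apply.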
 
Old 01-29-2013, 01:43 PM   #6
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,800
Blog Entries: 4

Rep: Reputation: 286
Sorry if I misunderstood. Anyway, if you treat RRR as the field separator, then you can specify it like this:
Print after RRR:
Code:
~$ awk 'BEGIN{FS="RRR"}; NF>1 {print $2}' test.txt
Print before RRR (leaving headers):
Code:
~$ awk -F" " '{print $2}' <(awk 'BEGIN{FS="RRR"}; NF>1 {print $1}' test.txt)
Print only headers:
Code:
~$ awk -F" " '{print $1}' <(awk 'BEGIN{FS="RRR"}; NF>1 {print $1}' test.txt)
For more accurate answers, can you post the sample output you want in each case?
 
Old 01-29-2013, 07:04 PM   #7
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,790

Rep: Reputation: 653
Just move the pieces around and change the anchor in the second grep:
Code:
egrep -o '[A-Z]{6}RRR' test.txt | egrep -o '^[A-Z]'
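
For the other open question (printing the line header next to the result), a single awk call is one option. The sketch below is only an illustration under a few assumptions: each line is "header, whitespace, sequence", the pattern is a fixed string rather than a regex, and only its first occurrence per line matters. The function name `nth_with_header` is made up; a negative n counts to the left of the pattern and a positive n to the right.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): print the line header plus the
# n-th character after (n > 0) or before (n < 0) the first match of a
# fixed-string pattern in the second field.
nth_with_header() {
    local pattern=$1 n=$2 file=$3
    awk -v pat="$pattern" -v n="$n" '
        NF >= 2 && (start = index($2, pat)) {
            # Position of the wanted character inside the sequence field.
            pos = (n > 0) ? start + length(pat) + n - 1 : start + n
            if (pos >= 1 && pos <= length($2))
                print $1, substr($2, pos, 1)
        }' "$file"
}

# A few lines copied from the sample in the first post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
A-4 LLLHLIDDFRRRLLLLLLLLLLLGHLLLLLL
A-5 UUUGUUARRRHUUUUUUUUUUUJJUU

A-7 BBBBBBBBBBBAWBBBBBBBABBBSBBB
EOF

nth_with_header RRR 6 "$sample"    # 6th column to the right of RRR
nth_with_header RRR -6 "$sample"   # 6th column to the left of RRR
```

Lines without the pattern (A-7 here) and blank lines are skipped. The first call should print "A-4 L" and "A-5 U", the second "A-4 H" and "A-5 U".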
 
  


All times are GMT -5. The time now is 07:40 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration