LinuxQuestions.org
Old 04-11-2012, 03:15 AM   #1
keerthika
Print pattern-matching lines until the next occurrence of a character


Hi Unix genies,

I am in the process of automating some tasks.

I need to print the lines matching a pattern, up to and including the next occurrence of a semicolon.

For example, my test.txt contains:

out.clse_dt :: in.clse_dt;
out.acct_sts_cd :: if (is_defined(in1) && in0.acct_sts_cd != '"'"'R'"'"') '"'"'R'"'"'
else in0.acct_sts_cd;
out.clse_rsn_cd :: if (is_defined(in1) && in0.acct_sts_cd != '"'"'R'"'"') '"'"'P'"'"'
else in0.clse_rsn_cd;
out.clse_dt :: if (is_defined(in1) && in0.acct_sts_cd != '"'"'R'"'"') $PRD_END_DATE_0
else in0.clse_dt;




Note: the last "else in0.clse_dt;" is on a separate line.

I need the output below (my pattern is out.clse_dt):
out.clse_dt :: in.clse_dt;
out.clse_dt :: if (is_defined(in1) && in0.acct_sts_cd != '"'"'R'"'"') $PRD_END_DATE_0
else in0.clse_dt;



A normal grep, sed or awk with out.clse_dt as the pattern gives me only the lines that contain the match:
out.clse_dt :: in.clse_dt;
out.clse_dt :: if (is_defined(in1) && in0.acct_sts_cd != '"'"'R'"'"') $PRD_END_DATE_0


But I need everything up to the next ; even when the ; is not on the line that matches the pattern.

I hope my question is clear. Please help, as I am new to Unix.
 
Old 04-11-2012, 03:58 AM   #2
David the H.
Please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.

So to clarify, what you want is everything from the "out.clse_dt" pattern to the next semicolon.

That's a bit tricky to do with sed, but I think I've managed it.

Code:
sed -n '/^out[.]clse_dt/ { :a ; /[;]/! {N; ba} ; p }'
You need to set up a loop and some nested commands.

Start with -n to silence output, and tell it to match the pattern you want.

The first set of nested subcommands {..} processes that matched line. It sets up a loop marker :a, then checks the line to see if it has a semicolon.

If it doesn't find one, it runs the second nested section, which appends the Next line to the pattern buffer, then branches back to the :a marker.

If it does find one, it skips the loop and goes on to print the buffer contents.
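The loop described above may be easier to follow when it is spread over several lines. The sketch below is the same command with one sed command per line. The function name print_until_semicolon and the trimmed sample data (the quoted literals from test.txt are left out) are assumptions made for illustration only.

```shell
#!/bin/sh
# print_until_semicolon: hypothetical helper name, not from the thread.
# Reads stdin and prints each line that starts with out.clse_dt,
# plus any following lines up to the first one containing a semicolon.
print_until_semicolon() {
    sed -n '
/^out[.]clse_dt/ {
    # loop marker
    :a
    # no semicolon in the buffer yet: append the next line and loop
    /;/! {
        N
        ba
    }
    # semicolon found: print the whole buffer
    p
}'
}

# Simplified stand-in for test.txt (assumption: quoted literals removed)
sample='out.clse_dt :: in.clse_dt;
out.acct_sts_cd :: if (is_defined(in1)) 1
else in0.acct_sts_cd;
out.clse_dt :: if (is_defined(in1)) $PRD_END_DATE_0
else in0.clse_dt;'

printf '%s\n' "$sample" | print_until_semicolon
# out.clse_dt :: in.clse_dt;
# out.clse_dt :: if (is_defined(in1)) $PRD_END_DATE_0
# else in0.clse_dt;
```

Writing the script over several lines also avoids relying on how a particular sed treats a semicolon after a label.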


I'm sure the basic logic can be extended to awk or other languages as well.


Edit: I just realized that this prints whole lines only, so it assumes the semicolon is at the end of a line. If one (or several) could occur in the middle of a line, you'll also have to add an expression to strip off the text after the first semicolon.

Code:
sed -n '/^out[.]clse_dt/ { :a ; /[;]/! {N; ba} ; s/[;].*$/;/p }'
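As a quick check of the trimming step, here is a small sketch with a made-up input line that carries a second statement after the first semicolon. The function name trim_after_semicolon and the input are assumptions for the example, not data from the thread.

```shell
#!/bin/sh
# trim_after_semicolon: hypothetical helper name.
# Same loop as before, but the final s///p keeps only the text
# up to and including the first semicolon in the buffer.
trim_after_semicolon() {
    sed -n '
/^out[.]clse_dt/ {
    :a
    /;/! {
        N
        ba
    }
    s/;.*$/;/p
}'
}

# Made-up line with a second statement after the first semicolon
printf '%s\n' 'out.clse_dt :: in.clse_dt; out.x :: in.x;' | trim_after_semicolon
# out.clse_dt :: in.clse_dt;
```

Note that anything after the first semicolon is dropped, including a second out.clse_dt statement that starts on the same line.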

Last edited by David the H.; 04-11-2012 at 04:10 AM. Reason: as stated
 
Old 04-11-2012, 05:17 AM   #3
grail
Or some awk:
Code:
awk 'BEGIN{RS=";\n";ORS=""}/out.clse_dt/{print $0 RT}' file
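The one-liner above appears to rely on GNU awk: RT is a gawk extension, and POSIX leaves a multi-character RS unspecified. Where gawk is not available, a line-by-line sketch along these lines should give similar results. It assumes the statement name starts the line; print_stmt and the trimmed sample data are made up for the example.

```shell
#!/bin/sh
# print_stmt: hypothetical helper name, not from the thread.
# Prints every statement that begins with out.clse_dt, from its
# first line through the first line containing a semicolon.
print_stmt() {
    awk '
    # start collecting when a line begins a matching statement
    !collecting && /^out[.]clse_dt/ { collecting = 1 }
    collecting {
        print
        # stop once the statement terminator has been seen
        if (index($0, ";")) collecting = 0
    }'
}

# Simplified stand-in for test.txt (assumption: quoted literals removed)
sample='out.clse_dt :: in.clse_dt;
out.acct_sts_cd :: if (is_defined(in1)) 1
else in0.acct_sts_cd;
out.clse_dt :: if (is_defined(in1)) $PRD_END_DATE_0
else in0.clse_dt;'

printf '%s\n' "$sample" | print_stmt
# out.clse_dt :: in.clse_dt;
# out.clse_dt :: if (is_defined(in1)) $PRD_END_DATE_0
# else in0.clse_dt;
```

Unlike the record-based version, this matches only at the start of a line and prints whole lines, so text after a mid-line semicolon is kept.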
 
Old 04-11-2012, 06:12 AM   #4
keerthika

Thanks a ton, grail, it worked. I used ; as ORS because I need the semicolons in the output. Now I want only the lines that match the pattern exactly.

That is, I want lines like out.clse_dt::in.clse_dt; and not out_clse_dt_cd:in.clse_dt;

In short, I want something like grep -w, so that only the exact pattern matches.

What needs to change in the code below?

awk 'BEGIN{RS=";\n";ORS=";"}/out.clse_dt/{print $0 RT}' test.txt
 
Old 04-11-2012, 06:34 AM   #5
pan64
Code:
awk -v search_pattern='some search string' 'BEGIN {RS=";\n";ORS=";"} match($0, search_pattern) {print $0 RT}' test.txt
or you can write a simple script:
Code:
#!/bin/bash
awk -v search_pattern="$1" 'BEGIN {RS=";\n";ORS=";"} match($0, search_pattern) {print $0 RT}' test.txt
and it will work like the grep command:
script_name search_pattern
 
Old 04-11-2012, 06:49 AM   #6
keerthika
Sorry, when I use
awk -v search_pattern='out.clse_dt' 'BEGIN {RS=";\n";ORS=";"} match($0, search_pattern) {print $0 RT}' test.txt

I still get
out.clse_dt_cd :: in.clse_dt;
out.clse_dt :: if (is_defined(in1) && in0.acct_sts_cd != '"'"'R'"'"') $PRD_END_DATE_0
else in0.clse_dt;

I want only lines that match out.clse_dt exactly, not ****out.clse_dt****, where * stands for any character.
 
Old 04-11-2012, 06:54 AM   #7
pan64
Modify the search pattern. For example, use:
out.clse_dt :
In general, please read the awk man page on how to write a search expression.

Last edited by pan64; 04-11-2012 at 06:57 AM. Reason: mistype
 
Old 04-11-2012, 06:58 AM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958Reputation: 1958
So, just extend your regex pattern to check the next character as well.

Code:
/out[.]clse_dt[^_]/	#to exclude underscore

/out[.]clse_dt[ :]/	#to include a space or colon
Which one you use would depend on if it's easier to exclude the characters you don't want, or include the ones that you do.
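A small sketch to compare the two patterns on the names from this thread. The two-line input is made up for the test, and plain awk is used only as a convenient regex matcher.

```shell
#!/bin/sh
# Two made-up records: the wanted name and the longer one to reject
names='out.clse_dt :: in.clse_dt;
out.clse_dt_cd :: in.clse_dt;'

# Exclude an underscore directly after the name
printf '%s\n' "$names" | awk '/out[.]clse_dt[^_]/'
# out.clse_dt :: in.clse_dt;

# Require a space or colon directly after the name
printf '%s\n' "$names" | awk '/out[.]clse_dt[ :]/'
# out.clse_dt :: in.clse_dt;
```

Both forms need some character after the name, so a line that ends exactly at out.clse_dt would not match either of them.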

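The \> word-boundary anchor (or \b in GNU sed and grep, \y in gawk) is another option. Since a "word" is defined in regex as [:alnum:] plus underscore, out[.]clse_dt\> does not match inside out.clse_dt_cd. It would not exclude a name like out.clse_dt-cd, though.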

PS: Don't forget that "." is special in regex, so you need to escape it or put it in a bracket expression, as I did above.

PPS: Don't forget to use [code][/code] tags around your code and data, as I asked you to do earlier.

Last edited by David the H.; 04-11-2012 at 07:03 AM.
 