LinuxQuestions.org
Old 12-27-2012, 03:19 AM   #1
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Rep: Reputation: Disabled
Need help to extract substrings from a text file, based on patterns.


cat file.txt:

Code:
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797807950|mar0101|0|00000106829DAE7F3FAB187550B920530C00|0|0|4000018001000002||962797807950|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|1|||||||||||||0|0|||472|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|252|tid{111211344662580792}pfid{10}gob{1}rid{globitel} afid{}uid1{962797807950}aid1{1}ar1{100}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC RESERVE AMOUNT 10000}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{100}ctr{StaffLine}ftksn{JMT}ftksr{0001}ftktp{PayCall Ticket}||


Hi all, I have this file and I want to print the word "StaffLine", or whatever word appears between the braces in "ctr{word}".
It is always ctr, then a brace, then some word. I want to print that word by using the substr function in awk.

I want to print the word between { }.
"StaffLine" was just an example;
it could be any word.


I tried:
awk '{comp[substr["ctr",0]{print}}'

Last edited by colucix; 12-27-2012 at 06:33 AM.
 
Old 12-27-2012, 03:29 AM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,563
Blog Entries: 1

Rep: Reputation: 1024
Is that what you're after (in sed)?

Code:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file
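For anyone reading along, here is a rough breakdown of what that substitution does, as a minimal sketch. The function name extract_with_sed and the two shortened sample records are made up for illustration. Note that this variant captures [^}]* rather than the .[^}]* used above, so an empty ctr{} would yield an empty line.

```shell
# Shortened, hypothetical records in the thread's format (the second has no ctr{...} tag).
sample=$(mktemp)
printf '%s\n' \
  '1|34|reqa{100}ctr{StaffLine}ftksn{JMT}||' \
  '1|34|2012.12.01 00:08:35|12|4|100-10-0' > "$sample"

# extract_with_sed FILE: print the word inside the last ctr{...} of each line.
extract_with_sed() {
  # -n          suppress the default output; the trailing p prints only
  #             lines where the substitution matched
  # .*ctr{      everything up to the last "ctr{" (the leading .* is greedy)
  # \([^}]*\)   capture the characters before the next "}"
  # .*          the rest of the line, which is discarded
  sed -n 's/.*ctr{\([^}]*\).*/\1/p' "$1"
}

extract_with_sed "$sample"
```

Lines without a ctr{...} tag produce no output, because nothing is printed unless the substitution succeeds.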
 
Old 12-27-2012, 03:37 AM   #3
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Original Poster
Rep: Reputation: Disabled
sycamorex, thanks.
It's great code,
but I want to do it with the substr function.
 
Old 12-27-2012, 03:47 AM   #4
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,563
Blog Entries: 1

Rep: Reputation: 1024
Quote:
Originally Posted by eyadgh View Post
sycamorex, thanks.
It's great code,
but I want to do it with the substr function.
Are you sure you want to do it in awk? Awk predominantly works on columns and it's kind of hard to find a good column delimiter in your string.
 
Old 12-27-2012, 03:51 AM   #5
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Original Poster
Rep: Reputation: Disabled
I'm not sure, but I am sure I want to use the "substr" function.
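For readers who do want substr, one possible approach is to let awk's match() find the tag and then cut the word out with substr(). This is only a sketch under assumptions: the function name extract_ctr and the shortened sample records are hypothetical, and it handles just the first ctr{...} on each line.

```shell
# Hypothetical sample: two shortened records in the thread's format.
sample=$(mktemp)
printf '%s\n' \
  '1|34|reqa{100}ctr{StaffLine}ftksn{JMT}||' \
  '1|34|acf{3}ussd{1}ctr{ZainElKul}ftksn{JMT}||' > "$sample"

# extract_ctr FILE: print the text inside the first ctr{...} on each line.
extract_ctr() {
  awk '
    # match() sets RSTART and RLENGTH for the leftmost "ctr{...}" in the line
    match($0, /ctr[{][^}]*[}]/) {
      # skip the 4 characters of "ctr{" and leave out the closing "}"
      print substr($0, RSTART + 4, RLENGTH - 5)
    }
  ' "$1"
}

extract_ctr "$sample"
```

The braces are written as [{] and [}] so that no awk implementation tries to read them as an interval expression.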
 
Old 12-27-2012, 05:03 AM   #6
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,563
Blog Entries: 1

Rep: Reputation: 1024
Quote:
Originally Posted by eyadgh View Post
OK, I'm so sorry.
OK, forget substr, but yours doesn't work if we have many "ctr{word}", two or three.
Hint: we have many "ctr{word}" and I want to count identical words and print the count for each.
No problem. Next time it'll be easier for people to help you if you describe the whole situation/problem at the beginning.

1. Is it all in one long line or multiple lines?
2. If multiple lines, is there a specific number of instances of ctr{...} per line?
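Whichever the answer turns out to be, a loop over match() can cope with any number of ctr{...} tags per line. The following is a minimal sketch, not a definitive solution: the function name count_ctr is made up, and the sample (including a line with two tags) is hypothetical data modelled on the thread's format.

```shell
# Hypothetical sample: one line with two tags, one with a single tag, one truncated.
sample=$(mktemp)
printf '%s\n' \
  'ctr{ZainElKul}x|ctr{Mo7afazat}||' \
  'acf{3}ctr{ZainElKul}ftksn{JMT}||' \
  '1|34|2012.12.01 00:08:35|12|4|100-10-0' > "$sample"

# count_ctr FILE: count every ctr{...} word, however many appear on a line.
count_ctr() {
  awk '
    {
      rest = $0
      # keep matching until no "ctr{...}" is left in the remainder of the line
      while (match(rest, /ctr[{][^}]*[}]/)) {
        count[substr(rest, RSTART + 4, RLENGTH - 5)]++
        # continue searching after the tag that was just counted
        rest = substr(rest, RSTART + RLENGTH)
      }
    }
    END { for (word in count) print word, count[word] }
  ' "$1" | LC_ALL=C sort
}

count_ctr "$sample"
```

The final sort is there only because awk does not guarantee any particular order for "for (word in count)".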
 
Old 12-27-2012, 05:12 AM   #7
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Original Poster
Rep: Reputation: Disabled
My friend, like this:
and count identical words?
Thank you again for your patience.


Code:
962796057604|mar0101|0|00000107A20E00000A6C331650B920340C00|0|0|400019FD7DBFBF7F|1001|962796057604|0|01001|||-1|795971936|00962795971936|16||-1|00962795971936|-1|0|2|0|416019000659493|0||||||0|0|2012.12.01 00:07:09|12|30|0|516|16|1|2012.12.01 00:06:39|1|0||202|20001||0B12F1001104697209100300000000000000|1|1|11000|0|0||0881006972091003F000||0714F610045584E6|000000000000|3|1|0000000000000000|0|140|0|0|0|0|0|0|||0|2|||||||||||||||||||||0|||0||0|1|143|acf{0}cif{0}fcf{0}con{0}cuf{0}ctr{Mo7afazat}cgpa{962796057604}vlr{0096279001300}cff{0}roaf{0}mpty{0}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962796298894|mar0101|0|000001028225AE4AD868A8B750B900980C00|1|0|4000018001000002||962796298894|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||3797|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|244|tid{111210532409329884}pfid{20}gob{1}rid{globitel}afid{}uid1{962796298894}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{JaishanaIN}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|100-50-0-962796605155|mar0101|0|00000102A20400000A6A439D50B920520C00|0|0|400019FD7DBFBF7F|1001|962796605155|16||||-1|b116c||16||-1||-1|0|0|0|416017002233360|0||||||0|0|1970.01.01 02:00:00|0|0|0|220|0|1|1970.01.01 02:00:00|1|0||194|0||000000000000000000000000000000000000|0|0||0|0||00000000000000000000||0000000000000000|000000000000|0|0|0000000000000000|0|370|0|0|0|0|0|0|||0|0|||||||||||||||||||||0|||0||0|1|70|acf{3}ussd{1}ctr{ZainElKul}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|100-10-0
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797611253|mar0101|0|0000010282B54BD015FF4C4B50B8F96E0C00|1|0|4000018001000002||962797611253|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||885|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|243|tid{111220371293561120}pfid{20}gob{1}rid{globitel}afid{}uid1{962797611253}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{ZainElKul}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||

-962795292027|mar0101|0|00000101A20200000A6A96B750B920300C00|0|0|400019FD7DBFBF7F|1001|962795292027|0|01004|||-1|797196452|00962797196452|16||-1|00962797196452|-1|0|2|0|416018002276781|0||||||0|0|2012.12.01 00:07:09|12|12|23|516|16|1|2012.12.01 00:06:34|1|0||202|1||0B12F1001104697209100300000000000000|1|1|11000|0|0||0881006972091003F000||0714F6100455AD67|000000000000|3|1|0000000000000000|0|30|0|0|0|0|0|0|||0|0|||||||||||||||||||||0|||0||0|1|171|acf{0}cif{0}fcf{0}con{0}cuf{0}ctr{ZainUnlimited}cgpa{962795292027}vlr{0096279001300}cff{0}roaf{0}mpty{0}cacc{1;0;30}cquo{1;230;}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962796012818|mar0101|0|0000010882218115085D5F9150B920520C00|0|0|4000018001000002||962796012818|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|1|||||||||||||0|0|||70|0|0|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|258|tid{111221366974701289}pfid{17}gob{1}rid{globitel}afid{}uid1{962796012818}aid1{1}ar1{-2147483648}uid2{}aid2{-1}pid{DEFAULT_DECISION}pur{!GDRC Balance Check}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{AlBarakehNew}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797251349|mar0101|0|0000010282A451483EDFCFD350B920400C00|1|0|4000018001000002||962797251349|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||440|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|245|tid{111211342745325133}pfid{20}gob{1}rid{globitel}afid{}uid1{962797251349}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{ZainElKulSN}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-

Last edited by colucix; 12-27-2012 at 06:34 AM.
 
Old 12-28-2012, 04:36 AM   #8
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,563
Blog Entries: 1

Rep: Reputation: 1024
This is a quick, ugly, and not very concise way, but try this:

Code:
tr -d '[:cntrl:][0-9]| ' < file.txt | sed -n 's/ctr/\n/pg' | sed -n '/^{/s/{\(.[^}]*\).*/\1/p' | awk '{arr[$1]++} END {for(i in arr) print i,arr[i]}'
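A shorter pipeline along the same lines might look like the sketch below. It assumes a grep that supports -o (GNU and BSD grep do, but the option is not required by POSIX) and that the words contain no spaces. The function name count_with_grep and the sample records are hypothetical.

```shell
# Hypothetical sample: shortened records in the thread's format.
sample=$(mktemp)
printf '%s\n' \
  'cuf{0}ctr{Mo7afazat}cgpa{962796057604}||' \
  'ussd{1}ctr{ZainElKul}ftksn{JMT}||' \
  'reqa{0}ctr{ZainElKul}ftksn{JMT}||' > "$sample"

# count_with_grep FILE: print each ctr{...} word followed by its count.
count_with_grep() {
  grep -o 'ctr{[^}]*}' "$1" |     # one "ctr{word}" per output line
    sed 's/^ctr{//; s/}$//' |     # strip the wrapper, leaving the word
    LC_ALL=C sort | uniq -c |     # count identical words
    awk '{ print $2, $1 }'        # reorder to "word count"
}

count_with_grep "$sample"
```

Because grep -o prints every match separately, this also counts lines that carry more than one ctr{...} tag.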
 
Old 12-28-2012, 05:05 AM   #9
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2373
Also have a look at this:
Code:
$ awk '/ctr/ { sub(/.*ctr{/,"",$0) ; sub(/}.*/,"",$0) ; _[$0]++ }END{for (i in _) print i " : " _[i]}' infile 
ZainUnlimited : 1
AlBarakehNew : 1
ZainElKulSN : 1
Mo7afazat : 1
ZainElKul : 2
JaishanaIN : 1
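For readers new to awk, the same idea can be spelled out with comments, as in this sketch. It is a rewrite for illustration, not the command above verbatim: the function name count_last_ctr and the array name seen are made up, the braces are written as [{] and [}], the work is done on a copy of the line instead of $0, and the result is piped through sort. The sample records are shortened from post #7.

```shell
# Sample records shortened from post #7 (one of them is truncated and has no tag).
sample=$(mktemp)
printf '%s\n' \
  'cuf{0}ctr{Mo7afazat}cgpa{962796057604}||' \
  'reqa{0}ctr{JaishanaIN}ftksn{JMT}||' \
  'ussd{1}ctr{ZainElKul}ftksn{JMT}||' \
  '1|34|2012.12.01 00:08:35|12|4|100-10-0' \
  'reqa{0}ctr{ZainElKul}ftksn{JMT}||' > "$sample"

# count_last_ctr FILE: count the word in the last ctr{...} of each line.
count_last_ctr() {
  awk '
    /ctr[{]/ {
      word = $0
      sub(/.*ctr[{]/, "", word)   # greedy: removes everything up to the last "ctr{"
      sub(/[}].*/, "", word)      # removes the first "}" and everything after it
      seen[word]++                # one counter per distinct word
    }
    END { for (w in seen) print w " : " seen[w] }
  ' "$1" | LC_ALL=C sort
}

count_last_ctr "$sample"
```

The order in which awk walks an array with "for (w in seen)" is unspecified, which is why the unsorted listing above is not in input order. Like the original, this counts at most one word per line.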
 
Old 12-28-2012, 05:09 AM   #10
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,563
Blog Entries: 1

Rep: Reputation: 1024
Quote:
Originally Posted by druuna View Post
Also have a look at this:
Code:
$ awk '/ctr/ { sub(/.*ctr{/,"",$0) ; sub(/}.*/,"",$0) ; _[$0]++ }END{for (i in _) print i " : " _[i]}' infile 
ZainUnlimited : 1
AlBarakehNew : 1
ZainElKulSN : 1
Mo7afazat : 1
ZainElKul : 2
JaishanaIN : 1
I keep telling myself that I need to spend more time learning awk. Great stuff.
 
Old 12-29-2012, 10:19 AM   #11
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Original Poster
Rep: Reputation: Disabled
Thanks, all.
I tried this code and it works well:

sed -n 's/.*ctr{\(.[^}]*\).*/\1/p'

But it does not give the count for each word, whether the word is repeated or not.
 
Old 12-29-2012, 10:26 AM   #12
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,563
Blog Entries: 1

Rep: Reputation: 1024
Quote:
Originally Posted by eyadgh View Post
Thanks, all.
I tried this code and it works well:

sed -n 's/.*ctr{\(.[^}]*\).*/\1/p'

But it does not give the count for each word, whether the word is repeated or not.
Have you tried Druuna's suggestion?
 
Old 12-29-2012, 10:38 AM   #13
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Original Poster
Rep: Reputation: Disabled
Not yet, wait.
 
Old 12-30-2012, 12:55 AM   #14
eyadgh
Member
 
Registered: Dec 2012
Posts: 40

Original Poster
Rep: Reputation: Disabled
There are five errors in druuna's code.
 
Old 12-30-2012, 02:26 AM   #15
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2373
Quote:
Originally Posted by eyadgh View Post
There are five errors in druuna's code.
It's always nice if the OP tells us in detail which errors s/he encountered!

Quote:
Originally Posted by eyadgh
I tried this code and it works well:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p'

But it does not give the count for each word, whether the word is repeated or not.
Using the above command:
Code:
$ sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' infile
Mo7afazat
JaishanaIN
ZainElKul
ZainElKul
ZainUnlimited
AlBarakehNew
ZainElKulSN
and using my command:
Code:
$ awk '/ctr/ { sub(/.*ctr{/,"",$0) ; sub(/}.*/,"",$0) ; _[$0]++ }END{for (i in _) print i " : " _[i]}' infile 
ZainUnlimited : 1
AlBarakehNew : 1
ZainElKulSN : 1
Mo7afazat : 1
ZainElKul : 2
JaishanaIN : 1
Works like a charm with the example data given in post #7.

You either don't know how to use the command I gave or your description of the problem/sample data is not correct.
 
  

