12-27-2012, 03:19 AM
#1
Member
Registered: Dec 2012
Posts: 40
Need help to extract substrings from a text file, based on patterns.
Output of cat file.txt:
Code:
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797807950|mar0101|0|00000106829DAE7F3FAB187550B920530C00|0|0|4000018001000002||962797807950|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|1|||||||||||||0|0|||472|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|252|tid{111211344662580792}pfid{10}gob{1}rid{globitel} afid{}uid1{962797807950}aid1{1}ar1{100}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC RESERVE AMOUNT 10000}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{100}ctr{StaffLine}ftksn{JMT}ftksr{0001}ftktp{PayCall Ticket}||
Hi all, I have this file and I want to print the word "StaffLine", or whatever word appears inside the braces of "ctr{word}".
It is always ctr followed by braces containing some word, and I want to print that word using the substr function in awk.
In other words, I want to print the word between { and }.
"StaffLine" was just an example; it can be any word.
I tried:
awk '{comp[substr["ctr",0]{print}}'
Last edited by colucix; 12-27-2012 at 06:33 AM .
12-27-2012, 03:29 AM
#2
LQ Veteran
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Is that what you're after (in sed)?
Code:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file
12-27-2012, 03:37 AM
#3
Member
Registered: Dec 2012
Posts: 40
Original Poster
Thanks, sycamorex.
It's great code,
but I want to do it with the substr function.
12-27-2012, 03:47 AM
#4
LQ Veteran
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Quote:
Originally Posted by
eyadgh
Thanks, sycamorex.
It's great code,
but I want to do it with the substr function.
Are you sure you want to do it in awk? Awk predominantly works on columns and it's kind of hard to find a good column delimiter in your string.
12-27-2012, 03:51 AM
#5
Member
Registered: Dec 2012
Posts: 40
Original Poster
I'm not sure about awk, but I am sure I want to use the substr function.
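For readers who do want substr here: one possible approach, offered only as a sketch, is to let awk's match() locate the first "ctr{...}" on a line and then cut the word out with substr(). The sample records are shortened, hypothetical versions of the data in this thread, and the helper name extract_ctr is an assumption made up for illustration.

```shell
# Shortened, hypothetical sample records in the same shape as the thread's data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1|34|2012.12.01 00:08:35|tid{1}pid{1234}reqa{100}ctr{StaffLine}ftksn{JMT}||
1|34|2012.12.01 00:08:35|acf{3}ussd{1}ctr{ZainElKul }ftksn{JMT}||
1|34|2012.12.01 00:08:35|12|4|100-10-0
EOF

# extract_ctr is a hypothetical helper name.
# match() sets RSTART (start position) and RLENGTH (length) of the first
# "ctr{...}" on the line. substr() then skips the 4 characters of "ctr{"
# and leaves out the closing "}". Lines without a match print nothing.
extract_ctr() {
    awk 'match($0, /ctr[{][^}]*[}]/) {
        print substr($0, RSTART + 4, RLENGTH - 5)
    }' "$1"
}

extract_ctr "$sample"
```

The braces are written as bracket expressions ([{] and [}]) so that awk does not try to read them as a repetition count. Any blank inside the braces, as in "ctr{ZainElKul }", is kept as part of the printed word.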
12-27-2012, 05:03 AM
#6
LQ Veteran
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Quote:
Originally Posted by
eyadgh
OK, I'm so sorry.
OK, forget substr, but your command doesn't work if we have many "ctr{word}" entries, two or three.
Hint: we have many "ctr{word}" entries, and I want to count the identical words and print the count for each.
No problem. Next time it'll be easier for people to help you if you describe the whole situation/problem at the beginning.
1. Is it all in one long line or multiple lines?
2. If multiple lines, is there a specific number of instances of ctr{...} per line?
12-27-2012, 05:12 AM
#7
Member
Registered: Dec 2012
Posts: 40
Original Poster
My friend, the data looks like the sample below, and I want to count how many times each word occurs.
Thank you again for your patience.
Code:
962796057604|mar0101|0|00000107A20E00000A6C331650B920340C00|0|0|400019FD7DBFBF7F|1001|962796057604|0|01001|||-1|795971936|00962795971936|16||-1|00962795971936|-1|0|2|0|416019000659493|0||||||0|0|2012.12.01 00:07:09|12|30|0|516|16|1|2012.12.01 00:06:39|1|0||202|20001||0B12F1001104697209100300000000000000|1|1|11000|0|0||0881006972091003F000||0714F610045584E6|000000000000|3|1|0000000000000000|0|140|0|0|0|0|0|0|||0|2|||||||||||||||||||||0|||0||0|1|143|acf{0}cif{0}fcf{0}con{0}cuf{0}ctr{Mo7afazat }cgpa{962796057604}vlr{0096279001300}cff{0}roaf{0}mpty{0}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962796298894|mar0101|0|000001028225AE4AD868A8B750B900980C00|1|0|4000018001000002||962796298894|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||3797|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|244|tid{111210532409329884}pfid{20}gob{1}rid{globitel}afid{}uid1{962796298894}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{JaishanaIN }ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|100-50-0-962796605155|mar0101|0|00000102A20400000A6A439D50B920520C00|0|0|400019FD7DBFBF7F|1001|962796605155|16||||-1|b116c||16||-1||-1|0|0|0|416017002233360|0||||||0|0|1970.01.01 02:00:00|0|0|0|220|0|1|1970.01.01 02:00:00|1|0||194|0||000000000000000000000000000000000000|0|0||0|0||00000000000000000000||0000000000000000|000000000000|0|0|0000000000000000|0|370|0|0|0|0|0|0|||0|0|||||||||||||||||||||0|||0||0|1|70|acf{3}ussd{1}ctr{ZainElKul }ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|100-10-0
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797611253|mar0101|0|0000010282B54BD015FF4C4B50B8F96E0C00|1|0|4000018001000002||962797611253|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||885|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|243|tid{111220371293561120}pfid{20}gob{1}rid{globitel}afid{}uid1{962797611253}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{ZainElKul }ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
-962795292027|mar0101|0|00000101A20200000A6A96B750B920300C00|0|0|400019FD7DBFBF7F|1001|962795292027|0|01004|||-1|797196452|00962797196452|16||-1|00962797196452|-1|0|2|0|416018002276781|0||||||0|0|2012.12.01 00:07:09|12|12|23|516|16|1|2012.12.01 00:06:34|1|0||202|1||0B12F1001104697209100300000000000000|1|1|11000|0|0||0881006972091003F000||0714F6100455AD67|000000000000|3|1|0000000000000000|0|30|0|0|0|0|0|0|||0|0|||||||||||||||||||||0|||0||0|1|171|acf{0}cif{0}fcf{0}con{0}cuf{0}ctr{ZainUnlimited }cgpa{962795292027}vlr{0096279001300}cff{0}roaf{0}mpty{0}cacc{1;0;30}cquo{1;230;}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962796012818|mar0101|0|0000010882218115085D5F9150B920520C00|0|0|4000018001000002||962796012818|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|1|||||||||||||0|0|||70|0|0|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|258|tid{111221366974701289}pfid{17}gob{1}rid{globitel}afid{}uid1{962796012818}aid1{1}ar1{-2147483648}uid2{}aid2{-1}pid{DEFAULT_DECISION}pur{!GDRC Balance Check}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{AlBarakehNew }ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797251349|mar0101|0|0000010282A451483EDFCFD350B920400C00|1|0|4000018001000002||962797251349|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||440|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|245|tid{111211342745325133}pfid{20}gob{1}rid{globitel}afid{}uid1{962797251349}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{ZainElKulSN }ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-
Last edited by colucix; 12-27-2012 at 06:34 AM .
12-28-2012, 04:36 AM
#8
LQ Veteran
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
This is a quick, ugly, and not very concise way, but try this:
Code:
tr -d '[:cntrl:][0-9]| ' < file.txt | sed -n 's/ctr/\n/pg' | sed -n '/^{/s/{\(.[^}]*\).*/\1/p' | awk '{arr[$1]++} END {for(i in arr) print i,arr[i]}'
12-28-2012, 05:05 AM
#9
LQ Veteran
Registered: Sep 2003
Posts: 10,532
Also have a look at this:
Code:
$ awk '/ctr/ { sub(/.*ctr{/,"",$0) ; sub(/}.*/,"",$0) ; _[$0]++ }END{for (i in _) print i " : " _[i]}' infile
ZainUnlimited : 1
AlBarakehNew : 1
ZainElKulSN : 1
Mo7afazat : 1
ZainElKul : 2
JaishanaIN : 1
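One caveat about the awk one-liner above: because ".*ctr{" is greedy, it keeps only the last "ctr{...}" on each line, so a line holding two or three of them is counted once. The sketch below is one way to count every occurrence; it uses hypothetical shortened records and a made-up helper name, count_all_ctr, and it additionally trims blanks inside the braces, which the original command does not do.

```shell
# Hypothetical sample: the first record carries two ctr{...} tags.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a|b|ctr{ZainElKul }x{1}ctr{Mo7afazat }||
a|b|acf{3}ctr{ZainElKul }ftksn{JMT}||
a|b|12|4|100-10-0
EOF

# count_all_ctr is a hypothetical helper name.
count_all_ctr() {
    awk '{
        rest = $0
        # Consume the line one "ctr{...}" at a time.
        while (match(rest, /ctr[{][^}]*[}]/)) {
            word = substr(rest, RSTART + 4, RLENGTH - 5)
            gsub(/^[ \t]+|[ \t]+$/, "", word)    # trim surrounding blanks
            count[word]++
            rest = substr(rest, RSTART + RLENGTH) # continue after this match
        }
    }
    END { for (w in count) print w " : " count[w] }' "$1" | sort
}

count_all_ctr "$sample"
```

The final sort is only there to make the order stable, since "for (w in count)" visits keys in an unspecified order.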
12-28-2012, 05:09 AM
#10
LQ Veteran
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Quote:
Originally Posted by
druuna
Also have a look at this:
Code:
$ awk '/ctr/ { sub(/.*ctr{/,"",$0) ; sub(/}.*/,"",$0) ; _[$0]++ }END{for (i in _) print i " : " _[i]}' infile
ZainUnlimited : 1
AlBarakehNew : 1
ZainElKulSN : 1
Mo7afazat : 1
ZainElKul : 2
JaishanaIN : 1
I keep telling myself that I need to spend more time learning awk. Great stuff.
12-29-2012, 10:19 AM
#11
Member
Registered: Dec 2012
Posts: 40
Original Poster
Thanks, all.
I tried this code and it works well:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p'
But it does not give the number of times each word is repeated, for the words that occur several times and for the ones that occur once.
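If the sed command is to be kept, a common way to get counts is to pipe its output through sort and uniq -c. This is only a sketch: the sample records are hypothetical shortened lines, and count_ctr is a helper name invented for illustration.

```shell
# Hypothetical shortened records; the last line has no ctr{...} tag.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a|b|acf{0}ctr{Mo7afazat }cgpa{1}||
a|b|acf{3}ctr{ZainElKul }ftksn{JMT}||
a|b|reqa{0}ctr{ZainElKul }ftksn{JMT}||
a|b|12|4|100-10-0
EOF

# count_ctr is a hypothetical helper name.
# sed prints one word per matching line, sort groups identical words
# together, and uniq -c prefixes each distinct word with its count.
count_ctr() {
    sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' "$1" | sort | uniq -c
}

count_ctr "$sample"
```

uniq only merges adjacent duplicates, which is why the sort step comes first. Like the sed command on its own, this sees at most one "ctr{...}" per input line.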
12-29-2012, 10:26 AM
#12
LQ Veteran
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Quote:
Originally Posted by
eyadgh
Thanks, all.
I tried this code and it works well:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p'
But it does not give the number of times each word is repeated, for the words that occur several times and for the ones that occur once.
Have you tried Druuna's suggestion?
12-29-2012, 10:38 AM
#13
Member
Registered: Dec 2012
Posts: 40
Original Poster
Not yet, please wait.
12-30-2012, 12:55 AM
#14
Member
Registered: Dec 2012
Posts: 40
Original Poster
There are five errors in druuna's code.
12-30-2012, 02:26 AM
#15
LQ Veteran
Registered: Sep 2003
Posts: 10,532
Quote:
Originally Posted by
eyadgh
There are five errors in druuna's code.
It's always nice if the OP tells us in detail which errors s/he encountered!
Quote:
Originally Posted by eyadgh
I tried this code and it works well:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p'
But it does not give the number of times each word is repeated, for the words that occur several times and for the ones that occur once.
Using the above command:
Code:
$ sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' infile
Mo7afazat
JaishanaIN
ZainElKul
ZainElKul
ZainUnlimited
AlBarakehNew
ZainElKulSN
and using my command:
Code:
$ awk '/ctr/ { sub(/.*ctr{/,"",$0) ; sub(/}.*/,"",$0) ; _[$0]++ }END{for (i in _) print i " : " _[i]}' infile
ZainUnlimited : 1
AlBarakehNew : 1
ZainElKulSN : 1
Mo7afazat : 1
ZainElKul : 2
JaishanaIN : 1
Works like a charm with the example data given in post #7.
You either don't know how to use the command I gave or your description of the problem/sample data is not correct.
All times are GMT -5.