Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place! |
04-27-2012, 03:08 AM
|
#1
|
LQ Newbie
Registered: Apr 2012
Posts: 2
Rep: 
|
Grep Command with Unique Records
Hi,
I am facing a problem.
I have some sample output from a file, myfile.
Sample:
Apr 27 02:02:36 MEDIA 14 LINK 3 : AVAILABLE
Apr 27 02:02:36 MEDIA 14 LINK 7 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 0 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 1 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 2 : AVAILABLE
Apr 27 02:02:37 MEDIA 12 LINK 4 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 5 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 6 : AVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 0 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 03 LINK 1 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 2 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 3 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 11 LINK 4 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 5 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 6 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 7 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 14 LINK 0 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 14 LINK 2 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 07 LINK 5 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 14 LINK 6 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 14 LINK 7 : UNAVAILABLE
Apr 27 04:43:23 MEDIA 13 : AVAILABLE
The output I want is:
MEDIA 14
MEDIA 12
MEDIA 13
MEDIA 03
MEDIA 11
MEDIA 07
i.e. the unique MEDIA numbers.
Please help.
|
|
|
04-27-2012, 03:19 AM
|
#2
|
LQ Veteran
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,446
|
Try something with associative arrays - awk maybe.
We're here to help, not write it for you. Show us what you try and what difficulties stump you.
|
|
|
04-27-2012, 03:22 AM
|
#3
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 24,671
|
awk ' { print $4 " " $5 } ' filename | sort -u
yes, it will also sort, but I do not think it is a problem for you
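As a rough sketch of what that pipeline does, here it is run on a few lines taken from the sample in the first post. The variable name `sample` is only for illustration; the real data would come from the file.

```shell
# A handful of lines copied from the first post (illustrative subset)
sample=$(cat <<'EOF'
Apr 27 02:02:36 MEDIA 14 LINK 3 : AVAILABLE
Apr 27 02:02:37 MEDIA 12 LINK 4 : AVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 0 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 14 LINK 0 : UNAVAILABLE
Apr 27 04:43:23 MEDIA 13 : AVAILABLE
EOF
)

# Fields 4 and 5 are "MEDIA" and its number; sort -u removes the duplicates
result=$(printf '%s\n' "$sample" | awk '{ print $4 " " $5 }' | sort -u)
printf '%s\n' "$result"
```

The result is in sorted order rather than in order of first appearance.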
|
|
|
04-27-2012, 03:25 AM
|
#4
|
LQ Veteran
Registered: Sep 2003
Posts: 10,532
|
Hi,
Using arrays might be too complicated. Use awk's pattern search, print appropriate fields and pipe it through sort.
Hope this helps.
EDIT: @pan64: That will print the fields from all records, not just those that are UNAVAILABLE
Last edited by anon237; 04-27-2012 at 03:27 AM.
|
|
|
04-27-2012, 03:34 AM
|
#5
|
Member
Registered: Jan 2009
Distribution: Debian
Posts: 59
Rep:
|
Quote:
Originally Posted by druuna
Hi,
Using arrays might be too complicated. Use awk's pattern search, print appropriate fields and pipe it through sort.
Hope this helps.
EDIT: @pan64: That will print the fields from all records, not just those that are UNAVAILABLE
|
Druuna,
PAN64 is correct. awk has a default field separator of <space>. With this, by printing field 4 and 5, he will get the required fields.
PAN64: I had forgotten about -u on the sort, thank you for the reminder. The -u tells sort to print only unique records. Saves you piping the output to uniq.
Now for the UNAVAILABLE records:
Code:
awk '{if ($9 ~ /UNAVAILABLE/) print $4 " " $5}' | sort -u
Here is a link to some more information - http://www.math.utah.edu/docs/info/gawk_5.html
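One thing to watch with a fixed field number: the sample in the first post also has lines without a LINK part (for example "MEDIA 13 : UNAVAILABLE"), where the status is field 7, not field 9. A possible variation, shown here only as a sketch on a made-up two-line subset, is to test the last field with `$NF` instead.

```shell
# Reduced sample, chosen so that MEDIA 13 only appears on a line without LINK
sample=$(cat <<'EOF'
Apr 27 04:43:20 MEDIA 11 LINK 4 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 : UNAVAILABLE
EOF
)

# $9 is empty on the short line, so MEDIA 13 is missed
by_field9=$(printf '%s\n' "$sample" | awk '$9 ~ /UNAVAILABLE/ { print $4, $5 }' | sort -u)

# $NF is always the last field, whatever the line length
by_last=$(printf '%s\n' "$sample" | awk '$NF == "UNAVAILABLE" { print $4, $5 }' | sort -u)

printf 'using $9:\n%s\nusing $NF:\n%s\n' "$by_field9" "$by_last"
```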
Last edited by Nermal; 04-27-2012 at 03:42 AM.
Reason: Links and regexp for UNAVAILABLE
|
|
|
04-27-2012, 03:38 AM
|
#6
|
LQ Newbie
Registered: Apr 2012
Posts: 2
Original Poster
Rep: 
|
Thanks, all, for the quick replies.
I had already used "cat myfile |grep 'Apr 27.*MEDIA.*' " to get the sample output above.
To get unique records I tried
"cat myfile |grep 'Apr 27.*MEDIA.*'|uniq " but I know this will not work.
Any help is appreciated.
|
|
|
04-27-2012, 03:43 AM
|
#7
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 24,671
|
First, just use
grep 'Apr 27.*MEDIA.*' myfile
without cat and the pipe.
Second, uniq does not work because those lines contain times, so no two whole lines are identical. awk will drop the unnecessary parts (awk can also be used instead of grep), and sort -u (or sort piped into uniq) will do the rest.
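A small sketch of the difference, using four lines from the first post. It counts how many lines survive in each case. Note that uniq only collapses identical lines that are adjacent, which is why a sort is needed in front of it.

```shell
sample=$(cat <<'EOF'
Apr 27 02:02:36 MEDIA 14 LINK 3 : AVAILABLE
Apr 27 02:02:36 MEDIA 14 LINK 7 : AVAILABLE
Apr 27 02:02:37 MEDIA 12 LINK 4 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 5 : AVAILABLE
EOF
)

# Whole lines all differ (time or LINK number), so uniq removes nothing
whole_lines=$(printf '%s\n' "$sample" | uniq | wc -l | tr -d ' ')

# After keeping only fields 4 and 5, uniq alone still leaves a non-adjacent repeat
fields_only=$(printf '%s\n' "$sample" | awk '{ print $4, $5 }' | uniq | wc -l | tr -d ' ')

# Sorting first puts the duplicates next to each other
fields_sorted=$(printf '%s\n' "$sample" | awk '{ print $4, $5 }' | sort | uniq | wc -l | tr -d ' ')

echo "uniq on whole lines: $whole_lines"
echo "uniq on fields:      $fields_only"
echo "sort + uniq:         $fields_sorted"
```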
|
|
|
04-27-2012, 03:46 AM
|
#8
|
LQ Veteran
Registered: Sep 2003
Posts: 10,532
|
@Nermal: pan64 isn't correct, he forgot to add the /STRING/ part. I was not talking about awk's separator
Code:
awk '/UNAVAILABLE/ { print $4, $5 }' infile | sort -u
Your solution also works (if you provide input).
Hope this helps.
|
|
|
04-27-2012, 03:58 AM
|
#9
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 24,671
|
Quote:
Originally Posted by druuna
@Nermal: pan64 isn't correct, he forgot to add the /STRING/ part. I was not talking about awk's separator
Code:
awk '/UNAVAILABLE/ { print $4, $5 }' infile | sort -u
|
I can't tell from the question whether only AVAILABLE or only UNAVAILABLE lines should count. Also, what about MEDIA 07 and MEDIA 12 in the expected result (see the example in the first post)?
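To make the point concrete, here is a sketch on a reduced sample from the first post: MEDIA 12 only appears on an AVAILABLE line and MEDIA 07 only on an UNAVAILABLE line, so a status filter cannot produce the requested list.

```shell
sample=$(cat <<'EOF'
Apr 27 02:02:37 MEDIA 12 LINK 4 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 5 : AVAILABLE
Apr 27 04:43:20 MEDIA 07 LINK 5 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 14 LINK 6 : UNAVAILABLE
EOF
)

# With the status filter, MEDIA 12 drops out
filtered=$(printf '%s\n' "$sample" | awk '/UNAVAILABLE/ { print $4, $5 }' | sort -u)

# Without the filter, every MEDIA number seen is listed
unfiltered=$(printf '%s\n' "$sample" | awk '{ print $4, $5 }' | sort -u)

printf 'filtered:\n%s\nunfiltered:\n%s\n' "$filtered" "$unfiltered"
```

As a side note, a pattern of /AVAILABLE/ would not help either, since it also matches the UNAVAILABLE lines.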
|
|
|
04-27-2012, 04:01 AM
|
#10
|
LQ Veteran
Registered: Sep 2003
Posts: 10,532
|
@pan64: You might have a good point!
I might have assumed incorrectly, maybe tarunshrivastav can answer that one.
|
|
|
04-27-2012, 04:58 AM
|
#11
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,042
|
Why not just make it easy:
Code:
awk '!_[$4$5]++{print $4,$5}' file
Sort if you need to.
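For anyone puzzled by the one-liner: `_` is just an array name, and `!_[key]++` is true only the first time a key is seen. A longer spelling of the same idea, as a sketch with a made-up array name `seen` and a few lines from the first post:

```shell
sample=$(cat <<'EOF'
Apr 27 02:02:36 MEDIA 14 LINK 3 : AVAILABLE
Apr 27 02:02:37 MEDIA 12 LINK 4 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 5 : AVAILABLE
Apr 27 04:43:20 MEDIA 13 LINK 0 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 03 LINK 1 : UNAVAILABLE
Apr 27 04:43:20 MEDIA 13 : UNAVAILABLE
EOF
)

result=$(printf '%s\n' "$sample" | awk '{
    key = $4 " " $5            # e.g. "MEDIA 14"
    if (!(key in seen)) {      # first time this key shows up
        seen[key] = 1          # remember it
        print key              # and print it once
    }
}')
printf '%s\n' "$result"
```

Because nothing is sorted, the output keeps the order of first appearance, which matches the order asked for in the first post.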
|
|
|
04-27-2012, 05:50 AM
|
#12
|
LQ Veteran
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,446
|
Is it midnight in Perth already???
|
|
|
04-27-2012, 05:53 AM
|
#13
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,042
|
I can come up with some things without it being too late 
|
|
|
04-27-2012, 06:47 AM
|
#14
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 24,671
|
Quote:
Originally Posted by grail
Code:
awk '!_[$4$5]++{print $4,$5}' file
|
Code:
awk '/Apr 27.*MEDIA/ && !_[$4$5]++ {print $4,$5}' file
and then we do not need the grep either.
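A quick sketch of the combined version on mixed input. The "Apr 26" line and the sshd line are invented here purely to show the filtering; they are not from the original file.

```shell
sample=$(cat <<'EOF'
Apr 26 23:59:59 MEDIA 09 LINK 1 : AVAILABLE
Apr 27 02:02:36 MEDIA 14 LINK 3 : AVAILABLE
Apr 27 02:02:36 sshd[1234]: session opened
Apr 27 02:02:37 MEDIA 12 LINK 4 : AVAILABLE
Apr 27 02:02:37 MEDIA 14 LINK 5 : AVAILABLE
EOF
)

# The regex does the job of grep; && short-circuits, so lines that
# do not match never touch the array
result=$(printf '%s\n' "$sample" | awk '/Apr 27.*MEDIA/ && !_[$4$5]++ { print $4, $5 }')
printf '%s\n' "$result"
```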
|
|
|
04-27-2012, 06:56 AM
|
#15
|
Member
Registered: Apr 2009
Location: Melbourne
Distribution: Fedora & CentOS
Posts: 854
Rep: 
|
Since there is this useless cat hanging around, I'll chime in with another way to skin it.
grail's example is obviously better, but I sometimes feel cut is underappreciated in favour of sed/awk etc., so I thought I'd add an example using cut.
Code:
fukawi1 ~/tmp # cut -f4-5 -d' ' data | sort -u
MEDIA 03
MEDIA 07
MEDIA 11
MEDIA 12
MEDIA 13
MEDIA 14
fukawi1 ~/tmp # cut -f4-5 -d' ' data | awk '!($0 in a){a[$0];print}'
MEDIA 14
MEDIA 12
MEDIA 13
MEDIA 03
MEDIA 11
MEDIA 07
The awk bit was half inched from http://www.pement.org/awk/awk1line.txt
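One caveat with cut, sketched below on a made-up line: cut treats every single space as a delimiter, so if the log pads single-digit days with an extra space (as traditional syslog timestamps do, e.g. "Apr  7"), the field numbers shift. awk splits on runs of whitespace and is not affected. Squeezing the spaces with tr first is one possible workaround.

```shell
# Hypothetical line with a space-padded single-digit day
line='Apr  7 02:02:36 MEDIA 14 LINK 3 : AVAILABLE'

# The double space creates an empty field, so fields 4-5 are off by one
with_cut=$(printf '%s\n' "$line" | cut -d' ' -f4-5)

# awk's default splitting collapses runs of blanks
with_awk=$(printf '%s\n' "$line" | awk '{ print $4, $5 }')

# Squeezing repeated spaces before cut restores the expected fields
with_tr=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f4-5)

echo "cut:      $with_cut"
echo "awk:      $with_awk"
echo "tr + cut: $with_tr"
```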
|
|
|
All times are GMT -5. The time now is 11:08 PM.