LinuxQuestions.org
08-23-2012, 02:34 PM   #1
oliviaxinw
LQ Newbie
 
Registered: Jul 2012
Posts: 3

Rep: Reputation: Disabled
Arrange output by frequency of occurrence


Hello,

I have a file with entries like:
A
A
B
C
A
B
B
A
...

I want the output sorted by how many times each entry occurs, with the count written next to it, something like:
A 4
B 3
C 1

What code should I use? My file is very large and has tens of thousands of different entries, so it's not practical for me to list individually what A, B, C... are.

Thank you =)
 
08-23-2012, 02:52 PM   #2
byannoni
Member
 
Registered: Aug 2012
Location: /home/byannoni
Distribution: Arch
Posts: 128

Rep: Reputation: 36
Code:
uniq -c file | sort -nr
 
08-23-2012, 04:27 PM   #3
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,026

Rep: Reputation: 845
you would need to sort before you uniq
Code:
sort olivia.txt | uniq -c | sort -n -r # | awk '{print $2 " " $1}'
 
08-23-2012, 05:33 PM   #4
byannoni
Member
 
Registered: Aug 2012
Location: /home/byannoni
Distribution: Arch
Posts: 128

Rep: Reputation: 36
Quote:
Originally Posted by schneidz
you would need to sort before you uniq
Code:
sort olivia.txt | uniq -c | sort -n -r # | awk '{print $2 " " $1}'
Did you try mine before saying what it needs? It works fine and the output is the same as yours.
 
08-23-2012, 05:38 PM   #5
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,026

Rep: Reputation: 845
no i didn't, i'll try it now... i always learned that a sort was necessary before a uniq since uniq only counts consecutive identical rows.

edit:
i think i'm right:
Code:
[schneidz@hyper abg]$ uniq -c olivia.txt | sort -nr
      2 B
      2 A
      1 C
      1 B
      1 A
      1 A
[schneidz@hyper abg]$ sort olivia.txt | uniq -c | sort -n -r | awk '{print $2 " " $1}'
A 4
B 3
C 1
[schneidz@hyper abg]$ uniq --version
uniq (GNU coreutils) 8.10
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Richard M. Stallman and David MacKenzie.
[schneidz@hyper abg]$ sort --version
sort (GNU coreutils) 8.10
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.

Last edited by schneidz; 08-23-2012 at 05:42 PM.
 
08-23-2012, 06:09 PM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
Here's a slightly more advanced gawk solution.

Code:
gawk 'BEGIN{ PROCINFO["sorted_in"]="@val_num_desc" } { a[$1]+=1 } END{ for ( i in a ) { print i,a[i] } }'  infile.txt
It counts the number of each entry using an array, and prints the results out at the end.

Sorting is handled by the PROCINFO["sorted_in"] setting, which is why it requires a recent version (gawk 4.0 or later).

For older versions of gawk or other awks, remove the BEGIN section and just pipe the output through "sort -rn -k2".
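
As a rough sketch of that portable variant: the same array counting, with the ordering handed off to sort. The function name freq_count and the sample data are made up for illustration, and it assumes each entry is a single word, since only $1 is counted.

```shell
# freq_count FILE: print "entry count" pairs, most frequent first.
# (freq_count is a hypothetical helper name, not from the thread.)
freq_count() {
    # Count each first field in an array; awk's "for (k in ...)" order is
    # unspecified, so sort does the ordering: count descending, then entry
    # name ascending to keep ties in a predictable order.
    awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' "$1" |
        sort -k2,2nr -k1,1
}

# Sample input matching the entries shown in the first post.
printf '%s\n' A A B C A B B A > infile.txt
freq_count infile.txt
```

The second sort key is optional; without it, entries that share a count can come out in any order.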

Last edited by David the H.; 08-23-2012 at 06:10 PM.
 
08-23-2012, 06:09 PM   #7
byannoni
Member
 
Registered: Aug 2012
Location: /home/byannoni
Distribution: Arch
Posts: 128

Rep: Reputation: 36
Quote:
Originally Posted by schneidz
no i didn't, i'll try it now... i always learned that a sort was necessary before a uniq since uniq only counts consecutive identical rows.

I stand corrected. Sorry about that, I should have tested with values that weren't already in order.
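
For anyone who wants to check this themselves, here is a small sketch comparing the two pipelines on grouped and interleaved input. The file names grouped.txt and mixed.txt are invented for the example; the entries are the ones from the first post.

```shell
# Hypothetical test inputs: the same entries, once grouped, once interleaved.
printf '%s\n' A A A A B B B C > grouped.txt
printf '%s\n' A A B C A B B A > mixed.txt

# uniq -c only merges adjacent identical lines, so skipping the sort
# happens to work when the input is already grouped...
uniq -c grouped.txt | sort -nr

# ...but on interleaved input it prints one partial count per run.
uniq -c mixed.txt | sort -nr

# Sorting first makes duplicates adjacent whatever the input order is.
sort mixed.txt | uniq -c | sort -nr
```

The middle command is the one that goes wrong: it reports one line per run of identical entries rather than one line per distinct entry.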
 