LinuxQuestions.org
Old 06-18-2014, 02:05 PM   #1
Rail89
LQ Newbie
 
Registered: Jun 2014
Posts: 3

Rep: Reputation: Disabled
Sorting Email list


Hi,
I have an email list that I would like to sort.
The problem is that the addresses vary in length and don't always have the same number of dot-separated parts.

Example

Hello@zmail.com
Hello.World@zmail.net
Hello@mail.zmail.org
Hello.World@mail.zmail.edu

This makes it difficult for me because I want to sort first by the top-level domain (.com, .net, etc.).

Then by the domain name.

Then the username.

So this list:

Bill.Mattews@amail.com
Chester.Cheese@dmail.edu
David@other.ymail.com
Matt@zmail.edu
Carter@bmail.net
Edison@cmail.org
Jason@new.amail.com
Nathon.Apple@other.ymail.com
NoExcuses@bmail.net
Lana@bmail.com

Will look like this after sorting:

Bill.Mattews@amail.com
Jason@new.amail.com
Lana@bmail.com
David@other.ymail.com
Nathon.Apple@other.ymail.com
Chester.Cheese@dmail.edu
Matt@zmail.edu
Carter@bmail.net
NoExcuses@bmail.net
Edison@cmail.org


Thanks.
 
Old 06-18-2014, 08:11 PM   #2
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,661

Rep: Reputation: 1256
You could do it in three steps:

1. separate the various fields you want to sort by (by putting the extracted fields in front of the original line) using awk. In the case of the domain field, you would have to reverse the character order of the field.
2. sort by new fields
3. remove the new fields from the sorted file, outputting only the original data (also using awk).
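
A minimal sketch of those three steps as a single pipeline. It assumes one address per line with exactly one "@", and it uses a tab to separate the added keys from the original line. The function name sort_emails is made up for illustration, and the domain key here is built by reversing the order of the dot-separated names rather than the characters.

```shell
#!/bin/sh
# Sketch: decorate each address with sort keys, sort, then strip the keys.
# sort_emails is a hypothetical helper name; it reads addresses on stdin.
sort_emails() {
    awk -F'@' '{
        # Step 1: build a key from the domain names in reverse order,
        # e.g. "new.amail.com" becomes "com amail new".
        n = split($2, part, /\./)
        key = part[n]
        for (i = n - 1; i >= 1; i--) key = key " " part[i]
        # Emit: domain key, user name, original line (tab separated).
        print key "\t" $1 "\t" $0
    }' |
    # Step 2: sort by domain key, then by user name, ignoring case.
    LC_ALL=C sort -f -t "$(printf '\t')" -k1,1 -k2,2 |
    # Step 3: keep only the original address.
    cut -f3
}

printf '%s\n' Matt@zmail.edu Jason@new.amail.com Lana@bmail.com Bill.Mattews@amail.com | sort_emails
```

LC_ALL=C is set only so the ordering does not depend on the locale; drop it if locale-aware collation is wanted.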
 
Old 06-20-2014, 08:34 AM   #3
Rail89
jpollard,
From what you said I looked up awk and found this code:

Code:
awk 'BEGIN {FS="."; OFS="|"}{print$NF,$0}' test2 |sort -t"|" -k1
which gave me this result:

com|Bill.Mattews@amail.com
com|David@other.ymail.com
com|Jason@new.amail.com
com|Lana@bmail.com
com|Nathon.Apple@other.ymail.com
edu|Chester.Cheese@dmail.edu
edu|Matt@zmail.edu
net|Carter@bmail.net
net|NoExcuses@bmail.net
org|Edison@cmail.org

I believe this is part of step 1 that you mentioned, but I don't understand what you meant when you said
Quote:
In the case of the domain field, you would have to reverse the character order of the field.
What exactly do you mean?
 
Old 06-20-2014, 09:59 AM   #4
jpollard
You wanted things like "abc.dom.net" and "dom.net" to be grouped together. The easy way to do that is to sort on the reversed strings: "ten.mod" and "ten.mod.cba". This would put them together, nearly in the right order.

An alternative (which is more work, but more accurate) would be to reverse the order of the names in the domain:
"net.dom.abc" and "net.dom" and then sort...
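
To make the difference concrete, here is a small sketch of both key styles. The helper names char_key and label_key are hypothetical, and the two sample domains are invented for the example. Character reversal keeps a domain next to its subdomains, but it orders the domains by their spelling read backwards, which is why it is only nearly right.

```shell
#!/bin/sh
# char_key: reverse every character of the domain ("dom.net" -> "ten.mod").
char_key() {
    printf '%s\n' "$1" |
    awk '{ r = ""; for (i = length($0); i >= 1; i--) r = r substr($0, i, 1); print r }'
}

# label_key: reverse only the order of the dot-separated names
# ("abc.dom.net" -> "net.dom.abc").
label_key() {
    printf '%s\n' "$1" |
    awk -F. '{ k = $NF; for (i = NF - 1; i >= 1; i--) k = k "." $i; print k }'
}

# Print both keys side by side for two sample domains.
for d in ab.com za.com; do
    printf '%-8s char: %-8s label: %s\n' "$d" "$(char_key "$d")" "$(label_key "$d")"
done
```

With the character keys, za.com ("moc.az") sorts ahead of ab.com ("moc.ba"); with the label keys, ab.com ("com.ab") comes first as expected.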

The intermediate file would be

name net.dom.abc name@abc.dom.net
name1 net.dom name1@dom.net

This gives you three fields; the first two are just keys for sorting. The last phase would be a simple awk script that prints the third column.

The resulting sorted intermediate file would be:

name1 net.dom name1@dom.net
name net.dom.abc name@abc.dom.net

where the domain field is the primary key, and the name field the secondary key. As a side note, it would even be possible to switch the order such that the name field is second, but that is arbitrary as far as sort goes.

The two key fields would have to be specified separately so that the sort would start the strings in the proper column.
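
As an illustration of specifying the keys separately, here is a sketch that assumes the intermediate lines are whitespace-separated in the order name, reversed domain, original address. The helper name strip_keys_sorted is a placeholder, not anything standard.

```shell
#!/bin/sh
# strip_keys_sorted is a hypothetical helper: it reads lines of the form
# "name reversed-domain original-address" on stdin.
strip_keys_sorted() {
    # -k2,2: the reversed domain is the primary key (column 2 only).
    # -k1,1: the name is the secondary key (column 1 only).
    # A bare -k2 would instead extend the key to the end of the line.
    LC_ALL=C sort -k2,2 -k1,1 | awk '{ print $3 }'
}

# The two example lines from this post, in unsorted order.
printf '%s\n' \
    'name net.dom.abc name@abc.dom.net' \
    'name1 net.dom name1@dom.net' | strip_keys_sorted
```

The final awk step prints only the third column, so the keys never reach the output.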
 
Old 06-26-2014, 09:22 AM   #5
Rail89
jpollard,

I tried to reverse the order of the names in the domains and I came up with this code:

First I copied everything after the "@" sign into a different file.

Code:
awk -F'@' '{print $2}' "$FILE" > temp1
From the example list of emails I gave before, this line of code gave me this result:

amail.com
dmail.edu
other.ymail.com
zmail.edu
bmail.net
cmail.org
new.amail.com
other.ymail.com
bmail.net
bmail.com

I then tried to reverse the order of the names. I found this code online and I'm still trying to figure out how it works.

Code:
awk -F"." '{n=split($0,F); for(i in F) $i=F[n-i+1]}1' temp1 > temp2
This line of code gave me this as a result.

com amail
edu dmail
com ymail other
edu zmail
net bmail
org cmail
com amail new
com ymail other
net bmail
com bmail

I then pasted temp2 and the original file into another file with "|" as a separator.

Code:
paste -d'|' temp2 "$FILE" > temp3
com amail|Bill.Mattews@amail.com
edu dmail|Chester.Cheese@dmail.edu
com ymail other|David@other.ymail.com
edu zmail|Matt@zmail.edu
net bmail|Carter@bmail.net
org cmail|Edison@cmail.org
com amail new|Jason@new.amail.com
com ymail other|Nathon.Apple@other.ymail.com
net bmail|NoExcuses@bmail.net
com bmail|Lana@bmail.com

I then sorted by the first column and then by the second column, removed the first column, and wrote the result to another file.

Code:
sort -f -t'|' -k1,1 -k2,2 temp3 | awk -F'|' '{print $2}' > FileSorted.txt

Bill.Mattews@amail.com
Jason@new.amail.com
Lana@bmail.com
David@other.ymail.com
Nathon.Apple@other.ymail.com
Chester.Cheese@dmail.edu
Matt@zmail.edu
Carter@bmail.net
NoExcuses@bmail.net
Edison@cmail.org


I placed all the code in a script so that I didn't have to type each line of code every time.

Thanks for your help. I would never have thought of reversing the order of the names if you hadn't mentioned it.
 
Old 06-26-2014, 09:08 PM   #6
jpollard
Quote:
Originally Posted by Rail89
I then tried to reverse the order of the names. I found this code online and I'm still trying to figure out how it works.

Code:
awk -F"." '{n=split($0,F); for(i in F) $i=F[n-i+1]}1' temp1 > temp2
The statement "n=split($0,F)" does two things:
1) the array F receives the pieces of the domain name, split on the field separator
2) n receives the number of elements in the array

The expression "n - i + 1" is then used to walk the elements of the array in reverse order.

If you look at the awk man page, you will see that split takes two optional parameters. The first is a pattern to split the string on; in your case you want to split on ".", so the pattern would be /\./. (The other optional parameter, available in gawk, is an array that receives the separator text found between the elements.) Using the extra parameter would eliminate the use of another array.
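
For example, a sketch of passing the separator pattern directly to split. Here awk splits each line on "@" through -F and then breaks the domain on dots inside the script, using the same "n - i + 1" index expression as the one-liner above. The helper name reverse_domain and the array name part are arbitrary choices for this example.

```shell
#!/bin/sh
# reverse_domain is a hypothetical helper: it reads addresses on stdin and
# prints each domain with its dot-separated names in reverse order.
reverse_domain() {
    awk -F'@' '{
        # Third argument: the pattern to split on (a literal dot).
        # split returns the number of pieces stored in part[1..n].
        n = split($2, part, /\./)
        out = ""
        # part[n - i + 1] counts down from the last piece as i counts up.
        for (i = 1; i <= n; i++) out = out (i > 1 ? " " : "") part[n - i + 1]
        print out
    }'
}

echo 'Jason@new.amail.com' | reverse_domain
```

Because the domain is taken from $2 inside the same awk call, this variant does not need the intermediate temp1 file.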

Quote:
Thanks for your help. I would never have thought of reversing the order of the names if you hadn't mentioned it.
No problem. I once had a similar problem with a cross reference listing, and had to do the same type of thing, though mine didn't have as many elements in the name.

Last edited by jpollard; 06-26-2014 at 09:13 PM.
 
  


All times are GMT -5.
