LinuxQuestions.org
Old 12-20-2010, 10:57 PM   #1
hattori.hanzo
Member
 
Registered: Aug 2006
Posts: 168

Rep: Reputation: 15
Substitute 2 character country abbreviations to 3


I have a list of 2-character country abbreviations used in the Maxmind GeoCountry database that I need to convert to the standard 3-character ISO 3166-1 alpha-3 format.

I could write a long list of sed substitute statements, one for each country.

Code:
sed "s/US/USA/g" <file>
Is there a more elegant way to do this? Maybe an array of some sort?

This substitution will be used in a bash script.

Thanks & Regards,
 
Old 12-21-2010, 04:30 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
If we could see an example of the full input format we could probably come up with some other options. A loop and/or case statement using bash parameter substitution, for example, or an awk script.
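For illustration, a case-statement version might look like the sketch below. The function name to_alpha3 and the "unknown" fallback are assumptions made for the example, and only three codes are filled in; the real table would need one branch per country.

```shell
#!/usr/bin/env bash
# to_alpha3 is a hypothetical helper name, not part of any existing tool.
# It maps one 2-letter code to its 3-letter code and prints "unknown"
# for anything that has no branch (an assumed fallback).
to_alpha3() {
    case $1 in
        HK) echo HKG ;;
        US) echo USA ;;
        AU) echo AUS ;;
        *)  echo unknown ;;
    esac
}

# Usage: translate one code per input line.
printf '%s\n' HK US JP AU | while read -r code; do
    to_alpha3 "$code"
done
```

Because the whole argument is matched, a string such as USSR falls through to the default branch instead of being partially rewritten.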

At the very least, you could create a script of sed substitution expressions, instead of running it as a series of individual commands.

In the end though, each of the possible codes will probably have to be set up individually somehow.
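The sed-script idea could be sketched roughly like this. The file names are made up for the example, and \b (word boundary) is a GNU sed extension, so this assumes GNU sed.

```shell
#!/usr/bin/env bash
# Assumptions: GNU sed (for \b) and hypothetical file names.
tmpdir=$(mktemp -d)

# One substitution per line, kept in a single script file.
cat > "$tmpdir/country.sed" <<'EOF'
s/\bHK\b/HKG/g
s/\bUS\b/USA/g
s/\bAU\b/AUS/g
EOF

# Sample input, including a longer word that contains "US".
printf '%s\n' US USSR HK > "$tmpdir/sample.txt"

# Run every expression in one sed invocation.
result=$(sed -f "$tmpdir/country.sed" "$tmpdir/sample.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The word boundaries keep a longer token such as USSR unchanged, and sed only reads the input once no matter how many codes are listed.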
 
Old 12-21-2010, 05:28 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
I would suggest having a file with the changes you require and then looping over it:
Code:
US USA
AU AUS
...
Also be wary of your sed as it would be most unfortunate to get something like:
Code:
USASR
if USSR were somewhere in the file.
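One possible way to combine both points is to loop over the mapping file and generate anchored sed expressions from it. This is only a sketch: the file names are hypothetical, and it assumes the input holds exactly one code per line, so ^ and $ can be used to match the whole line.

```shell
#!/usr/bin/env bash
# Assumptions: hypothetical file names, space-separated mapping file,
# and input with exactly one code per line.
tmpdir=$(mktemp -d)

printf '%s\n' 'US USA' 'AU AUS' > "$tmpdir/map.txt"
printf '%s\n' US USSR AU > "$tmpdir/input.txt"

# Turn each "XX XXX" pair into a whole-line substitution, e.g. s/^US$/USA/
while read -r two three; do
    printf 's/^%s$/%s/\n' "$two" "$three"
done < "$tmpdir/map.txt" > "$tmpdir/map.sed"

result=$(sed -f "$tmpdir/map.sed" "$tmpdir/input.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With the anchors in place, the USSR line passes through untouched rather than turning into USASR.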
 
Old 12-22-2010, 11:03 PM   #4
hattori.hanzo
Member
 
Registered: Aug 2006
Posts: 168

Original Poster
Rep: Reputation: 15
Thanks for the pointer.

Here is the test data:
Code:
[me@host test]$ more country.list
HK|HKG
US|USA
AU|AUS
[me@host test]$ more country.tmp
HK
US
JP
AU
Using awk from an example here.
Code:
[me@host test]$ awk 'BEGIN {FS=OFS="|"} FNR==NR{a[$1]=$2;next} $1 in a{print $2}' country.tmp country.list
Output
Code:
HKG
USA
AUS
If a country abbreviation is in country.tmp but not in country.list, how could I display 'unknown' for that entry?

Update

Example taken from here which seems to work.

Code:
while read -r num; do NAME=$(awk -F "|" -v code="$num" '$1==code {print $2}' country.list); [ -z "$NAME" ] && echo "unknown" || echo "$NAME"; done < country.tmp
Output

Code:
HKG
USA
unknown
AUS
Thanks & Regards

Last edited by hattori.hanzo; 12-22-2010 at 11:33 PM.
 
Old 12-23-2010, 12:10 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Well, I would probably stick with one or the other, i.e. either awk or bash.

awk
Code:
awk 'BEGIN{FS="|"}FNR==NR{a[$1]=$2;next}{if($1 in a)print a[$1];else print "unknown"}' country.list country.tmp
bash
Code:
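declare -A arr    # associative arrays need bash 4 or later

# Load the 2-letter to 3-letter mapping
while IFS="|" read -r n2 n3
do
    arr[$n2]=$n3
done<country.list

# Translate each code, falling back to "unknown"
while read -r line
do
    [[ -n ${arr[$line]} ]] && echo "${arr[$line]}" || echo unknown
done<country.tmp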
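Since this is going into a bash script, the awk lookup could also be wrapped in a small function. The sketch below is one way to do that; the function name to_alpha3 is made up, and it assumes the mapping file is non-empty and is passed first, so that FNR==NR is true only while the mapping is being read.

```shell
#!/usr/bin/env bash
# to_alpha3 MAPFILE INPUTFILE (hypothetical helper name)
# MAPFILE holds "XX|XXX" pairs; INPUTFILE holds one 2-letter code per line.
to_alpha3() {
    local map=$1 input=$2
    awk -F'|' '
        FNR == NR { code[$1] = $2; next }                # first file: load the mapping
        { print (($1 in code) ? code[$1] : "unknown") }  # second file: look up each code
    ' "$map" "$input"
}

# Sample data taken from the test files earlier in the thread.
tmpdir=$(mktemp -d)
printf '%s\n' 'HK|HKG' 'US|USA' 'AU|AUS' > "$tmpdir/country.list"
printf '%s\n' HK US JP AU > "$tmpdir/country.tmp"

result=$(to_alpha3 "$tmpdir/country.list" "$tmpdir/country.tmp")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The output follows the order of the input file, and awk is started once rather than once per line.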
 
  

