07-12-2007, 07:02 AM
#1
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
multiple character replacement by shell script
Hello,
I want to write a shell script that converts a text file containing phonetically written Tamil text into a Tamil Unicode text file.
It works like this:
first, search for every occurrence of a pattern,
say, 'aa',
then replace that pattern with the Unicode character "அ".
Please help me.
07-12-2007, 07:38 AM
#2
LQ Veteran
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809
Have you read up on any of the standard utilities, e.g. sed and awk?
One common solution is the sed "substitute" command. Suppose you wanted to replace all instances of "aa" with "bb":
sed 's/aa/bb/g' <oldfile >newfile
I am sure there is a way to put in a hex byte instead of "bb", but I don't have it in front of me.
Really good sed, awk, and other tutorials here:
http://www.grymoire.com/Unix/
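On the hex-byte question: one way that does not rely on any sed-specific escape is to let printf build the bytes and hand them to sed in a shell variable. This is only a sketch; the function name to_tamil_a is made up for illustration, and it assumes the output should be UTF-8, where அ (U+0B85) is the three bytes E0 AE 85.

```shell
# Build the replacement from raw bytes, given as octal escapes
# (\340\256\205 = E0 AE 85, the UTF-8 encoding of U+0B85).
tamil_a=$(printf '\340\256\205')

# Hypothetical helper: replace every "a" on stdin with that character.
to_tamil_a() {
    sed "s/a/${tamil_a}/g"
}

printf 'a b a\n' | to_tamil_a
```

On a UTF-8 terminal this prints "அ b அ". Octal escapes are used because they are the portable form for printf; a hex escape would need a printf that supports it.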
07-12-2007, 01:31 PM
#3
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
Original Poster
pixellany,
Thank you very much.
I got the idea from your reply and wrote a small script as a demo conversion.
Here is my script:
Quote:
sed -e 's/ma/ம/g' -e 's/yuu/யூ/g' -e 's/ra/ர/g' -e 's/n/ன்/g' < mauran >mauran1
But it gave me a new problem.
The output file contains a complicated UTF-8 mess.
Does the > operator handle Unicode well?
07-12-2007, 01:35 PM
#4
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
Original Poster
Update!
I saved the output file as file.html and opened it in Firefox. When I set Firefox's encoding to UTF-8, it shows the characters correctly.
So the problem is:
how to handle UTF-8 encoding in sed and >.
07-12-2007, 02:15 PM
#5
LQ Veteran
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809
The "<" ">" operators are for redirection and do not care what the data encoding is.
How did you get the special characters into the sed commands? You may need to run some experiments to see which characters get correctly handled by sed. (I've never seen anything about this in the various books on sed.)
Did you find anything on how to input raw hex bytes using sed?
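A quick way to check that redirection leaves the bytes alone is to write a known UTF-8 sequence through > and dump it back with od. A small sketch (the helper name dump_through_redirect is hypothetical):

```shell
# Write stdin to a temporary file with ">" and print the file's bytes in hex.
dump_through_redirect() {
    tmp=$(mktemp)
    cat > "$tmp"
    od -An -tx1 < "$tmp" | tr -d ' \n'
    echo
    rm -f "$tmp"
}

# U+0B85 (அ) as UTF-8 is e0 ae 85; the same three bytes come back out.
printf '\340\256\205' | dump_through_redirect
```

This prints e0ae85. If the bytes in a file are right but a viewer shows garbage, the viewer is most likely decoding the file with a different character set, which fits the Firefox observation in the previous post.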
07-12-2007, 02:45 PM
#6
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
Original Poster
I'm typing the Tamil characters directly into the bash script using the scim input method.
I've almost finished my code.
For now I'm redirecting the output to an HTML file.
The encoding is easy to deal with in an HTML file.
Here is my code:
Quote:
#/bin/bash!
filename=`zenity --file-selection`
sed -e 's/Xau/க்ஷௌ/g' -e 's/Xai/க்ஷை/g' -e 's/Xaa/க்ஷா/g' -e 's/XA/க்ஷா/g' -e 's/Xa/க்ஷ/g' -e 's/Xii/க்ஷீ/g' -e 's/Xi/க்ஷி/g' -e 's/XI/க்ஷீ/g' -e 's/Xuu/க்ஷூ/g' -e 's/Xu/க்ஷு/g' -e 's/XU/க்ஷூ/g' -e 's/Xee/க்ஷே/g' -e 's/Xe/க்ஷெ/g' -e 's/XE/க்ஷே/g' -e 's/Xoo/க்ஷோ/g' -e 's/Xo/க்ஷொ/g' -e 's/XO/க்ஷோ/g' -e 's/X/க்ஷ்/g' -e 's/njau/ஞௌ/g' -e 's/njai/ஞை/g' -e 's/njee/ஞே/g' -e 's/njoo/ஞோ/g' -e 's/njaa/ஞா/g' -e 's/njuu/ஞூ/g' -e 's/njii/ஞீ/g' -e 's/nja/ஞ/g' -e 's/nji/ஞி/g' -e 's/njI/ஞீ/g' -e 's/njA/ஞா/g' -e 's/nje/ஞெ/g' -e 's/njE/ஞே/g' -e 's/njo/ஞொ/g' -e 's/njO/ஞோ/g' -e 's/nju/ஞு/g' -e 's/njU/ஞூ/g' -e 's/nj/ஞ்/g' -e 's/ngau/ஙௌ/g' -e 's/ngai/ஙை/g' -e 's/ngee/ஙே/g' -e 's/ngoo/ஙோ/g' -e 's/ngaa/ஙா/g' -e 's/nguu/ஙூ/g' -e 's/ngii/ஙீ/g' -e 's/nga/ங/g' -e 's/ngi/ஙி/g' -e 's/ngI/ஙீ/g' -e 's/ngA/ஙா/g' -e 's/nge/ஙெ/g' -e 's/ngE/ஙே/g' -e 's/ngo/ஙொ/g' -e 's/ngO/ஙோ/g' -e 's/ngu/ஙு/g' -e 's/ngU/ஙூ/g' -e 's/ng/ங்/g' -e 's/shau/ஷௌ/g' -e 's/shai/ஷை/g' -e 's/shee/ஷே/g' -e 's/shoo/ஷோ/g' -e 's/shaa/ஷா/g' -e 's/shuu/ஷூ/g' -e 's/shii/ஷீ/g' -e 's/sha/ஷ/g' -e 's/shi/ஷி/g' -e 's/shI/ஷீ/g' -e 's/shA/ஷா/g' -e 's/she/ஷெ/g' -e 's/shE/ஷே/g' -e 's/sho/ஷொ/g' -e 's/shO/ஷோ/g' -e 's/shu/ஷு/g' -e 's/shU/ஷூ/g' -e 's/sh/ஷ்/g' -e 's/ nau/ நௌ/g' -e 's/ nai/ நை/g' -e 's/ nee/ நே/g' -e 's/ noo/ நோ/g' -e 's/ naa/ நா/g' -e 's/ nuu/ நூ/g' -e 's/ nii/ நீ/g' -e 's/ na/ ந/g' -e 's/ ni/ நி/g' -e 's/ nI/ நீ/g' -e 's/ nA/ நா/g' -e 's/ ne/ நெ/g' -e 's/ nE/ நே/g' -e 's/ no/ நொ/g' -e 's/ nO/ நோ/g' -e 's/ nu/ நு/g' -e 's/ nU/ நூ/g' -e 's/ nth/ ந்/g' -e 's/-nau/நௌ/g' -e 's/-nai/நை/g' -e 's/-nee/நே/g' -e 's/-noo/நோ/g' -e 's/-naa/நா/g' -e 's/-nuu/நூ/g' -e 's/-nii/நீ/g' -e 's/-na/ந/g' -e 's/-ni/நி/g' -e 's/-nI/நீ/g' -e 's/-nA/நா/g' -e 's/-ne/நெ/g' -e 's/-nE/நே/g' -e 's/-no/நொ/g' -e 's/-nO/நோ/g' -e 's/-nu/நு/g' -e 's/-nU/நூ/g' -e 's/n-au/நௌ/g' -e 's/n-ai/நை/g' -e 's/n -ee/நே/g' -e 's/n-oo/நோ/g' -e 's/n-aa/நா/g' -e 's/n-uu/நூ/g' -e 's/n-ii/நீ/g' -e 's/n-a/ந/g' -e 's/n-i/நி/g' -e 's/n-I/நீ/g' -e 's/n-A/நா/g' -e 's/n -e/நெ/g' -e 's/n 
-e/நே/g' -e 's/n-o/நொ/g' -e 's/n-O/நோ/g' -e 's/n-u/நு/g' -e 's/n-U/நூ/g' -e 's/wau/நௌ/g' -e 's/wai/நை/g' -e 's/wee/நே/g' -e 's/woo/நோ/g' -e 's/waa/நா/g' -e 's/wuu/நூ/g' -e 's/wii/நீ/g' -e 's/wa/ந/g' -e 's/wi/நி/g' -e 's/wI/நீ/g' -e 's/wA/நா/g' -e 's/we/நெ/g' -e 's/wE/நே/g' -e 's/wo/நொ/g' -e 's/wO/நோ/g' -e 's/wu/நு/g' -e 's/wU/நூ/g' -e 's/ n/ ந்/g' -e 's/n-/ந்/g' -e 's/-n/ந்/g' -e 's/w/ந்/g' -e 's/nthau/ந்தௌ/g' -e 's/nthai/ந்தை/g' -e 's/nthee/ந்தே/g' -e 's/nthoo/ந்தோ/g' -e 's/nthaa/ந்தா/g' -e 's/nthuu/ந்தூ/g' -e 's/nthii/ந்தீ/g' -e 's/ntha/ந்த/g' -e 's/nthi/ந்தி/g' -e 's/nthI/ந்தீ/g' -e 's/nthA/ந்தா/g' -e 's/nthe/ந்தெ/g' -e 's/nthE/ந்தே/g' -e 's/ntho/ந்தொ/g' -e 's/nthO/ந்தோ/g' -e 's/nthu/ந்து/g' -e 's/nthU/ந்தூ/g' -e 's/nth/ந்/g' -e 's/dhau/தௌ/g' -e 's/dhai/தை/g' -e 's/dhee/தே/g' -e 's/dhoo/தோ/g' -e 's/dhaa/தா/g' -e 's/dhuu/தூ/g' -e 's/dhii/தீ/g' -e 's/dha/த/g' -e 's/dhi/தி/g' -e 's/dhI/தீ/g' -e 's/dhA/தா/g' -e 's/dhe/தெ/g' -e 's/dhE/தே/g' -e 's/dho/தொ/g' -e 's/dhO/தோ/g' -e 's/dhu/து/g' -e 's/dhU/தூ/g' -e 's/dh/த்/g' -e 's/chau/சௌ/g' -e 's/chai/சை/g' -e 's/chee/சே/g' -e 's/choo/சோ/g' -e 's/chaa/சா/g' -e 's/chuu/சூ/g' -e 's/chii/சீ/g' -e 's/cha/ச/g' -e 's/chi/சி/g' -e 's/chI/சீ/g' -e 's/chA/சா/g' -e 's/che/செ/g' -e 's/chE/சே/g' -e 's/cho/சொ/g' -e 's/chO/சோ/g' -e 's/chu/சு/g' -e 's/chU/சூ/g' -e 's/ch/ச்/g' -e 's/zhau/ழௌ/g' -e 's/zhai/ழை/g' -e 's/zhee/ழே/g' -e 's/zhoo/ழோ/g' -e 's/zhaa/ழா/g' -e 's/zhuu/ழூ/g' -e 's/zhii/ழீ/g' -e 's/zha/ழ/g' -e 's/zhi/ழி/g' -e 's/zhI/ழீ/g' -e 's/zhA/ழா/g' -e 's/zhe/ழெ/g' -e 's/zhE/ழே/g' -e 's/zho/ழொ/g' -e 's/zhO/ழோ/g' -e 's/zhu/ழு/g' -e 's/zhU/ழூ/g' -e 's/zh/ழ்/g' -e 's/zau/ழௌ/g' -e 's/zai/ழை/g' -e 's/zee/ழே/g' -e 's/zoo/ழோ/g' -e 's/zaa/ழா/g' -e 's/zuu/ழூ/g' -e 's/zii/ழீ/g' -e 's/za/ழ/g' -e 's/zi/ழி/g' -e 's/zI/ழீ/g' -e 's/zA/ழா/g' -e 's/ze/ழெ/g' -e 's/zE/ழே/g' -e 's/zo/ழொ/g' -e 's/zO/ழோ/g' -e 's/zu/ழு/g' -e 's/zU/ழூ/g' -e 's/z/ழ்/g' -e 's/jau/ஜௌ/g' -e 's/jai/ஜை/g' -e 's/jee/ஜே/g' -e 's/joo/ஜோ/g' -e 's/jaa/ஜா/g' -e 's/juu/ஜூ/g' -e 
's/jii/ஜீ/g' -e 's/ja/ஜ/g' -e 's/ji/ஜி/g' -e 's/jI/ஜீ/g' -e 's/jA/ஜா/g' -e 's/je/ஜெ/g' -e 's/jE/ஜே/g' -e 's/jo/ஜொ/g' -e 's/jO/ஜோ/g' -e 's/ju/ஜு/g' -e 's/jU/ஜூ/g' -e 's/j/ஜ்/g' -e 's/thau/தௌ/g' -e 's/thai/தை/g' -e 's/thee/தே/g' -e 's/thoo/தோ/g' -e 's/thaa/தா/g' -e 's/thuu/தூ/g' -e 's/thii/தீ/g' -e 's/tha/த/g' -e 's/thi/தி/g' -e 's/thI/தீ/g' -e 's/thA/தா/g' -e 's/the/தெ/g' -e 's/thE/தே/g' -e 's/tho/தொ/g' -e 's/thO/தோ/g' -e 's/thu/து/g' -e 's/thU/தூ/g' -e 's/th/த்/g' -e 's/-hau/ஹௌ/g' -e 's/-hai/ஹை/g' -e 's/-hee/ஹே/g' -e 's/-hoo/ஹோ/g' -e 's/-haa/ஹா/g' -e 's/-huu/ஹூ/g' -e 's/-hii/ஹீ/g' -e 's/-ha/ஹ/g' -e 's/-hi/ஹி/g' -e 's/-hI/ஹீ/g' -e 's/-hA/ஹா/g' -e 's/-he/ஹெ/g' -e 's/-hE/ஹே/g' -e 's/-ho/ஹொ/g' -e 's/-hO/ஹோ/g' -e 's/-hu/ஹு/g' -e 's/-hU/ஹூ/g' -e 's/-h/ஹ்/g' -e 's/hau/கௌ/g' -e 's/hai/கை/g' -e 's/hee/கே/g' -e 's/hoo/கோ/g' -e 's/haa/கா/g' -e 's/huu/கூ/g' -e 's/hii/கீ/g' -e 's/ha/க/g' -e 's/hi/கி/g' -e 's/hI/கீ/g' -e 's/hA/கா/g' -e 's/he/கெ/g' -e 's/hE/கே/g' -e 's/ho/கொ/g' -e 's/hO/கோ/g' -e 's/hu/கு/g' -e 's/hU/கூ/g' -e 's/h/க்/g' -e 's/kau/கௌ/g' -e 's/kai/கை/g' -e 's/kee/கே/g' -e 's/koo/கோ/g' -e 's/kaa/கா/g' -e 's/kuu/கூ/g' -e 's/kii/கீ/g' -e 's/ka/க/g' -e 's/ki/கி/g' -e 's/kI/கீ/g' -e 's/kA/கா/g' -e 's/ke/கெ/g' -e 's/kE/கே/g' -e 's/ko/கொ/g' -e 's/kO/கோ/g' -e 's/ku/கு/g' -e 's/kU/கூ/g' -e 's/k/க்/g' -e 's/-sau/ஸௌ/g' -e 's/-sai/ஸை/g' -e 's/-see/ஸே/g' -e 's/-soo/ஸோ/g' -e 's/-saa/ஸா/g' -e 's/-suu/ஸூ/g' -e 's/-sii/ஸீ/g' -e 's/-sa/ஸ/g' -e 's/-si/ஸி/g' -e 's/-sI/ஸீ/g' -e 's/-sA/ஸா/g' -e 's/-se/ஸெ/g' -e 's/-sE/ஸே/g' -e 's/-so/ஸொ/g' -e 's/-sO/ஸோ/g' -e 's/-su/ஸு/g' -e 's/-sU/ஸூ/g' -e 's/-s/ஸ்/g' -e 's/Sau/ஸௌ/g' -e 's/Sai/ஸை/g' -e 's/See/ஸே/g' -e 's/Soo/ஸோ/g' -e 's/Saa/ஸா/g' -e 's/Suu/ஸூ/g' -e 's/Sii/ஸீ/g' -e 's/Sa/ஸ/g' -e 's/Si/ஸி/g' -e 's/SI/ஸீ/g' -e 's/SA/ஸா/g' -e 's/Se/ஸெ/g' -e 's/SE/ஸே/g' -e 's/So/ஸொ/g' -e 's/SO/ஸோ/g' -e 's/Su/ஸு/g' -e 's/SU/ஸூ/g' -e 's/S/ஸ்/g' -e 's/rau/ரௌ/g' -e 's/rai/ரை/g' -e 's/ree/ரே/g' -e 's/roo/ரோ/g' -e 's/raa/ரா/g' -e 's/ruu/ரூ/g' -e 's/rii/ரீ/g' -e 
's/ra/ர/g' -e 's/ri/ரி/g' -e 's/rI/ரீ/g' -e 's/rA/ரா/g' -e 's/re/ரெ/g' -e 's/rE/ரே/g' -e 's/ro/ரொ/g' -e 's/rO/ரோ/g' -e 's/ru/ரு/g' -e 's/rU/ரூ/g' -e 's/r/ர்/g' -e 's/Rau/றௌ/g' -e 's/Rai/றை/g' -e 's/Ree/றே/g' -e 's/Roo/றோ/g' -e 's/Raa/றா/g' -e 's/Ruu/றூ/g' -e 's/Rii/றீ/g' -e 's/Ra/ற/g' -e 's/Ri/றி/g' -e 's/RI/றீ/g' -e 's/RA/றா/g' -e 's/Re/றெ/g' -e 's/RE/றே/g' -e 's/Ro/றொ/g' -e 's/RO/றோ/g' -e 's/Ru/று/g' -e 's/RU/றூ/g' -e 's/R/ற்/g' -e 's/tau/டௌ/g' -e 's/tai/டை/g' -e 's/tee/டே/g' -e 's/too/டோ/g' -e 's/taa/டா/g' -e 's/tuu/டூ/g' -e 's/tii/டீ/g' -e 's/ta/ட/g' -e 's/ti/டி/g' -e 's/tI/டீ/g' -e 's/tA/டா/g' -e 's/te/டெ/g' -e 's/tE/டே/g' -e 's/to/டொ/g' -e 's/tO/டோ/g' -e 's/tu/டு/g' -e 's/tU/டூ/g' -e 's/t/ட்/g' -e 's/sau/சௌ/g' -e 's/sai/சை/g' -e 's/see/சே/g' -e 's/soo/சோ/g' -e 's/saa/சா/g' -e 's/suu/சூ/g' -e 's/sii/சீ/g' -e 's/sa/ச/g' -e 's/si/சி/g' -e 's/sI/சீ/g' -e 's/sA/சா/g' -e 's/se/செ/g' -e 's/sE/சே/g' -e 's/so/சொ/g' -e 's/sO/சோ/g' -e 's/su/சு/g' -e 's/sU/சூ/g' -e 's/s/ச்/g' -e 's/pau/பௌ/g' -e 's/pai/பை/g' -e 's/pee/பே/g' -e 's/poo/போ/g' -e 's/paa/பா/g' -e 's/puu/பூ/g' -e 's/pii/பீ/g' -e 's/pa/ப/g' -e 's/pi/பி/g' -e 's/pI/பீ/g' -e 's/pA/பா/g' -e 's/pe/பெ/g' -e 's/pE/பே/g' -e 's/po/பொ/g' -e 's/pO/போ/g' -e 's/pu/பு/g' -e 's/pU/பூ/g' -e 's/p/ப்/g' -e 's/bau/பௌ/g' -e 's/bai/பை/g' -e 's/bee/பே/g' -e 's/boo/போ/g' -e 's/baa/பா/g' -e 's/buu/பூ/g' -e 's/bii/பீ/g' -e 's/ba/ப/g' -e 's/bi/பி/g' -e 's/bI/பீ/g' -e 's/bA/பா/g' -e 's/be/பெ/g' -e 's/bE/பே/g' -e 's/bo/பொ/g' -e 's/bO/போ/g' -e 's/bu/பு/g' -e 's/bU/பூ/g' -e 's/b/ப்/g' -e 's/mau/மௌ/g' -e 's/mai/மை/g' -e 's/mee/மே/g' -e 's/moo/மோ/g' -e 's/maa/மா/g' -e 's/muu/மூ/g' -e 's/mii/மீ/g' -e 's/ma/ம/g' -e 's/mi/மி/g' -e 's/mI/மீ/g' -e 's/mA/மா/g' -e 's/me/மெ/g' -e 's/mE/மே/g' -e 's/mo/மொ/g' -e 's/mO/மோ/g' -e 's/mu/மு/g' -e 's/mU/மூ/g' -e 's/m/ம்/g' -e 's/yau/யௌ/g' -e 's/yai/யை/g' -e 's/yee/யே/g' -e 's/yoo/யோ/g' -e 's/yaa/யா/g' -e 's/yuu/யூ/g' -e 's/yii/யீ/g' -e 's/ya/ய/g' -e 's/yi/யி/g' -e 's/yI/யீ/g' -e 's/yA/யா/g' -e 's/ye/யெ/g' -e 
's/yE/யே/g' -e 's/yo/யொ/g' -e 's/yO/யோ/g' -e 's/yu/யு/g' -e 's/yU/யூ/g' -e 's/y/ய்/g' -e 's/dau/டௌ/g' -e 's/dai/டை/g' -e 's/dee/டே/g' -e 's/doo/டோ/g' -e 's/daa/டா/g' -e 's/duu/டூ/g' -e 's/dii/டீ/g' -e 's/da/ட/g' -e 's/di/டி/g' -e 's/dI/டீ/g' -e 's/dA/டா/g' -e 's/de/டெ/g' -e 's/dE/டே/g' -e 's/do/டொ/g' -e 's/dO/டோ/g' -e 's/du/டு/g' -e 's/dU/டூ/g' -e 's/d/ட்/g' -e 's/nau/னௌ/g' -e 's/nai/னை/g' -e 's/nee/னே/g' -e 's/noo/னோ/g' -e 's/naa/னா/g' -e 's/nuu/னூ/g' -e 's/nii/னீ/g' -e 's/na/ன/g' -e 's/ni/னி/g' -e 's/nI/னீ/g' -e 's/nA/னா/g' -e 's/ne/னெ/g' -e 's/nE/னே/g' -e 's/no/னொ/g' -e 's/nO/னோ/g' -e 's/nu/னு/g' -e 's/nU/னூ/g' -e 's/n/ன்/g' -e 's/Nau/ணௌ/g' -e 's/Nai/ணை/g' -e 's/Nee/ணே/g' -e 's/Noo/ணோ/g' -e 's/Naa/ணா/g' -e 's/Nuu/ணூ/g' -e 's/Nii/ணீ/g' -e 's/Na/ண/g' -e 's/Ni/ணி/g' -e 's/NI/ணீ/g' -e 's/NA/ணா/g' -e 's/Ne/ணெ/g' -e 's/NE/ணே/g' -e 's/No/ணொ/g' -e 's/NO/ணோ/g' -e 's/Nu/ணு/g' -e 's/NU/ணூ/g' -e 's/N/ண்/g' -e 's/lau/லௌ/g' -e 's/lai/லை/g' -e 's/lee/லே/g' -e 's/loo/லோ/g' -e 's/laa/லா/g' -e 's/luu/லூ/g' -e 's/lii/லீ/g' -e 's/la/ல/g' -e 's/li/லி/g' -e 's/lI/லீ/g' -e 's/lA/லா/g' -e 's/le/லெ/g' -e 's/lE/லே/g' -e 's/lo/லொ/g' -e 's/lO/லோ/g' -e 's/lu/லு/g' -e 's/lU/லூ/g' -e 's/l/ல்/g' -e 's/Lau/ளௌ/g' -e 's/Lai/ளை/g' -e 's/Lee/ளே/g' -e 's/Loo/ளோ/g' -e 's/Laa/ளா/g' -e 's/Luu/ளூ/g' -e 's/Lii/ளீ/g' -e 's/La/ள/g' -e 's/Li/ளி/g' -e 's/LI/ளீ/g' -e 's/LA/ளா/g' -e 's/Le/ளெ/g' -e 's/LE/ளே/g' -e 's/Lo/ளொ/g' -e 's/LO/ளோ/g' -e 's/Lu/ளு/g' -e 's/LU/ளூ/g' -e 's/L/ள்/g' -e 's/vau/வௌ/g' -e 's/vai/வை/g' -e 's/vee/வே/g' -e 's/voo/வோ/g' -e 's/vaa/வா/g' -e 's/vuu/வூ/g' -e 's/vii/வீ/g' -e 's/va/வ/g' -e 's/vi/வி/g' -e 's/vI/வீ/g' -e 's/vA/வா/g' -e 's/ve/வெ/g' -e 's/vE/வே/g' -e 's/vo/வொ/g' -e 's/vO/வோ/g' -e 's/vu/வு/g' -e 's/vU/வூ/g' -e 's/v/வ்/g' -e 's/gau/கௌ/g' -e 's/gai/கை/g' -e 's/gee/கே/g' -e 's/goo/கோ/g' -e 's/gaa/கா/g' -e 's/guu/கூ/g' -e 's/gii/கீ/g' -e 's/ga/க/g' -e 's/gi/கி/g' -e 's/gI/கீ/g' -e 's/gA/கா/g' -e 's/ge/கெ/g' -e 's/gE/கே/g' -e 's/go/கொ/g' -e 's/gO/கோ/g' -e 's/gu/கு/g' -e 's/gU/கூ/g' -e 
's/g/க்/g' -e 's/au/ஔ/g' -e 's/ai/ஐ/g' -e 's/aa/ஆ/g' -e 's/ee/ஏ/g' -e 's/ii/ஈ/g' -e 's/uu/ஊ/g' -e 's/oo/ஓ/g' -e 's/-1000/௲/g' -e 's/-100/௱/g' -e 's/-10/௰/g' -e 's/-1/௧/g' -e 's/-2/௨/g' -e 's/-3/௩/g' -e 's/-4/௪/g' -e 's/-5/௫/g' -e 's/-6/௬/g' -e 's/-7/௭/g' -e 's/-8/௮/g' -e 's/-9/௯/g' -e 's/i/இ/g' -e 's/I/ஈ/g' -e 's/a/அ/g' -e 's/A/ஆ/g' -e 's/e/எ/g' -e 's/E/ஏ/g' -e 's/i/இ/g' -e 's/I/ஈ/g' -e 's/u/உ/g' -e 's/U/ஊ/g' -e 's/o/ஒ/g' -e 's/O/ஓ/g' -e 's/q/ஃ/g' < $filename > $filename-converted.html
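One detail in the last line of the script: $filename is unquoted, so a path containing spaces (which a file-selection dialog can easily return) would not survive the redirection; bash typically reports an "ambiguous redirect" in that case. A sketch of the same redirection with quoting, using only two of the rules above to keep it short; the function name convert_file is made up for the example:

```shell
convert_file() {
    # $1 is the input path; the result is written next to it.
    sed -e 's/aa/ஆ/g' -e 's/a/அ/g' < "$1" > "$1-converted.html"
}

# Demo with a path that contains a space.
dir=$(mktemp -d)
printf 'aa a\n' > "$dir/my file.txt"
convert_file "$dir/my file.txt"
cat "$dir/my file.txt-converted.html"
rm -rf "$dir"
```

This prints "ஆ அ". The long rule appears before the short one for the same reason as in the full script: 'aa' has to be consumed before the rule for a single 'a' runs.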
07-12-2007, 03:50 PM
#7
Member
Registered: Jul 2004
Location: Rio de Janeiro - Brazil
Distribution: Conectiva 10 - Conectiva 8 - Slackware 9 - starting with LFS
Posts: 519
Hi mauran,
I guess you should look for "iconv", which is the tool to translate from one character set to another.
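For reference, a minimal iconv call looks like the sketch below. The encodings here are only an example (UTF-8 in, UTF-16LE out); which ones are actually needed depends on the files involved. The helper name is made up.

```shell
# Convert stdin from UTF-8 to UTF-16LE and show the resulting bytes in hex.
utf8_to_utf16le_hex() {
    iconv -f UTF-8 -t UTF-16LE | od -An -tx1 | tr -d ' \n'
    echo
}

# அ is U+0B85: e0 ae 85 in UTF-8, 85 0b in little-endian UTF-16.
printf '\340\256\205' | utf8_to_utf16le_hex
```

This prints 850b. The -f and -t options name the source and target encodings; `iconv -l` lists the names your system knows.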
07-12-2007, 04:03 PM
#8
LQ Veteran
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809
Quote:
Originally Posted by mauran
I'm typing the Tamil characters directly into the bash script using the scim input method.
I've almost finished my code.
For now I'm redirecting the output to an HTML file.
The encoding is easy to deal with in an HTML file.
Here is my code:
Good Grief!!!
I am tempted to tell you that I spotted an error on line 76, but I think you would know better.
Actually, that printout might make a neat desktop background.....
07-12-2007, 11:18 PM
#9
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
Original Poster
Quote:
Originally Posted by osvaldomarques
Hi mauran,
I guess you should look for "iconv", which is the tool to translate from one character set to another.
Thanks!!
That worked.
Now there is no need to redirect to HTML. :-)
07-12-2007, 11:25 PM
#10
LQ Guru
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733
Change the first line to "!#/bin/bash"
07-12-2007, 11:27 PM
#11
LQ Guru
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733
If you are using GNU sed, you can use the form:
Code:
sed 's/Xau/க்ஷௌ/g;s/Xai/க்ஷை/g;s/Xaa/க்ஷா/g'
which is the same as
sed -e 's/Xau/க்ஷௌ/g' -e 's/Xai/க்ஷை/g' -e 's/Xaa/க்ஷா/g'
But for such a long sed command, you might want to put the substitutions in a sed script file and pass that file as the argument to the -f option.
Last edited by jschiwal; 07-12-2007 at 11:28 PM .
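A sketch of the -f approach, with one substitution per line in a separate file. Only three rules from the long script are shown, and the file and function names are made up. Longer patterns are listed before shorter ones so that 'maa' is consumed before 'ma' and 'm' can match.

```shell
# Write the rules to a script file, one sed command per line.
rules=$(mktemp)
cat > "$rules" <<'EOF'
s/maa/மா/g
s/ma/ம/g
s/m/ம்/g
EOF

# Hypothetical wrapper: run stdin through the rules file.
transliterate() {
    sed -f "$rules"
}

printf 'maa ma m\n' | transliterate
# Remove "$rules" once it is no longer needed.
```

This prints "மா ம ம்". With the rules in their own file, the script that calls sed shrinks to a single short line and the table can be edited without touching any shell quoting.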
07-12-2007, 11:37 PM
#12
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
Original Poster
Quote:
Originally Posted by jschiwal
Change the first line to "!#/bin/bash"
May I know the reason for this?
#/bin/bash! is working for me.
And thank you for the short form.
07-12-2007, 11:56 PM
#13
LQ Veteran
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809
Actually, the books say: "#!/bin/bash"
But your version also works on my machine. However, my script also works if the line is deleted completely, so bash is evidently the default here.
Note that "#/bin/bash!" is likely just being seen as a comment.
07-12-2007, 11:59 PM
#14
LQ Guru
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733
#! are two magic characters that the kernel looks for at the very start of the file. If they are present, the rest of the line is taken as the interpreter to run.
#/bin/bash! is just plain wrong. Your script may work only because /bin/bash is already your default shell. Someone running your script from ksh or csh would not be as lucky, unless the rest of your script works in both shells.
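The kernel's check can be imitated from the shell: only a file whose first two bytes are #! has an interpreter line, and anything else that starts with # is just a comment to whatever shell ends up reading it. A small sketch (the function name has_shebang is hypothetical):

```shell
# Return success if the file starts with the two bytes "#!".
has_shebang() {
    [ "$(head -c 2 "$1")" = '#!' ]
}

good=$(mktemp)
bad=$(mktemp)
printf '%s\n' '#!/bin/bash' 'echo hello' > "$good"
printf '%s\n' '#/bin/bash!' 'echo hello' > "$bad"

for f in "$good" "$bad"; do
    if has_shebang "$f"; then
        echo "$(head -n 1 "$f"): interpreter line"
    else
        echo "$(head -n 1 "$f"): ordinary comment"
    fi
done
rm -f "$good" "$bad"
```

The first file is reported as having an interpreter line and the second as an ordinary comment, which is why the second form only works when the calling shell happens to be compatible with the script.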
07-13-2007, 02:46 AM
#15
LQ Newbie
Registered: Dec 2005
Location: Sri Lanka
Distribution: unbuntu 7.04
Posts: 17
Original Poster
Quote:
Originally Posted by jschiwal
Change the first line to "!#/bin/bash"
It gives this error:
Quote:
./roman.sh: line 1: !#/bin/bash: No such file or directory
:-(
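That message is what the shell prints when it treats the first line as a command to run: !#/bin/bash does not start with #, so it is neither an interpreter line nor a comment, and there is no file with that path. The same message can be reproduced without a script file. This is a sketch; the function name is made up, and the prefix in front of the message differs between bash versions.

```shell
# Ask a non-interactive bash to run the word "!#/bin/bash" as a command.
# The word contains a slash, so bash looks for a file of that name and fails.
try_bad_first_line() {
    LC_ALL=C bash -c '!#/bin/bash' 2>&1 || true
}

try_bad_first_line
```

The output ends in "!#/bin/bash: No such file or directory". The form quoted in post #13, "#!/bin/bash", is the one that works.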
All times are GMT -5.