LinuxQuestions.org


mauran 07-12-2007 08:02 AM

multiple character replacement by shell script
 
hello,

I want to write a shell script that converts a text file containing phonetically written Tamil text into a Tamil Unicode text file.

It works like this.

First, search for every occurrence of a pattern,

say, 'aa',

then replace that pattern with the Unicode character "அ".

Please help me.

pixellany 07-12-2007 08:38 AM

Have you read up on any of the standard utilities, e.g. sed and awk?

One common solution is the sed "substitute" command. Suppose you wanted to replace all instances of "aa" with "bb":
sed 's/aa/bb/g' <oldfile >newfile

I am sure there is a way to put in a hex byte instead of "bb", but I don't have it in front of me.
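One hedged way to fill in the "hex byte" idea: let printf build the replacement from its byte values and hand the result to sed. This sketch assumes the target is ஆ (U+0B86, UTF-8 bytes E0 AE 86), written in octal because POSIX printf only guarantees octal escapes. The function name replace_aa is made up for illustration.

```shell
# Hypothetical helper: replace every "aa" on stdin with the Tamil letter
# U+0B86, whose UTF-8 encoding is the three bytes E0 AE 86.
replace_aa() {
    # Octal 340 256 206 is hex E0 AE 86. POSIX printf understands octal
    # escapes; \x escapes are a common extension and less portable.
    repl=$(printf '\340\256\206')
    sed "s/aa/${repl}/g"
}

# Example:
#   printf 'aa bb aa\n' | replace_aa
```

Typing the character directly into the sed command should work just as well when the script file itself is saved as UTF-8.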

Really good sed, awk, and other tutorials here:
http://www.grymoire.com/Unix/

mauran 07-12-2007 02:31 PM

pixellany,

Thank you very much.

I got the idea from your reply and made a small script to achieve a demo conversion.

here is my script.

Quote:

sed -e 's/ma/ம/g' -e 's/yuu/யூ/g' -e 's/ra/ர/g' -e 's/n/ன்/g' < mauran >mauran1
But it gave me a new problem.

The output file contains a complicated UTF-8 mess like this:

Quote:

மuரன் மuரன் மரm மரன்am
Does the > operator handle Unicode well?
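The leftover Latin letters in the quoted output look more like a rule-ordering effect than an encoding fault: `s/ma/` fires first, and none of the four rules covers a bare "u". A small sketch, assuming the input word is "mauran" and borrowing a longer `mau` rule from the full script posted later in the thread. The function names are illustrative only.

```shell
# Rules as in the post: "ma" is consumed first, so the "u" is left behind.
translit_short_first() {
    sed -e 's/ma/ம/g' -e 's/yuu/யூ/g' -e 's/ra/ர/g' -e 's/n/ன்/g'
}

# Same rules, but the longer pattern "mau" is tried before "ma".
translit_long_first() {
    sed -e 's/mau/மௌ/g' -e 's/ma/ம/g' -e 's/yuu/யூ/g' -e 's/ra/ர/g' -e 's/n/ன்/g'
}

# Example:
#   printf 'mauran\n' | translit_short_first   # a Latin "u" survives
#   printf 'mauran\n' | translit_long_first    # no Latin letters remain
```

Because sed applies the `-e` expressions in the order given, listing longer patterns before their prefixes is the simplest way to get longest-match behaviour.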

mauran 07-12-2007 02:35 PM

update!
 
I saved the output file as file.html and opened it in Firefox. When I set Firefox's encoding to UTF-8, it shows the characters correctly.

So the problem is:

how to handle UTF-8 encoding in sed and >.

pixellany 07-12-2007 03:15 PM

The "<" ">" operators are for redirection and do not care what the data encoding is.
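A quick way to check this, sketched with a temporary file: bytes written through `>` and read back through `<` should come out unchanged. The function name and the sample bytes (the UTF-8 encoding of U+0B86) are assumptions for illustration.

```shell
# Write three UTF-8 bytes plus a newline through ">" and read them back
# through "<". Prints "identical" if redirection left the bytes untouched.
redirect_roundtrip() {
    tmp=$(mktemp) || return 1
    printf '\340\256\206\n' > "$tmp"
    if [ "$(od -An -tx1 < "$tmp" | tr -d ' \n')" = "e0ae860a" ]; then
        echo identical
    else
        echo changed
    fi
    rm -f "$tmp"
}
```

If the bytes are intact but the text still looks wrong, the encoding setting of the viewer or terminal is the more likely suspect.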

How did you get the special characters into the sed commands? You may need to run some experiments to see which characters get correctly handled by sed. (I've never seen anything about this in the various books on sed.)

Did you find anything on how to input raw hex bytes using sed?

mauran 07-12-2007 03:45 PM

I type the Tamil characters directly into the bash script using the SCIM input method.

I've almost finished my code.

Now I'm redirecting the output to an HTML file.
Encoding issues are easier to handle in an HTML file.

Here is my code:

Quote:

#/bin/bash!

filename=`zenity --file-selection`

sed -e 's/Xau/க்ஷௌ/g' -e 's/Xai/க்ஷை/g' -e 's/Xaa/க்ஷா/g' -e 's/XA/க்ஷா/g' -e 's/Xa/க்ஷ/g' -e 's/Xii/க்ஷீ/g' -e 's/Xi/க்ஷி/g' -e 's/XI/க்ஷீ/g' -e 's/Xuu/க்ஷூ/g' -e 's/Xu/க்ஷு/g' -e 's/XU/க்ஷூ/g' -e 's/Xee/க்ஷே/g' -e 's/Xe/க்ஷெ/g' -e 's/XE/க்ஷே/g' -e 's/Xoo/க்ஷோ/g' -e 's/Xo/க்ஷொ/g' -e 's/XO/க்ஷோ/g' -e 's/X/க்ஷ்/g' -e 's/njau/ஞௌ/g' -e 's/njai/ஞை/g' -e 's/njee/ஞே/g' -e 's/njoo/ஞோ/g' -e 's/njaa/ஞா/g' -e 's/njuu/ஞூ/g' -e 's/njii/ஞீ/g' -e 's/nja/ஞ/g' -e 's/nji/ஞி/g' -e 's/njI/ஞீ/g' -e 's/njA/ஞா/g' -e 's/nje/ஞெ/g' -e 's/njE/ஞே/g' -e 's/njo/ஞொ/g' -e 's/njO/ஞோ/g' -e 's/nju/ஞு/g' -e 's/njU/ஞூ/g' -e 's/nj/ஞ்/g' -e 's/ngau/ஙௌ/g' -e 's/ngai/ஙை/g' -e 's/ngee/ஙே/g' -e 's/ngoo/ஙோ/g' -e 's/ngaa/ஙா/g' -e 's/nguu/ஙூ/g' -e 's/ngii/ஙீ/g' -e 's/nga/ங/g' -e 's/ngi/ஙி/g' -e 's/ngI/ஙீ/g' -e 's/ngA/ஙா/g' -e 's/nge/ஙெ/g' -e 's/ngE/ஙே/g' -e 's/ngo/ஙொ/g' -e 's/ngO/ஙோ/g' -e 's/ngu/ஙு/g' -e 's/ngU/ஙூ/g' -e 's/ng/ங்/g' -e 's/shau/ஷௌ/g' -e 's/shai/ஷை/g' -e 's/shee/ஷே/g' -e 's/shoo/ஷோ/g' -e 's/shaa/ஷா/g' -e 's/shuu/ஷூ/g' -e 's/shii/ஷீ/g' -e 's/sha/ஷ/g' -e 's/shi/ஷி/g' -e 's/shI/ஷீ/g' -e 's/shA/ஷா/g' -e 's/she/ஷெ/g' -e 's/shE/ஷே/g' -e 's/sho/ஷொ/g' -e 's/shO/ஷோ/g' -e 's/shu/ஷு/g' -e 's/shU/ஷூ/g' -e 's/sh/ஷ்/g' -e 's/ nau/ நௌ/g' -e 's/ nai/ நை/g' -e 's/ nee/ நே/g' -e 's/ noo/ நோ/g' -e 's/ naa/ நா/g' -e 's/ nuu/ நூ/g' -e 's/ nii/ நீ/g' -e 's/ na/ ந/g' -e 's/ ni/ நி/g' -e 's/ nI/ நீ/g' -e 's/ nA/ நா/g' -e 's/ ne/ நெ/g' -e 's/ nE/ நே/g' -e 's/ no/ நொ/g' -e 's/ nO/ நோ/g' -e 's/ nu/ நு/g' -e 's/ nU/ நூ/g' -e 's/ nth/ ந்/g' -e 's/-nau/நௌ/g' -e 's/-nai/நை/g' -e 's/-nee/நே/g' -e 's/-noo/நோ/g' -e 's/-naa/நா/g' -e 's/-nuu/நூ/g' -e 's/-nii/நீ/g' -e 's/-na/ந/g' -e 's/-ni/நி/g' -e 's/-nI/நீ/g' -e 's/-nA/நா/g' -e 's/-ne/நெ/g' -e 's/-nE/நே/g' -e 's/-no/நொ/g' -e 's/-nO/நோ/g' -e 's/-nu/நு/g' -e 's/-nU/நூ/g' -e 's/n-au/நௌ/g' -e 's/n-ai/நை/g' -e 's/n -ee/நே/g' -e 's/n-oo/நோ/g' -e 's/n-aa/நா/g' -e 's/n-uu/நூ/g' -e 's/n-ii/நீ/g' -e 's/n-a/ந/g' -e 's/n-i/நி/g' -e 's/n-I/நீ/g' -e 's/n-A/நா/g' -e 's/n -e/நெ/g' -e 's/n 
-e/நே/g' -e 's/n-o/நொ/g' -e 's/n-O/நோ/g' -e 's/n-u/நு/g' -e 's/n-U/நூ/g' -e 's/wau/நௌ/g' -e 's/wai/நை/g' -e 's/wee/நே/g' -e 's/woo/நோ/g' -e 's/waa/நா/g' -e 's/wuu/நூ/g' -e 's/wii/நீ/g' -e 's/wa/ந/g' -e 's/wi/நி/g' -e 's/wI/நீ/g' -e 's/wA/நா/g' -e 's/we/நெ/g' -e 's/wE/நே/g' -e 's/wo/நொ/g' -e 's/wO/நோ/g' -e 's/wu/நு/g' -e 's/wU/நூ/g' -e 's/ n/ ந்/g' -e 's/n-/ந்/g' -e 's/-n/ந்/g' -e 's/w/ந்/g' -e 's/nthau/ந்தௌ/g' -e 's/nthai/ந்தை/g' -e 's/nthee/ந்தே/g' -e 's/nthoo/ந்தோ/g' -e 's/nthaa/ந்தா/g' -e 's/nthuu/ந்தூ/g' -e 's/nthii/ந்தீ/g' -e 's/ntha/ந்த/g' -e 's/nthi/ந்தி/g' -e 's/nthI/ந்தீ/g' -e 's/nthA/ந்தா/g' -e 's/nthe/ந்தெ/g' -e 's/nthE/ந்தே/g' -e 's/ntho/ந்தொ/g' -e 's/nthO/ந்தோ/g' -e 's/nthu/ந்து/g' -e 's/nthU/ந்தூ/g' -e 's/nth/ந்/g' -e 's/dhau/தௌ/g' -e 's/dhai/தை/g' -e 's/dhee/தே/g' -e 's/dhoo/தோ/g' -e 's/dhaa/தா/g' -e 's/dhuu/தூ/g' -e 's/dhii/தீ/g' -e 's/dha/த/g' -e 's/dhi/தி/g' -e 's/dhI/தீ/g' -e 's/dhA/தா/g' -e 's/dhe/தெ/g' -e 's/dhE/தே/g' -e 's/dho/தொ/g' -e 's/dhO/தோ/g' -e 's/dhu/து/g' -e 's/dhU/தூ/g' -e 's/dh/த்/g' -e 's/chau/சௌ/g' -e 's/chai/சை/g' -e 's/chee/சே/g' -e 's/choo/சோ/g' -e 's/chaa/சா/g' -e 's/chuu/சூ/g' -e 's/chii/சீ/g' -e 's/cha/ச/g' -e 's/chi/சி/g' -e 's/chI/சீ/g' -e 's/chA/சா/g' -e 's/che/செ/g' -e 's/chE/சே/g' -e 's/cho/சொ/g' -e 's/chO/சோ/g' -e 's/chu/சு/g' -e 's/chU/சூ/g' -e 's/ch/ச்/g' -e 's/zhau/ழௌ/g' -e 's/zhai/ழை/g' -e 's/zhee/ழே/g' -e 's/zhoo/ழோ/g' -e 's/zhaa/ழா/g' -e 's/zhuu/ழூ/g' -e 's/zhii/ழீ/g' -e 's/zha/ழ/g' -e 's/zhi/ழி/g' -e 's/zhI/ழீ/g' -e 's/zhA/ழா/g' -e 's/zhe/ழெ/g' -e 's/zhE/ழே/g' -e 's/zho/ழொ/g' -e 's/zhO/ழோ/g' -e 's/zhu/ழு/g' -e 's/zhU/ழூ/g' -e 's/zh/ழ்/g' -e 's/zau/ழௌ/g' -e 's/zai/ழை/g' -e 's/zee/ழே/g' -e 's/zoo/ழோ/g' -e 's/zaa/ழா/g' -e 's/zuu/ழூ/g' -e 's/zii/ழீ/g' -e 's/za/ழ/g' -e 's/zi/ழி/g' -e 's/zI/ழீ/g' -e 's/zA/ழா/g' -e 's/ze/ழெ/g' -e 's/zE/ழே/g' -e 's/zo/ழொ/g' -e 's/zO/ழோ/g' -e 's/zu/ழு/g' -e 's/zU/ழூ/g' -e 's/z/ழ்/g' -e 's/jau/ஜௌ/g' -e 's/jai/ஜை/g' -e 's/jee/ஜே/g' -e 's/joo/ஜோ/g' -e 's/jaa/ஜா/g' -e 's/juu/ஜூ/g' -e 
's/jii/ஜீ/g' -e 's/ja/ஜ/g' -e 's/ji/ஜி/g' -e 's/jI/ஜீ/g' -e 's/jA/ஜா/g' -e 's/je/ஜெ/g' -e 's/jE/ஜே/g' -e 's/jo/ஜொ/g' -e 's/jO/ஜோ/g' -e 's/ju/ஜு/g' -e 's/jU/ஜூ/g' -e 's/j/ஜ்/g' -e 's/thau/தௌ/g' -e 's/thai/தை/g' -e 's/thee/தே/g' -e 's/thoo/தோ/g' -e 's/thaa/தா/g' -e 's/thuu/தூ/g' -e 's/thii/தீ/g' -e 's/tha/த/g' -e 's/thi/தி/g' -e 's/thI/தீ/g' -e 's/thA/தா/g' -e 's/the/தெ/g' -e 's/thE/தே/g' -e 's/tho/தொ/g' -e 's/thO/தோ/g' -e 's/thu/து/g' -e 's/thU/தூ/g' -e 's/th/த்/g' -e 's/-hau/ஹௌ/g' -e 's/-hai/ஹை/g' -e 's/-hee/ஹே/g' -e 's/-hoo/ஹோ/g' -e 's/-haa/ஹா/g' -e 's/-huu/ஹூ/g' -e 's/-hii/ஹீ/g' -e 's/-ha/ஹ/g' -e 's/-hi/ஹி/g' -e 's/-hI/ஹீ/g' -e 's/-hA/ஹா/g' -e 's/-he/ஹெ/g' -e 's/-hE/ஹே/g' -e 's/-ho/ஹொ/g' -e 's/-hO/ஹோ/g' -e 's/-hu/ஹு/g' -e 's/-hU/ஹூ/g' -e 's/-h/ஹ்/g' -e 's/hau/கௌ/g' -e 's/hai/கை/g' -e 's/hee/கே/g' -e 's/hoo/கோ/g' -e 's/haa/கா/g' -e 's/huu/கூ/g' -e 's/hii/கீ/g' -e 's/ha/க/g' -e 's/hi/கி/g' -e 's/hI/கீ/g' -e 's/hA/கா/g' -e 's/he/கெ/g' -e 's/hE/கே/g' -e 's/ho/கொ/g' -e 's/hO/கோ/g' -e 's/hu/கு/g' -e 's/hU/கூ/g' -e 's/h/க்/g' -e 's/kau/கௌ/g' -e 's/kai/கை/g' -e 's/kee/கே/g' -e 's/koo/கோ/g' -e 's/kaa/கா/g' -e 's/kuu/கூ/g' -e 's/kii/கீ/g' -e 's/ka/க/g' -e 's/ki/கி/g' -e 's/kI/கீ/g' -e 's/kA/கா/g' -e 's/ke/கெ/g' -e 's/kE/கே/g' -e 's/ko/கொ/g' -e 's/kO/கோ/g' -e 's/ku/கு/g' -e 's/kU/கூ/g' -e 's/k/க்/g' -e 's/-sau/ஸௌ/g' -e 's/-sai/ஸை/g' -e 's/-see/ஸே/g' -e 's/-soo/ஸோ/g' -e 's/-saa/ஸா/g' -e 's/-suu/ஸூ/g' -e 's/-sii/ஸீ/g' -e 's/-sa/ஸ/g' -e 's/-si/ஸி/g' -e 's/-sI/ஸீ/g' -e 's/-sA/ஸா/g' -e 's/-se/ஸெ/g' -e 's/-sE/ஸே/g' -e 's/-so/ஸொ/g' -e 's/-sO/ஸோ/g' -e 's/-su/ஸு/g' -e 's/-sU/ஸூ/g' -e 's/-s/ஸ்/g' -e 's/Sau/ஸௌ/g' -e 's/Sai/ஸை/g' -e 's/See/ஸே/g' -e 's/Soo/ஸோ/g' -e 's/Saa/ஸா/g' -e 's/Suu/ஸூ/g' -e 's/Sii/ஸீ/g' -e 's/Sa/ஸ/g' -e 's/Si/ஸி/g' -e 's/SI/ஸீ/g' -e 's/SA/ஸா/g' -e 's/Se/ஸெ/g' -e 's/SE/ஸே/g' -e 's/So/ஸொ/g' -e 's/SO/ஸோ/g' -e 's/Su/ஸு/g' -e 's/SU/ஸூ/g' -e 's/S/ஸ்/g' -e 's/rau/ரௌ/g' -e 's/rai/ரை/g' -e 's/ree/ரே/g' -e 's/roo/ரோ/g' -e 's/raa/ரா/g' -e 's/ruu/ரூ/g' -e 's/rii/ரீ/g' -e 
's/ra/ர/g' -e 's/ri/ரி/g' -e 's/rI/ரீ/g' -e 's/rA/ரா/g' -e 's/re/ரெ/g' -e 's/rE/ரே/g' -e 's/ro/ரொ/g' -e 's/rO/ரோ/g' -e 's/ru/ரு/g' -e 's/rU/ரூ/g' -e 's/r/ர்/g' -e 's/Rau/றௌ/g' -e 's/Rai/றை/g' -e 's/Ree/றே/g' -e 's/Roo/றோ/g' -e 's/Raa/றா/g' -e 's/Ruu/றூ/g' -e 's/Rii/றீ/g' -e 's/Ra/ற/g' -e 's/Ri/றி/g' -e 's/RI/றீ/g' -e 's/RA/றா/g' -e 's/Re/றெ/g' -e 's/RE/றே/g' -e 's/Ro/றொ/g' -e 's/RO/றோ/g' -e 's/Ru/று/g' -e 's/RU/றூ/g' -e 's/R/ற்/g' -e 's/tau/டௌ/g' -e 's/tai/டை/g' -e 's/tee/டே/g' -e 's/too/டோ/g' -e 's/taa/டா/g' -e 's/tuu/டூ/g' -e 's/tii/டீ/g' -e 's/ta/ட/g' -e 's/ti/டி/g' -e 's/tI/டீ/g' -e 's/tA/டா/g' -e 's/te/டெ/g' -e 's/tE/டே/g' -e 's/to/டொ/g' -e 's/tO/டோ/g' -e 's/tu/டு/g' -e 's/tU/டூ/g' -e 's/t/ட்/g' -e 's/sau/சௌ/g' -e 's/sai/சை/g' -e 's/see/சே/g' -e 's/soo/சோ/g' -e 's/saa/சா/g' -e 's/suu/சூ/g' -e 's/sii/சீ/g' -e 's/sa/ச/g' -e 's/si/சி/g' -e 's/sI/சீ/g' -e 's/sA/சா/g' -e 's/se/செ/g' -e 's/sE/சே/g' -e 's/so/சொ/g' -e 's/sO/சோ/g' -e 's/su/சு/g' -e 's/sU/சூ/g' -e 's/s/ச்/g' -e 's/pau/பௌ/g' -e 's/pai/பை/g' -e 's/pee/பே/g' -e 's/poo/போ/g' -e 's/paa/பா/g' -e 's/puu/பூ/g' -e 's/pii/பீ/g' -e 's/pa/ப/g' -e 's/pi/பி/g' -e 's/pI/பீ/g' -e 's/pA/பா/g' -e 's/pe/பெ/g' -e 's/pE/பே/g' -e 's/po/பொ/g' -e 's/pO/போ/g' -e 's/pu/பு/g' -e 's/pU/பூ/g' -e 's/p/ப்/g' -e 's/bau/பௌ/g' -e 's/bai/பை/g' -e 's/bee/பே/g' -e 's/boo/போ/g' -e 's/baa/பா/g' -e 's/buu/பூ/g' -e 's/bii/பீ/g' -e 's/ba/ப/g' -e 's/bi/பி/g' -e 's/bI/பீ/g' -e 's/bA/பா/g' -e 's/be/பெ/g' -e 's/bE/பே/g' -e 's/bo/பொ/g' -e 's/bO/போ/g' -e 's/bu/பு/g' -e 's/bU/பூ/g' -e 's/b/ப்/g' -e 's/mau/மௌ/g' -e 's/mai/மை/g' -e 's/mee/மே/g' -e 's/moo/மோ/g' -e 's/maa/மா/g' -e 's/muu/மூ/g' -e 's/mii/மீ/g' -e 's/ma/ம/g' -e 's/mi/மி/g' -e 's/mI/மீ/g' -e 's/mA/மா/g' -e 's/me/மெ/g' -e 's/mE/மே/g' -e 's/mo/மொ/g' -e 's/mO/மோ/g' -e 's/mu/மு/g' -e 's/mU/மூ/g' -e 's/m/ம்/g' -e 's/yau/யௌ/g' -e 's/yai/யை/g' -e 's/yee/யே/g' -e 's/yoo/யோ/g' -e 's/yaa/யா/g' -e 's/yuu/யூ/g' -e 's/yii/யீ/g' -e 's/ya/ய/g' -e 's/yi/யி/g' -e 's/yI/யீ/g' -e 's/yA/யா/g' -e 's/ye/யெ/g' -e 
's/yE/யே/g' -e 's/yo/யொ/g' -e 's/yO/யோ/g' -e 's/yu/யு/g' -e 's/yU/யூ/g' -e 's/y/ய்/g' -e 's/dau/டௌ/g' -e 's/dai/டை/g' -e 's/dee/டே/g' -e 's/doo/டோ/g' -e 's/daa/டா/g' -e 's/duu/டூ/g' -e 's/dii/டீ/g' -e 's/da/ட/g' -e 's/di/டி/g' -e 's/dI/டீ/g' -e 's/dA/டா/g' -e 's/de/டெ/g' -e 's/dE/டே/g' -e 's/do/டொ/g' -e 's/dO/டோ/g' -e 's/du/டு/g' -e 's/dU/டூ/g' -e 's/d/ட்/g' -e 's/nau/னௌ/g' -e 's/nai/னை/g' -e 's/nee/னே/g' -e 's/noo/னோ/g' -e 's/naa/னா/g' -e 's/nuu/னூ/g' -e 's/nii/னீ/g' -e 's/na/ன/g' -e 's/ni/னி/g' -e 's/nI/னீ/g' -e 's/nA/னா/g' -e 's/ne/னெ/g' -e 's/nE/னே/g' -e 's/no/னொ/g' -e 's/nO/னோ/g' -e 's/nu/னு/g' -e 's/nU/னூ/g' -e 's/n/ன்/g' -e 's/Nau/ணௌ/g' -e 's/Nai/ணை/g' -e 's/Nee/ணே/g' -e 's/Noo/ணோ/g' -e 's/Naa/ணா/g' -e 's/Nuu/ணூ/g' -e 's/Nii/ணீ/g' -e 's/Na/ண/g' -e 's/Ni/ணி/g' -e 's/NI/ணீ/g' -e 's/NA/ணா/g' -e 's/Ne/ணெ/g' -e 's/NE/ணே/g' -e 's/No/ணொ/g' -e 's/NO/ணோ/g' -e 's/Nu/ணு/g' -e 's/NU/ணூ/g' -e 's/N/ண்/g' -e 's/lau/லௌ/g' -e 's/lai/லை/g' -e 's/lee/லே/g' -e 's/loo/லோ/g' -e 's/laa/லா/g' -e 's/luu/லூ/g' -e 's/lii/லீ/g' -e 's/la/ல/g' -e 's/li/லி/g' -e 's/lI/லீ/g' -e 's/lA/லா/g' -e 's/le/லெ/g' -e 's/lE/லே/g' -e 's/lo/லொ/g' -e 's/lO/லோ/g' -e 's/lu/லு/g' -e 's/lU/லூ/g' -e 's/l/ல்/g' -e 's/Lau/ளௌ/g' -e 's/Lai/ளை/g' -e 's/Lee/ளே/g' -e 's/Loo/ளோ/g' -e 's/Laa/ளா/g' -e 's/Luu/ளூ/g' -e 's/Lii/ளீ/g' -e 's/La/ள/g' -e 's/Li/ளி/g' -e 's/LI/ளீ/g' -e 's/LA/ளா/g' -e 's/Le/ளெ/g' -e 's/LE/ளே/g' -e 's/Lo/ளொ/g' -e 's/LO/ளோ/g' -e 's/Lu/ளு/g' -e 's/LU/ளூ/g' -e 's/L/ள்/g' -e 's/vau/வௌ/g' -e 's/vai/வை/g' -e 's/vee/வே/g' -e 's/voo/வோ/g' -e 's/vaa/வா/g' -e 's/vuu/வூ/g' -e 's/vii/வீ/g' -e 's/va/வ/g' -e 's/vi/வி/g' -e 's/vI/வீ/g' -e 's/vA/வா/g' -e 's/ve/வெ/g' -e 's/vE/வே/g' -e 's/vo/வொ/g' -e 's/vO/வோ/g' -e 's/vu/வு/g' -e 's/vU/வூ/g' -e 's/v/வ்/g' -e 's/gau/கௌ/g' -e 's/gai/கை/g' -e 's/gee/கே/g' -e 's/goo/கோ/g' -e 's/gaa/கா/g' -e 's/guu/கூ/g' -e 's/gii/கீ/g' -e 's/ga/க/g' -e 's/gi/கி/g' -e 's/gI/கீ/g' -e 's/gA/கா/g' -e 's/ge/கெ/g' -e 's/gE/கே/g' -e 's/go/கொ/g' -e 's/gO/கோ/g' -e 's/gu/கு/g' -e 's/gU/கூ/g' -e 
's/g/க்/g' -e 's/au/ஔ/g' -e 's/ai/ஐ/g' -e 's/aa/ஆ/g' -e 's/ee/ஏ/g' -e 's/ii/ஈ/g' -e 's/uu/ஊ/g' -e 's/oo/ஓ/g' -e 's/-1000/௲/g' -e 's/-100/௱/g' -e 's/-10/௰/g' -e 's/-1/௧/g' -e 's/-2/௨/g' -e 's/-3/௩/g' -e 's/-4/௪/g' -e 's/-5/௫/g' -e 's/-6/௬/g' -e 's/-7/௭/g' -e 's/-8/௮/g' -e 's/-9/௯/g' -e 's/i/இ/g' -e 's/I/ஈ/g' -e 's/a/அ/g' -e 's/A/ஆ/g' -e 's/e/எ/g' -e 's/E/ஏ/g' -e 's/i/இ/g' -e 's/I/ஈ/g' -e 's/u/உ/g' -e 's/U/ஊ/g' -e 's/o/ஒ/g' -e 's/O/ஓ/g' -e 's/q/ஃ/g' < $filename > $filename-converted.html
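One thing the script above leaves open is a file name containing spaces: `$filename` is unquoted in both redirections. A hedged sketch of the same final step with quoting and an empty-name check; `convert_file` and the one-rule stand-in for the long sed command are placeholders, not the poster's code.

```shell
# convert_file PATH: run the conversion on PATH, writing PATH-converted.html.
# The single sed rule stands in for the full rule list from the post.
convert_file() {
    filename=$1
    # An empty name is what a cancelled file dialog is expected to yield,
    # so bail out before touching any files.
    [ -n "$filename" ] || { echo "no file selected" >&2; return 1; }
    sed -e 's/aa/ஆ/g' < "$filename" > "$filename-converted.html"
}

# Possible use with the dialog from the post:
#   convert_file "$(zenity --file-selection)"
```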

osvaldomarques 07-12-2007 04:50 PM

Hi mauran,

I guess you should look for "iconv", which is the tool to translate from one character set to another.
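For reference, a minimal sketch of an iconv call. The thread does not say which encodings were involved, so the ISO-8859-1 to UTF-8 pair here is purely an assumed example; `-f` and `-t` name the source and target encodings, and `to_utf8` is a made-up wrapper name.

```shell
# to_utf8 FROM_ENCODING: convert stdin from FROM_ENCODING to UTF-8.
to_utf8() {
    iconv -f "$1" -t UTF-8
}

# Example: byte E9 is "é" in ISO-8859-1 and becomes C3 A9 in UTF-8.
#   printf '\351' | to_utf8 ISO-8859-1 | od -An -tx1
```

Running `iconv -l` lists the encoding names the local iconv accepts.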

pixellany 07-12-2007 05:03 PM

Quote:

Originally Posted by mauran
I type the Tamil characters directly into the bash script using the SCIM input method.

I've almost finished my code.

Now I'm redirecting the output to an HTML file.
Encoding issues are easier to handle in an HTML file.

Here is my code:

Good Grief!!!
I am tempted to tell you that I spotted an error on line 76, but I think you would know better.

Actually, that printout might make a neat desktop background.....;)

mauran 07-13-2007 12:18 AM

Quote:

Originally Posted by osvaldomarques
Hi mauran,

I guess you should look for "iconv", which is the tool to translate from one character set to another.


Thanks!!

That worked.

Now there's no need to redirect to HTML. :-)

jschiwal 07-13-2007 12:25 AM

Quote:

#/bin/bash!
Change the first line to "!#/bin/bash"

jschiwal 07-13-2007 12:27 AM

If you are using GNU sed, you can use the form:
Code:

sed 's/Xau/க்ஷௌ/g;s/Xai/க்ஷை/g;s/Xaa/க்ஷா/g'

which is the same as

sed -e 's/Xau/க்ஷௌ/g' -e 's/Xai/க்ஷை/g' -e 's/Xaa/க்ஷா/g'

But for such a long list of commands, you might want to put them in a sed script file and pass that file as the argument to the -f option.
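A sketch of the -f approach, assuming a rules file with one substitution per line and longer patterns listed first. The temporary file and the three sample rules are illustrative; the full rule list would go into the same file.

```shell
# Write a tiny rules file, then apply it with "sed -f".
rules=$(mktemp) || exit 1
cat > "$rules" <<'EOF'
s/Xau/க்ஷௌ/g
s/Xai/க்ஷை/g
s/Xaa/க்ஷா/g
EOF

# translit: apply the rules file to stdin.
translit() {
    sed -f "$rules"
}

# Example:
#   printf 'Xaa Xai\n' | translit
```

Keeping the rules in their own file also sidesteps shell quoting and command-line length concerns, and makes the list easier to review line by line.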

mauran 07-13-2007 12:37 AM

Quote:

Originally Posted by jschiwal
Change the first line to "!#/bin/bash"

Can I know the reason for this?

#/bin/bash! is working for me.

And thank you for the short form.

pixellany 07-13-2007 12:56 AM

Actually, the books say: "#!/bin/bash"

But your version also works on my machine. However, my script also works if the line is deleted completely. Obviously, bash is the default.

Note that "#/bin/bash!" is likely just being seen as a comment.

jschiwal 07-13-2007 12:59 AM

#! are two magic characters that the kernel looks for at the very start of the file. If they are present, the rest of the line is taken as the interpreter to run.
#/bin/bash! is just plain wrong. Your script may work only because /bin/bash is already your default shell. Someone running your script from ksh or csh would not be so lucky, unless the rest of your script happens to work in both shells.

mauran 07-13-2007 03:46 AM

Quote:

Originally Posted by jschiwal
Change the first line to "!#/bin/bash"


It gives this error:

Quote:

./roman.sh: line 1: !#/bin/bash: No such file or directory
:-(
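The error above fits the two characters being transposed: an interpreter line starts with `#!`, as in the "#!/bin/bash" form quoted earlier in the thread, whereas a line starting with `!#` is not a comment, so the shell tries to run it as a command. A small sketch of a check for this; `has_shebang` is a made-up helper name.

```shell
# has_shebang FILE: succeed if the first line of FILE starts with "#!".
has_shebang() {
    first_line=
    # read fails on a final line without a newline, so also accept
    # whatever text it managed to store.
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case $first_line in
        '#!'*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example:
#   has_shebang ./roman.sh && echo "interpreter line looks right"
```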


All times are GMT -5.