LinuxQuestions.org

kitonia 10-09-2012 01:57 PM

How to remove accent characters
 
Does anyone know a command to remove accent characters? For example, I would like the name Renée to become Renee.

colucix 10-09-2012 03:41 PM

Accented characters are not easy to manage: in UTF-8 they are encoded as multi-byte sequences (two bytes for é and most other accented Latin letters), so matching them with octal or hexadecimal codes quickly becomes a mess. On the other hand, if you can type them on the command line, you can always translate them literally, e.g.
Code:

echo Renée | sed 's/é/e/'
But if you want to do this for every accented character (umlaut, caron, etc.) in text files, you need to write out a long sequence of character lists, e.g.
Code:

sed -e 's/[èéêë]/e/g' -e 's/[àáâãäå]/a/g' ...
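
One caveat with bracket lists like the one above: they only behave as intended when sed runs in a UTF-8 locale. In the C locale each byte of a multi-byte character is matched on its own, which can mangle the text. Below is a minimal sketch that sidesteps this by spelling out one substitution per character. The function name strip_accents and the short character list are illustrative assumptions, not part of the original post, and the list is far from complete.

```shell
# Hypothetical helper: replace a handful of accented Latin letters with
# their plain ASCII counterparts, one literal substitution per character.
# Each pattern is a literal byte sequence, so this works in any locale.
strip_accents() {
    sed -e 's/à/a/g' -e 's/á/a/g' -e 's/â/a/g' -e 's/ä/a/g' \
        -e 's/è/e/g' -e 's/é/e/g' -e 's/ê/e/g' -e 's/ë/e/g' \
        -e 's/ò/o/g' -e 's/ó/o/g' -e 's/ô/o/g' -e 's/ö/o/g'
}

# Reads standard input, writes the result to standard output.
printf 'Renée\n' | strip_accents   # prints: Renee
```

Uppercase letters and other scripts would need their own entries, which is exactly why the list grows so quickly.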
A more convenient way is to use the iconv command to convert the character encoding from UTF-8 to ASCII and transliterate the special characters, so that é becomes e, ò becomes o and so on. Example:
Code:

echo Renée | iconv -f UTF-8 -t ASCII//TRANSLIT
or if you want to change the content of a file:
Code:

iconv -f UTF-8 -t ASCII//TRANSLIT infile > outfile
Hope this helps.
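
For the file case, redirecting iconv's output onto its own input file would truncate it before it is read, so converting a file "in place" needs a temporary file. Here is a rough sketch of that idea. The helper name to_ascii_inplace is hypothetical, and it assumes mktemp is available. Also note that with glibc's iconv the transliteration depends on the current locale, so under the C locale é may come out as ? rather than e.

```shell
# Hypothetical helper (not from the thread): transliterate a UTF-8 file to
# ASCII and overwrite the original only if iconv succeeds.
# Usage: to_ascii_inplace somefile.txt
to_ascii_inplace() {
    file=$1
    tmp=$(mktemp) || return 1
    if iconv -f UTF-8 -t ASCII//TRANSLIT "$file" > "$tmp"; then
        # Copy the contents back so the original keeps its permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        # Conversion failed: leave the original file untouched.
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

If iconv reports an error (for example, because the input is not valid UTF-8), the original file is left as it was.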

arbex5 06-07-2013 04:43 PM

There is a simple way
 
Just use the command unaccent.

$ unaccent ISO-8859-1 < myfile > myfile.unaccent

Worked like a charm for me.


All times are GMT -5.