LinuxQuestions.org


BrianK 01-26-2009 05:29 PM

python: how to handle unicode chars in ascii strings?
 
In Python, this assignment is legal, but the later call to encode() fails:

Code:

In [44]: sys.getdefaultencoding()
Out[44]: 'ascii'

In [45]: a = 'André'

In [46]: a.encode('ascii','replace')
---------------------------------------------------------------------------
exceptions.UnicodeDecodeError                        Traceback (most recent call last)

/hosts/soho/v11/users/briank/<ipython console>

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)

However, if I specify that the string is unicode, it all works:

Code:

In [47]: a = u'André'

In [48]: a.encode('ascii','replace')
Out[48]: 'Andr??'
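
A note on why the two sessions differ: in Python 2 a plain 'André' literal is a byte string, so calling encode() on it first triggers an implicit decode with the default codec (ascii), and that hidden decode is what raises UnicodeDecodeError. Below is a minimal sketch of doing the decode explicitly. It assumes the bytes are UTF-8, which the 0xc3 byte in the traceback suggests but does not prove. The two question marks in Out[48] suggest that session treated the two UTF-8 bytes as two separate characters; with an explicit UTF-8 decode there is only one.

```python
# Assumption: the raw bytes are UTF-8 (0xc3 0xa9 is the UTF-8 form of "é").
raw = b'Andr\xc3\xa9'

# Step 1: bytes -> unicode text, naming the source encoding explicitly
# instead of relying on the default 'ascii' codec.
text = raw.decode('utf-8')

# Step 2: unicode text -> ASCII bytes; anything unencodable becomes "?".
ascii_bytes = text.encode('ascii', 'replace')

print(repr(ascii_bytes))
```

The same two steps work on Python 2 and Python 3; only the repr of the result differs.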

The problem I'm having is that I'm using os.walk to go through a bunch of files, and some of those files have paths with unicode chars. I'm unclear on how to get the results of os.walk to be treated as unicode so that the encode function works correctly.
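
One possible approach, sketched here under assumptions: os.walk returns names of the same type as the path it is given, so passing a unicode top directory yields unicode paths. The helper name walk_text_paths is hypothetical, and the UTF-8 codec for the starting path is an assumption about this filesystem.

```python
import os

def walk_text_paths(top):
    """Yield every file path under `top` as unicode text, not bytes.

    Hypothetical helper for illustration only.
    """
    # Assumption: a byte-string path is UTF-8; change the codec if not.
    if isinstance(top, bytes):
        top = top.decode('utf-8')
    # With a unicode `top`, os.walk hands back unicode names.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            yield os.path.join(dirpath, name)
```

On Python 2, a filename that cannot be decoded with the filesystem encoding may still come back as a byte string, so a robust version would need to handle that case as well.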

At the end of the day, all these paths are going into a database. I just want everything that comes out of that database to be ascii, and I'm having a hard time with that when unicode chars make their way into strings that think they are ascii.

Hope that made sense.

So... how can you convert ascii strings with unicode chars into actual ascii strings? Pardon me if my terminology is incorrect; see my first example for what I mean by "unicode chars in ascii strings".
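
If a readable result such as "Andre" is preferred over "Andr?", one common technique is to decompose accented characters with unicodedata.normalize and then drop whatever is not ASCII. The sketch below is only illustrative: to_ascii and its source_encoding parameter are hypothetical names, the UTF-8 default is an assumption, and the conversion is lossy (characters with no ASCII base letter simply disappear).

```python
import unicodedata

def to_ascii(value, source_encoding='utf-8'):
    """Return an ASCII-only text version of `value` (hypothetical helper).

    Accepts bytes or unicode text. Bytes are assumed to be in
    `source_encoding`; invalid byte sequences are replaced, then dropped.
    """
    if isinstance(value, bytes):
        value = value.decode(source_encoding, 'replace')
    # NFKD splits "é" into "e" plus a combining accent mark.
    decomposed = unicodedata.normalize('NFKD', value)
    # 'ignore' discards the combining marks and any other non-ASCII chars.
    return decomposed.encode('ascii', 'ignore').decode('ascii')

print(to_ascii(b'Andr\xc3\xa9'))
```

Because the result cannot be mapped back to the original name, it may be worth storing the original path too and applying a conversion like this only on output.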

crabboy 01-26-2009 09:46 PM

Continue discussion here:
http://www.linuxquestions.org/questi...3/#post3422175


All times are GMT -5.