How to correctly compare a Unicode string from psycopg2 in Python?

Question

I have a problem comparing a UTF-8 string obtained from a PostgreSQL database:

>>> import psycopg2
>>> db_conn = psycopg2.connect("dbname='foo' user='foo' host='localhost' password='xxx'")
>>> db_cursor = db_conn.cursor()
>>> sql_com = ("""SELECT my_text FROM table WHERE id = 1""")
>>> db_cursor.execute(sql_com)
>>> sql_result = db_cursor.fetchone()
>>> db_conn.commit()
>>> db_conn.close()
>>> a = sql_result[0]
>>> a
u'M\xfcnchen'
>>> type(a)
<type 'unicode'>
>>> print a
München
>>> b = u'München'
>>> type(b)
<type 'unicode'>
>>> print b
MÃ¼nchen
>>> a == b
False

I am really confused about why this happens. Can someone tell me how I should compare a string containing an umlaut from the database to another string so that the comparison is true? My database is UTF8:

postgres@localhost:$ psql -l
        List of databases
   Name    |  Owner   | Encoding 
-----------+----------+----------
 foo       | foo      | UTF8

Answer 1

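This is clearly a problem with the locale of your console: the u'München' you typed appears to have been decoded with the wrong encoding.

u"München" is u'M\xfcnchen' as a Unicode string and 'M\xc3\xbcnchen' when encoded as UTF-8. The latter is your MÃ¼nchen if those bytes are interpreted as ISO 8859-1 or CP1252.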

Psycopg2 seems to supply you with correct Unicode values, as it should.
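
To illustrate the point about encodings, here is a small sketch that reproduces the garbled value and reverses it. The variable names are made up for illustration, and it assumes the UTF-8 bytes were misread as ISO 8859-1. The literals use escapes so that the result does not depend on the console encoding; it should run under both Python 2 and Python 3.3+.

```python
# The value psycopg2 returned in the question: 7 characters, with U+00FC for the umlaut.
correct = u'M\xfcnchen'

# Encoding to UTF-8 turns the umlaut into the two bytes 0xC3 0xBC.
utf8_bytes = correct.encode('utf-8')

# Misreading those bytes as ISO 8859-1 yields two characters instead of one,
# which is what shows up as "MÃ¼nchen" in the question.
mojibake = utf8_bytes.decode('latin-1')

# The two strings differ, so the comparison is False.
print(mojibake == correct)

# Under the stated assumption, the damage can be reversed.
repaired = mojibake.encode('latin-1').decode('utf-8')
print(repaired == correct)
```

If the literal is written with an escape (u'M\xfcnchen') or the console encoding is fixed, no repair step should be needed and the comparison with the database value succeeds directly.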

Answer 2

If you type

b = 'München'

what do you get from type(b)?

Maybe you don't need to explicitly convert the string to Unicode, as Python may detect this automatically.

EDIT: I get this from my Python CLI:

>>> b = u'München'
>>> b
u'M\xfcnchen'
>>> print b
München

Your print result, by contrast, comes out in a different encoding.
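
One way to see the difference between the two sessions is to look at the code points rather than at the printed text. The sketch below is only an illustration: code_points is a hypothetical helper, and the value of typed is an assumption about what the misconfigured console produced, inferred from the MÃ¼nchen output in the question.

```python
import sys

def code_points(text):
    """Return the code points of a Unicode string as hex strings."""
    return [hex(ord(ch)) for ch in text]

from_db = u'M\xfcnchen'      # value returned by psycopg2 in the question
typed = u'M\xc3\xbcnchen'    # assumed value of b typed in the misconfigured console

# Encodings Python believes the console uses (may be None when not attached to a terminal).
print(getattr(sys.stdin, 'encoding', None))
print(getattr(sys.stdout, 'encoding', None))

# A single 0xfc for the database value, but 0xc3 0xbc for the typed one.
print(code_points(from_db))
print(code_points(typed))

# The lengths differ as well: 7 versus 8 characters.
print(len(from_db))
print(len(typed))
```

If the two lists of code points differ as shown, the typed literal is the damaged one, and the console's input encoding is the thing to check.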

2 answers

solution1
3 2011-01-19 18:31:36

solution2
1 2011-01-19 17:47:03
