Python pyodbc Unicode issue

I have a string variable res which I derived from a pyodbc cursor, as shown at the bottom. The table test has a single row containing ä, whose Unicode code point is u'\xe4'.

The result I get is:

>>> res,type(res)
('\xe4', <type 'str'>)

Whereas the result I should have got is:

>>> res,type(res)
(u'\xe4', <type 'unicode'>)

I tried adding charset=utf8 to my pyodbc connection string, as shown below. The result now has the correct unicode type, but the code point belongs to some other character, which could be due to a bug in the pyodbc driver.

conn = pyodbc.connect(DSN='datbase;charset=utf8',ansi=True,autocommit=True)
>>> res,type(res)
(u'\ua4c3', <type 'unicode'>)
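One plausible reading of the u'\ua4c3' result, offered as a guess rather than a confirmed diagnosis: the driver returned the UTF-8 bytes for ä (0xC3 0xA4), and those two bytes were then interpreted as a single little-endian UTF-16 code unit. The sketch below (Python 3 syntax, no pyodbc involved) only reproduces that arithmetic.

```python
# Assumption: the driver sent UTF-8 bytes that were then read as UTF-16-LE.
utf8_bytes = "\u00e4".encode("utf-8")      # the two bytes C3 A4
misread = utf8_bytes.decode("utf-16-le")   # low byte C3, high byte A4

print(utf8_bytes.hex())     # c3a4
print(hex(ord(misread)))    # 0xa4c3
```

If this is what happened, the data itself is intact and the mismatch is between the encoding the driver sends and the encoding the client assumes.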

Actual code:

import pyodbc
pyodbc.pooling=False
conn = pyodbc.connect(DSN='datbase',ansi=True,autocommit=True)
cursor = conn.cursor()
cur = cursor.execute('SELECT col1 from test')
res = cur.fetchall()[0][0]
print(res)

Additional details: Database: Teradata; pyodbc version: 2.7

So how do I now either:

1) cast ('\xe4', <type 'str'>) to (u'\xe4', <type 'unicode'>) (is it possible to do this without unintentional side-effects?)

2) resolve the pyodbc/unixodbc issue

I think this is best handled in Python, instead of fiddling with pyodbc.connect arguments and driver-specific connection string attributes.

'\xe4' is a Latin-1 encoded byte string representing the Unicode character ä.

To explicitly decode the pyodbc result in Python 2.7:

>>> res = '\xe4'
>>> res.decode('latin1'), type(res.decode('latin1'))
(u'\xe4', <type 'unicode'>)
>>> print res.decode('latin1')
ä

Python 3.x does this for you (the str type holds Unicode characters):

>>> res = '\xe4'
>>> res, type(res)
('ä', <class 'str'>)
>>> print(res)
ä
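As a minimal sketch of the "decode in Python" approach, the helper below normalizes a fetched value to text. The name to_text and the default of Latin-1 are assumptions made for illustration; the right encoding depends on what the driver actually returns.

```python
def to_text(value, encoding="latin-1"):
    """Return value as text, decoding raw bytes with the given encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # already text, or a non-string value such as None or int

raw = b"\xe4"                    # what an ANSI connection might hand back
print(ascii(to_text(raw)))       # '\xe4'
print(ascii(to_text("\xe4")))    # '\xe4' (already text, returned unchanged)
```

On the side-effects question: Latin-1 maps every byte to a code point, so decoding never raises. That also means a wrong encoding guess produces silently wrong characters rather than an error, so it is worth checking a few known rows.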

For Python 3, try this:

After this line:

conn = pyodbc.connect(DSN='datbase',ansi=True,autocommit=True)

place this:

conn.setdecoding(pyodbc.SQL_CHAR, encoding='utf8')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf8')
conn.setencoding(encoding='utf8')

or

conn.setdecoding(pyodbc.SQL_CHAR, encoding='iso-8859-1')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='iso-8859-1')
conn.setencoding(encoding='iso-8859-1')

etc...

Python 2:

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
cnxn.setencoding(str, encoding='utf-8')
cnxn.setencoding(unicode, encoding='utf-8')

etc...

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='encode-foo-bar')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='encode-foo-bar')
cnxn.setencoding(str, encoding='encode-foo-bar')
cnxn.setencoding(unicode, encoding='encode-foo-bar')
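For the Python 3 form of the calls above, the three settings can be wrapped in one small function so that trying a different encoding is a single change. This is only a sketch: apply_text_encoding is a hypothetical name, and the pyodbc module is passed in as the odbc argument (an illustrative choice) so the function can be exercised without a driver or DSN.

```python
def apply_text_encoding(conn, odbc, encoding="utf-8"):
    """Use one encoding for CHAR/WCHAR decoding and for outgoing text.

    conn is a pyodbc connection; odbc is the pyodbc module itself,
    supplied by the caller (an assumption made for easy testing).
    """
    conn.setdecoding(odbc.SQL_CHAR, encoding=encoding)
    conn.setdecoding(odbc.SQL_WCHAR, encoding=encoding)
    conn.setencoding(encoding=encoding)
    return conn

# Typical use, assuming pyodbc is installed and conn is an open connection:
#   import pyodbc
#   apply_text_encoding(conn, pyodbc, "iso-8859-1")
```

Which encoding is correct for a given Teradata setup is not something the thread settles, so utf-8 here is just a default to start from.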
