Python Pyodbc Unicode issue

Question

I have a string variable res that I obtained from a pyodbc cursor, as shown at the bottom. The table test has a single row containing ä, whose Unicode code point is u'\xe4'.

The result I get is:

>>> res,type(res)
('\xe4', <type 'str'>)

Whereas the result I should have got is:

>>> res,type(res)
(u'\xe4', <type 'unicode'>)

I tried adding charset=utf8 to my pyodbc connection string, as shown below. The result now has the correct type (unicode), but the code point is that of a different character, ꓃, which may be due to a bug in the pyodbc driver.

conn = pyodbc.connect(DSN='datbase;charset=utf8',ansi=True,autocommit=True)
>>> res,type(res)
(u'\ua4c3', <type 'unicode'>)
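A note on where u'\ua4c3' might come from: the UTF-8 encoding of ä is the two bytes C3 A4, and reading those same two bytes as a single little-endian UTF-16 code unit gives U+A4C3. The sketch below (Python 3, no database involved) only reproduces that arithmetic. That the driver and pyodbc actually disagree in this way is an assumption, not something confirmed in this thread.

```python
# Python 3 sketch: reproduce the unexpected code point from the question.
# Assumption: UTF-8 bytes are being interpreted as UTF-16-LE somewhere.
original = "\u00e4"  # the character ä

utf8_bytes = original.encode("utf-8")       # the two bytes C3 A4
misread = utf8_bytes.decode("utf-16-le")    # both bytes read as one code unit
print(ascii(misread))

# Reversing the same mistake recovers the original character.
recovered = misread.encode("utf-16-le").decode("utf-8")
print(recovered == original)
```

If this is what is happening, the data itself is intact and only the decoding step is wrong.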

Actual code:

import pyodbc
pyodbc.pooling=False
conn = pyodbc.connect(DSN='datbase',ansi=True,autocommit=True)
cursor = conn.cursor()
cur = cursor.execute('SELECT col1 from test')
res = cur.fetchall()[0][0]
print(res)

Additional details:
Database: Teradata
pyodbc version: 2.7

So how do I now either

1) cast ('\xe4', <type 'str'>) to (u'\xe4', <type 'unicode'>) (is it possible to do this without unintentional side effects?), or

2) resolve the pyodbc/unixODBC issue?

Answer 1

I think this is best handled in Python, instead of fiddling with pyodbc.connect arguments and driver-specific connection string attributes.

'\xe4' is a Latin-1 encoded byte string representing the Unicode character ä.

To explicitly decode the pyodbc result in Python 2.7:

>>> res = '\xe4'
>>> res.decode('latin1'), type(res.decode('latin1'))
(u'\xe4', <type 'unicode'>)
>>> print res.decode('latin1')
ä

Python 3.x does this for you (the str type holds Unicode characters):

>>> res = '\xe4'
>>> res, type(res)
('ä', <class 'str'>)
>>> print(res)
ä
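On the "side effects" part of the question, here is a minimal sketch of a guard around the decode step. The helper name to_text is made up for illustration, and it assumes the driver really returns Latin-1 bytes. Note that Latin-1 maps every possible byte to a character, so decoding with it never raises an error. If the bytes are actually in another encoding you get wrong characters silently, whereas a wrong guess of UTF-8 fails loudly.

```python
def to_text(value, encoding="latin-1"):
    """Decode byte strings to text; return any other value unchanged.

    Hypothetical helper, not part of pyodbc. The default encoding is an
    assumption about what the driver returns.
    """
    if isinstance(value, bytes):  # in Python 2, bytes is an alias for str
        return value.decode(encoding)
    return value


print(to_text(b"\xe4") == "\u00e4")   # Latin-1 byte E4 is ä
print(to_text(42))                    # non-string values pass through

# The same byte is not valid UTF-8, so a wrong guess is detected here.
try:
    to_text(b"\xe4", encoding="utf-8")
except UnicodeDecodeError as exc:
    print("not UTF-8:", exc.reason)
```

Applied to the question's code, this would be res = to_text(cur.fetchall()[0][0]).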

Answer 2

For Python 3, try this:

After conn = pyodbc.connect(DSN='datbase',ansi=True,autocommit=True)

place this:

conn.setdecoding(pyodbc.SQL_CHAR, encoding='utf8')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf8')
conn.setencoding(encoding='utf8')

or

conn.setdecoding(pyodbc.SQL_CHAR, encoding='iso-8859-1')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='iso-8859-1')
conn.setencoding(encoding='iso-8859-1')

etc...

Python 2:

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
cnxn.setencoding(str, encoding='utf-8')
cnxn.setencoding(unicode, encoding='utf-8')

etc...

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='encode-foo-bar')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='encode-foo-bar')
cnxn.setencoding(str, encoding='encode-foo-bar')
cnxn.setencoding(unicode, encoding='encode-foo-bar')

Here 'encode-foo-bar' is a placeholder for whichever encoding your database actually uses.
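The Python 3 calls above could be wrapped in a small helper so that every connection is configured the same way. This is only a sketch: configure_encoding is a made-up name, it assumes a pyodbc release that provides setdecoding and setencoding, and the numeric fallbacks for the two constants are the standard ODBC values, used only so the snippet loads when pyodbc is not installed.

```python
try:
    import pyodbc
    SQL_CHAR, SQL_WCHAR = pyodbc.SQL_CHAR, pyodbc.SQL_WCHAR
except ImportError:
    # Standard ODBC type codes; fallback so the sketch loads without pyodbc.
    SQL_CHAR, SQL_WCHAR = 1, -8


def configure_encoding(conn, encoding="utf-8"):
    """Apply one text encoding to a pyodbc connection (Python 3 signatures).

    Hypothetical helper; the right encoding depends on the database and
    driver, so 'utf-8' is just a default guess.
    """
    conn.setdecoding(SQL_CHAR, encoding=encoding)    # narrow columns read from the DB
    conn.setdecoding(SQL_WCHAR, encoding=encoding)   # wide columns read from the DB
    conn.setencoding(encoding=encoding)              # str values sent to the DB
    return conn
```

Usage would look like conn = configure_encoding(pyodbc.connect(...), 'iso-8859-1'), with the connection arguments from the question.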

Answer 1
2, accepted, 2015-04-07 17:28:25

Answer 2
2, 2018-02-27 21:03:04
