Python pyodbc weird unicode bytecode

Question

I am extending an in-house Icinga plugin, written in Python, that checks for software (SW) expirations in a Racktables DB (MySQL, utf8).

The query I am running returns the name of the SW and its expiration date as a Unix epoch timestamp, selecting rows whose expiration date falls before the current date plus a threshold. Previously the check only looked at SW expiration dates; now I need it to return the name of the software as well.

Through pyodbc, the query returns the right values for integers, but returns what looks like unrecognizable bytecode for strings. Example:

mysql> SELECT a.uint_value, b.name FROM AttributeValue a, Object b WHERE a.object_id = b.id AND a.attr_id = 24 AND uint_value < (unix_timestamp(now())+40000000);
+------------+----------------------------+
| uint_value | CONVERT(b.name USING utf8) |
+------------+----------------------------+
| 1461974400 | Communigate                |
+------------+----------------------------+
1 row in set (0.00 sec)

But in python:

query="SELECT a.uint_value, b.name FROM AttributeValue a, Object b WHERE a.object_id = b.id AND a.attr_id = 24 AND uint_value < (unix_timestamp(now())+%d)" % wrange

con_string = '''DRIVER=MySQL;SERVER={0};PORT={1};UID={2};PWD={3};DATABASE={4};OPTION=3''' . format(options.host,options.port,options.user,options.password,options.database)
con = pyodbc.connect(con_string)
cur = con.cursor()
cur.execute(query)
rows = cur.fetchall()
pprint(rows)
[(1461974400, u'\U006f0043\U006d006d\U006e0075\U00670069\U00740061')]

I have tried to convert that bytecode with encode() and decode() to no avail.

I have also tried a few bytecode converters, but none recognized this encoding.

I have also inspected rows[0][1] with repr() and type(), since that bytecode looked like it could be some exotic data structure:

repr(rows[0][1])
  u'\U006f0043\U006d006d\U006e0075\U00670069\U00740061'

print type(rows)
  <type 'list'>

print type(rows[0][1])
  <type 'unicode'>

I am now inclined to think that this is a pyodbc issue rather than an encoding one. Any thoughts on this are welcome.

Answer 1

For some strange reason, pyodbc is returning

the Unicode code points for pairs of characters as a single \Unnnnnnnn entity,
with the order of the characters reversed within each pair, and
with the final 'e' missing, presumably because the string contained an odd number of characters.

\U006f0043\U006d006d\U006e0075\U00670069\U00740061
     o   C     m   m     n   u     g   i     t   a
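
The mapping above can be undone mechanically. Below is a minimal sketch in Python 3, assuming the garbled values are available as plain integers (on the asker's Python 2 build, something like [ord(ch) for ch in rows[0][1]] would presumably yield them). The function name unpack_pairs is hypothetical and not part of pyodbc. The pattern is consistent with 2-byte character units being read as 4-byte ones, although nothing in the question confirms that cause.

```python
def unpack_pairs(code_points):
    """Split each 32-bit value into two 16-bit characters, low half first.

    Hypothetical helper for illustration; it only reverses the packing
    described in the answer and cannot recover a dropped trailing character.
    """
    chars = []
    for value in code_points:
        low = value & 0xFFFF           # first character of the pair
        high = (value >> 16) & 0xFFFF  # second character of the pair
        chars.append(chr(low))
        if high:                       # skip an empty upper half
            chars.append(chr(high))
    return "".join(chars)


# The five values from the repr() output in the question.
garbled = [0x006F0043, 0x006D006D, 0x006E0075, 0x00670069, 0x00740061]
print(unpack_pairs(garbled))  # Communigat
```

The trailing 'e' is still missing from the result, so this is a way to read the output rather than a fix for it.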

In any case, using MySQL Connector/Python instead of pyodbc and MySQL Connector/ODBC appears to work around the issue.
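
A rough sketch of what the switch could look like, assuming the mysql-connector-python package is installed. The helper names open_connection and fetch_expiring are hypothetical, and the connection values are the same ones the question passes to pyodbc. The query is the one from the question, with the threshold passed as a bound parameter instead of being formatted into the string.

```python
QUERY = (
    "SELECT a.uint_value, b.name "
    "FROM AttributeValue a, Object b "
    "WHERE a.object_id = b.id AND a.attr_id = 24 "
    "AND a.uint_value < (UNIX_TIMESTAMP(NOW()) + %s)"
)


def open_connection(host, port, user, password, database):
    """Open a connection with MySQL Connector/Python (hypothetical helper)."""
    # Imported here so the rest of the module loads without the driver.
    import mysql.connector

    return mysql.connector.connect(
        host=host,
        port=port,
        user=user,
        password=password,
        database=database,
        charset="utf8",
    )


def fetch_expiring(con, wrange):
    """Return (uint_value, name) rows expiring within wrange seconds."""
    cur = con.cursor()
    try:
        # The driver substitutes %s itself; no string formatting needed.
        cur.execute(QUERY, (wrange,))
        return cur.fetchall()
    finally:
        cur.close()
```

With the options object from the question, usage would be along the lines of rows = fetch_expiring(open_connection(options.host, options.port, options.user, options.password, options.database), wrange).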

solution1
0 ACCEPTED 2015-12-03 22:34:57