I'm trying to reuse working Python code from a Mac on Windows. The code compresses a UTF-8 string (with zlib; a gzip variant is commented out) and inserts the output as a BLOB using SQLAlchemy.
However, I get the following error when the insert executes:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 15: character maps to <undefined>
The relevant section:
import urllib2
import zlib
from sqlalchemy import *

# engine, meta, url_source and tableSQL are defined elsewhere
pcaxis_table = Table('pcaxis_data', meta, autoload=True, autoload_with=engine)
try:
    # urllib2 has no urlretrieve(); urlopen() returns an object with .read()
    response = urllib2.urlopen(url_source)
except Exception as e:
    print url_source
    raise e
infile = response.read()
# Re-encode the downloaded cp1252 text as UTF-8, then compress it
px_file = infile.decode('cp1252').encode('utf-8')
cmpstr = zlib.compress(px_file)
#out = StringIO.StringIO()
#with gzip.GzipFile(fileobj=out, mode="w") as f:
#    f.write(px_file)
ins = pcaxis_table.insert(values = {'TableSQL': tableSQL,
                                    'zip_file': cmpstr,  #out.getvalue()
                                    })
ins.execute()
Full traceback (it fails when trying to decode the blob as cp1252):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\sql\base.py", line 386, in execute
    return e._execute_clauseelement(self, multiparams, params)
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\engine\base.py", line 1758, in _execute_clauseelement
    return connection._execute_clauseelement(elem, multiparams, params)
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\engine\base.py", line 826, in _execute_clauseelement
    compiled_sql, distilled_params
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\engine\base.py", line 958, in _execute_context
    context)
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\engine\base.py", line 1162, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\engine\base.py", line 951, in _execute_context
    context)
  File "C:\Anaconda\Lib\site-packages\sqlalchemy\engine\default.py", line 436, in do_execute
    cursor.execute(statement, parameters)
  File "C:\Anaconda\Lib\site-packages\pymysql\cursors.py", line 100, in execute
    query = query % escaped_args
  File "C:\Anaconda\lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 6: character maps to <undefined>
and the MySQL table:
create table pcaxis_data(
    id int NOT NULL AUTO_INCREMENT,
    TableSQL varchar(25),
    zip_file BLOB,
    inserttime TIMESTAMP,
    PRIMARY KEY (id)
);
The problem is with .decode('cp1252'). The Windows-1252 code page does not define every byte value (for example, byte 0x8f is unassigned, so decoding it fails). You can use latin1 instead.
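As a rough illustration of that point (not part of the original answer), the sketch below lists which byte values each codec refuses to decode, using only Python's built-in codecs. The helper name `undefined_bytes` is hypothetical.

```python
def undefined_bytes(codec):
    """Return the byte values (0-255) that the given codec cannot decode."""
    missing = []
    for value in range(256):
        try:
            # bytes(bytearray([...])) builds a one-byte string on Python 2 and 3
            bytes(bytearray([value])).decode(codec)
        except UnicodeDecodeError:
            missing.append(value)
    return missing

cp1252_gaps = undefined_bytes('cp1252')   # includes 0x8f, the byte from the error
latin1_gaps = undefined_bytes('latin1')   # empty: latin1 maps every byte value

print([hex(b) for b in cp1252_gaps])
print(latin1_gaps)
```

Note that latin1 never fails, but it interprets bytes 0x80 to 0x9f as control characters rather than the printable characters cp1252 assigns to most of them, so it is only a faithful substitute if the data really is latin1.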
Is response actually Windows-1252 text? If it is not, decoding it as such makes no sense.
zlib.compress takes a bytestring parameter, and the data read from response is already a bytestring, so you can compress it directly without re-encoding.
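A minimal sketch of compressing the raw bytes directly, assuming the downloaded content does not need to be converted to UTF-8 before storage. The `raw` value is a made-up stand-in for what `response.read()` would return.

```python
import zlib

# Made-up stand-in for the bytes returned by response.read();
# it deliberately contains 0x8f, which cp1252 cannot decode.
raw = b'VALUES="caf\xe9"\x8f;'

compressed = zlib.compress(raw)          # bytes in, bytes out: no decode step
restored = zlib.decompress(compressed)   # round-trips to the original bytes
```

Because no decode or encode step is involved, the stored blob decompresses to exactly the bytes that were downloaded, whatever their encoding.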
I just solved the issue by upgrading pymysql from about 0.6.0 to 0.6.3. The problem was that the older pymysql driver tried to escape the binary data by converting it to unicode. The byte 0x8f has no mapping in cp1252 (the codec shown in the traceback), which is why the insert failed.