I am trying to retrieve some data from MySQL, but I have problems reading it. The column datatype is varchar with the utf8_general_ci collation. I tried decoding the value, but that doesn't work. I don't need the non-UTF-8 characters, so I just want to remove them.
#This is the line causing the problem:
line: ((123, 'Classical Musicï¼\x8c', 69),)
conn = db.cursor()
conn.execute(sql)
data = conn.fetchall()
for line in data:
    for x in line:
        print(x)
Error received:
UnicodeEncodeError: 'charmap' codec can't encode character '\x8c' in position 17
I have tried decode('utf-8') but I get another error.
conn = db.cursor()
conn.execute(sql)
data = conn.fetchall()
for line in data:
    for x in line:
        print(x[1].decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'
Mojibake and double-encoding, plus mangling by Python.
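The "mangling by Python" part is most likely the print() call: on a console whose encoding is a code page such as cp1252, print() raises UnicodeEncodeError for any character the code page lacks (here '\x8c'). A minimal sketch of a workaround, assuming you only want to see the row without crashing; `printable` is a hypothetical helper name, not part of any library:

```python
import sys


def printable(value, encoding=None):
    """Return str(value) with characters the target encoding cannot
    represent replaced by backslash escapes, so print() will not raise."""
    # Fall back to the console encoding, then to UTF-8 if it is unknown.
    encoding = encoding or sys.stdout.encoding or "utf-8"
    raw = str(value).encode(encoding, errors="backslashreplace")
    return raw.decode(encoding)


# The row from the question, written with escapes so this file is pure ASCII.
row = (123, "Classical Music\u00ef\u00bc\u008c", 69)
for x in row:
    print(printable(x))
```

This only hides the symptom. The stored text is still mojibake, so the encoding settings further up the chain still need fixing.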
Start over. Make everything utf8 -- text, connections, CHARACTER SET, HTML header.
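For the "connections" item, the usual fix is to request a UTF-8 character set when opening the connection. A rough sketch, assuming your driver's connect function accepts `charset` and `use_unicode` keyword arguments (PyMySQL, mysqlclient and MySQL Connector/Python do). `connect_utf8` and `fake_connect` are hypothetical names used only for illustration, and `utf8mb4` is an assumption; `utf8` would match the column in the question:

```python
# Options that ask the driver to talk UTF-8 to the server and return str.
UTF8_OPTIONS = {"charset": "utf8mb4", "use_unicode": True}


def connect_utf8(connect, **params):
    """Call the driver's connect function with UTF-8 options forced on.

    `connect` is whatever connect callable your driver exposes,
    for example pymysql.connect.
    """
    options = dict(params)
    options.update(UTF8_OPTIONS)  # UTF-8 settings win over caller values
    return connect(**options)


def fake_connect(**kwargs):
    """Stand-in for a real driver, so the sketch runs without a server."""
    return kwargs


print(connect_utf8(fake_connect, host="localhost", user="app", database="music"))
```

With a real driver you would pass its connect function instead of `fake_connect`. This stops new damage on reads and writes, but rows that were already stored double-encoded stay that way until they are repaired.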
If you still have problems, come back; hopefully your code will be close enough to correct for us to prescribe a cure.
Meanwhile, read more of the threads around here; simpler versions of the mess abound.
C3AF C2BC C28C was supposed to be a fancy comma, correct? The utf8 hex should have been EFBC8C. What process generated that comma?
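To see how the string in the question ('ï¼\x8c') comes about, here is a small sketch that double-encodes a fullwidth comma (U+FF0C) and then reverses the damage. It assumes the wrong step interpreted the UTF-8 bytes as latin-1, which fits the '\x8c' in the question; `double_encode` and `repair` are hypothetical helper names:

```python
def double_encode(text):
    """Simulate the bug: UTF-8 bytes misread as latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


def repair(text):
    """Undo one round of double-encoding; return text unchanged if it
    does not look double-encoded."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


comma = "\uff0c"                 # fullwidth comma
mojibake = double_encode(comma)  # the three characters seen in the question

print(comma.encode("utf-8").hex().upper())     # EFBC8C
print(mojibake.encode("utf-8").hex().upper())  # C3AFC2BCC28C
print(repair("Classical Music" + mojibake) == "Classical Music" + comma)  # True
```

Repairing in Python like this is a stopgap. If the table itself holds the double-encoded bytes, fixing the data at the source is the cleaner cure.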