Encoding UTF8 data within a table with latin1 character set
I have a [legacy] MySQL table whose character set is "latin-1", but it stores JSON information encoded as "utf-8". A user interface connected to this table shows the characters correctly. I need to update this table using a Python script, but I can't get out of encoding hell.
In the mysql shell I issue "select words from pip where id_pip=42" and receive:
"ventilationsplåtslageri":{"day":"1000","hour":"200","min":"30"}
But when I try to fetch it from the database in Python, I can't get the same output, even though I have tried several different encodings.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import MySQLdb
import json
dbconn = MySQLdb.connect(host="host",port=3306,user="user",
passwd="pass",db="db", use_unicode=True, charset="utf8")
cursor1 = dbconn.cursor()
cursor1.execute("select words from pip where id_pip=42")
track = cursor1.fetchall()
print json.dumps(track, encoding="utf8" )
I tried many different configurations of this code, e.g. I changed the connection to use_unicode=False, charset="latin1" and printed with
print json.dumps(filter_track, encoding="utf8" )
but I still get either "ventilationspl\Ã\¥tslageri\\"
or "ventilationspl\åtslageri\\"
and not what I want, which is: "ventilationsplåtslageri"
I can't change the database, and I need to update this field with an SQL UPDATE command, so I am afraid of messing up the legacy database.
I'm not sure I understand your question, but...
If the content is being returned in Latin-1 and you want it in UTF-8, I would assume that you'd first need to decode the content from Latin-1 and then encode it to UTF-8.
latin1_content.decode('latin1').encode('utf8')
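For what it's worth, here is a minimal Python 3 sketch of that kind of round trip (the one-liner above is Python 2 syntax, where str is a byte string). It assumes the garbling is a single round of UTF-8 bytes being read as Latin-1, which is what output like "Ã¥" in place of "å" suggests. In that case the repair runs in the opposite direction: encode as Latin-1 to recover the raw bytes, then decode them as UTF-8. The helper names to_mojibake and fix_mojibake are hypothetical, used only for illustration, and no database connection is involved.

```python
import json


def to_mojibake(text):
    """Simulate UTF-8 bytes being misread as Latin-1 (assumed cause of the garbling)."""
    return text.encode("utf-8").decode("latin-1")


def fix_mojibake(text):
    """Undo one round of the misreading: recover the raw bytes, then decode as UTF-8."""
    return text.encode("latin-1").decode("utf-8")


original = "ventilationsplåtslageri"
garbled = to_mojibake(original)   # "å" turns into the two characters "Ã¥"
repaired = fix_mojibake(garbled)  # back to the original string

print(garbled)
print(repaired)

# json.dumps escapes non-ASCII characters by default; ensure_ascii=False keeps them.
escaped = json.dumps(repaired)
readable = json.dumps(repaired, ensure_ascii=False)
print(escaped)
print(readable)
```

If fix_mojibake raises UnicodeDecodeError or UnicodeEncodeError on a value, that value was probably not garbled in this particular way, so it may be safer to check each row before writing anything back with UPDATE.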