
Python: error inserting OrderedDict with Unicode data

My script migrates data from MySQL to MongoDB. It runs perfectly well when no Unicode columns are included, but it throws the error below when the OrgLanguages column is added.

    mongoImp = dbo.insert_many(odbcArray)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/collection.py", line 711, in insert_many
    blk.execute(self.write_concern.document)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/bulk.py", line 493, in execute
    return self.execute_command(sock_info, generator, write_concern)
  File "/home/lrsa/.local/lib/python2.7/site-packages/pymongo/bulk.py", line 319, in execute_command
    run.ops, True, self.collection.codec_options, bwc)
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 'Portugu\xeas do Brasil, ?????, English, Deutsch, Espa\xf1ol latinoamericano, Polish'

My code:

import MySQLdb, MySQLdb.cursors, sys, pymongo, collections

odbcArray=[]
mongoConStr = '192.168.10.107:36006'
sqlConnect = MySQLdb.connect(host = "54.175.170.187", user = "testuser", passwd = "testuser", db = "testdb", cursorclass=MySQLdb.cursors.DictCursor)
mongoConnect = pymongo.MongoClient(mongoConStr)

sqlCur = sqlConnect.cursor()
sqlCur.execute("SELECT ID,OrgID,OrgLanguages,APILoginID,TransactionKey,SMTPSpeed,TimeZoneName,IsVideoWatched FROM organizations")

dbo = mongoConnect.eaedw.mysqlData
tuples = sqlCur.fetchall()

for tuple in tuples:
    odbcArray.append(collections.OrderedDict(tuple))

mongoImp = dbo.insert_many(odbcArray)

sqlCur.close()
mongoConnect.close()
sqlConnect.close()
sys.exit()

The script above migrates the data perfectly when the OrgLanguages column is left out of the SELECT query. To work around this, I tried building the OrderedDict() in another way, but that gives me a different error.

Changed code:

for tuple in tuples:
    doc = collections.OrderedDict()
    doc['oid'] = tuple.OrgID
    doc['APILoginID'] = tuple.APILoginID
    doc['lang'] = unicode(tuple.OrgLanguages)
    odbcArray.append(doc)
mongoImp = dbo.insert_many(odbcArray)

Error received:

Traceback (most recent call last):
  File "pymsql.py", line 19, in <module>
    doc['oid'] = tuple.OrgID
AttributeError: 'dict' object has no attribute 'OrgID'
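
On the AttributeError: with `cursorclass=MySQLdb.cursors.DictCursor`, each fetched row is a plain dict, so columns are read with `row['OrgID']` rather than `row.OrgID`. A minimal sketch of the changed loop body, using a hard-coded hypothetical row in place of a real query result (the sample values and the `to_document` helper are illustrative assumptions, not from the original script):

```python
import collections

# Hypothetical row, shaped like what a DictCursor returns: a plain dict
# keyed by column name. The values here are made up for illustration.
row = {'OrgID': 7, 'APILoginID': 'abc123', 'OrgLanguages': u'English, Deutsch'}

def to_document(row):
    """Build an ordered document from a dict row using key access."""
    doc = collections.OrderedDict()
    doc['oid'] = row['OrgID']            # row['OrgID'], not row.OrgID
    doc['APILoginID'] = row['APILoginID']
    doc['lang'] = row['OrgLanguages']
    return doc

doc = to_document(row)
print(list(doc.items()))
```

This only addresses the attribute lookup. Naming the loop variable `row` instead of `tuple` also avoids shadowing the built-in. The encoding error from the first traceback is a separate issue.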

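Your MySQL connection is returning characters in an encoding other than UTF-8, which is the encoding all BSON strings must use. Try your original code, but pass charset='utf8' to MySQLdb.connect.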
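
To illustrate why the insert is rejected, here is a small standalone sketch using a fragment of the bytes from the error message. It assumes the connection was delivering Latin-1 bytes, which is consistent with `\xea` and `\xf1` in the traceback but is not confirmed by the question. The `is_valid_utf8` helper is hypothetical and exists only for this demonstration.

```python
# Fragment of the byte string from the error message. Assumed to be Latin-1.
raw = b'Portugu\xeas do Brasil, Espa\xf1ol latinoamericano'

def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False

# Decoding with the encoding the bytes were written in recovers the text;
# re-encoding that text as UTF-8 gives bytes BSON would accept.
text = raw.decode('latin-1')
utf8_bytes = text.encode('utf-8')

print(is_valid_utf8(raw))         # the Latin-1 bytes are not valid UTF-8
print(is_valid_utf8(utf8_bytes))  # the re-encoded bytes are

# In the original script, the fix suggested above would be applied at connect time:
#   MySQLdb.connect(host=..., user=..., passwd=..., db=..., charset='utf8',
#                   cursorclass=MySQLdb.cursors.DictCursor)
```

Re-decoding after the fact is shown only to explain the failure. The `?????` run in the error message suggests some characters had already been replaced before reaching Python, which re-decoding cannot undo, so setting the charset on the connection is the more reliable route.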
