I am reading a file line by line, storing the data I need in variables, and then passing them to an insert statement to store them in my database row by row. When the program hits a Unicode emoticon, MySQL throws the following exception:
Traceback (most recent call last):
  File "import_json.py", line 73, in <module>
    crs.execute(ins_statement,parameters)
  File "/venv/lib/python3.4/site-packages/MySQLdb/cursors.py", line 250, in execute
    self.errorhandler(self, exc, value)
  File "/venv/lib/python3.4/site-packages/MySQLdb/connections.py", line 50, in defaulterrorhandler
    raise errorvalue
  File "/venv/lib/python3.4/site-packages/MySQLdb/cursors.py", line 247, in execute
    res = self._query(query)
  File "/venv/lib/python3.4/site-packages/MySQLdb/cursors.py", line 411, in _query
    rowcount = self._do_query(q)
  File "/venv/lib/python3.4/site-packages/MySQLdb/cursors.py", line 374, in _do_query
    db.query(q)
  File "/venv/lib/python3.4/site-packages/MySQLdb/connections.py", line 277, in query
    _mysql.connection.query(self, query)
_mysql_exceptions.OperationalError: (1366, "Incorrect string value: '\\xF0\\x9F\\x87\\xB5\\xF0\\x9F...' for column 'primary_geo' at row 1")
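The escaped bytes in the error message can be decoded to see what MySQL rejected. The following is a minimal sketch that uses only the four bytes visible before the message is truncated; the helper name `needs_utf8mb4` is made up for illustration. The sequence turns out to be a single 4-byte UTF-8 character, and MySQL's `utf8` character set stores at most three bytes per character, so a value like this only fits when both the column and the connection use `utf8mb4`.

```python
# First four bytes quoted in the 1366 error message (the rest is truncated there).
rejected = b"\xF0\x9F\x87\xB5"

# The four bytes decode to exactly one character.
char = rejected.decode("utf-8")
code_point = ord(char)  # a code point above U+FFFF, i.e. outside the BMP


def needs_utf8mb4(text):
    """Return True if text contains a character that takes 4 bytes in UTF-8.

    Hypothetical helper: such characters cannot be stored in MySQL's
    3-byte `utf8` character set.
    """
    return any(ord(ch) > 0xFFFF for ch in text)
```

A check like `needs_utf8mb4(value)` could be used to log or pre-filter suspicious values before the insert, if skipping them is acceptable.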
I have run into several encoding problems so far, and I solved them by setting MySQL's encoding to utf8mb4, like this:
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_general_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
and I also added this to my code:
conn.set_character_set('utf8')
cursor.execute('SET NAMES utf8;')
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
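One thing that may be worth checking in the snippet above: it sets the session to `utf8`, not `utf8mb4`, so the connection itself may still be limited to 3-byte characters even though the server variables say `utf8mb4`. A hedged sketch of asking for `utf8mb4` at connect time instead follows. The function names are hypothetical, the credentials are whatever the script already uses, and it assumes the installed MySQLdb build and client library accept `charset="utf8mb4"`. The `primary_geo` column would also need a `utf8mb4` character set for this to help.

```python
def utf8mb4_connect_kwargs(**credentials):
    """Merge caller-supplied credentials with utf8mb4 connection settings."""
    kwargs = dict(credentials)
    # charset and use_unicode are keyword arguments of MySQLdb.connect().
    kwargs.update(charset="utf8mb4", use_unicode=True)
    return kwargs


def open_connection(**credentials):
    """Open a MySQLdb connection that talks utf8mb4 (requires the driver)."""
    # Imported here so the helper above stays usable without the driver installed.
    import MySQLdb
    return MySQLdb.connect(**utf8mb4_connect_kwargs(**credentials))
```

With a connection opened this way, the `SET NAMES utf8` style statements should no longer be needed, since running them afterwards would switch the session back to 3-byte `utf8`.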
However, things like Unicode emoticons come up, and who knows what else is going to happen while importing and parsing millions of lines. So I would like cursor.execute(insert_stmnt) to insert all the valid rows and ignore the faulty ones, since I only need to store a sample of my raw data.
I tried:
    try:
        crs.execute(insert_stmnt, args)
    except MySQLdb.Error:
        continue
but this just completes the program without writing anything to my db.
Never work when you don't get enough sleep, especially when you are a newbie like me. I didn't commit. Calling conn.commit() after the insert fixes it:
    try:
        crs.execute(insert_stmnt, args)
        conn.commit()
    except MySQLdb.Error:
        continue
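For reference, the whole skip-on-error pattern can be sketched as one small function. This is only an illustration under some assumptions: the function name and the `error_type` parameter are made up, and the demo runs against an in-memory SQLite database as a stand-in so that it works without a MySQL server. A `NOT NULL` violation plays the role of the faulty row here.

```python
import sqlite3


def insert_skipping_failures(conn, statement, rows, error_type):
    """Insert rows one at a time, skipping any row that raises error_type.

    Works with any DB-API connection. Returns (inserted, skipped) counts.
    """
    cursor = conn.cursor()
    inserted = skipped = 0
    for params in rows:
        try:
            cursor.execute(statement, params)
            inserted += 1
        except error_type:
            # Faulty row: count it and move on to the next one.
            skipped += 1
            continue
    # Without this commit nothing is persisted, which was the original bug.
    conn.commit()
    return inserted, skipped


# Demo with SQLite standing in for MySQL; the None row violates NOT NULL.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE sample (primary_geo TEXT NOT NULL)")
demo_rows = [("Athens",), (None,), ("Berlin",)]
demo_counts = insert_skipping_failures(
    demo_conn,
    "INSERT INTO sample (primary_geo) VALUES (?)",
    demo_rows,
    sqlite3.Error,
)
```

With MySQLdb the call would use `%s` placeholders in the statement and `MySQLdb.Error` as `error_type`. Unlike the snippet above, this sketch commits once after the loop rather than after every row, which is usually cheaper over millions of lines; it assumes the storage engine rolls back only the failed statement and not the whole transaction.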