Getting a ProgrammingError when inserting into a MySQL database using Python
I have a dataframe with about 200M rows, for example:
Date tableName attributeName
29/03/2019 tableA attributeA
....
I want to save the dataframe to a table in a MySQL database. This is how I tried to insert the dataframe into the table:
def insertToTableDB(tableName,dataFrame):
    mysqlCon = mysql.connector.connect(host='localhost',user='root',passwd='')
    cursor = mysqlCon.cursor()
    for index, row in dataFrame.iterrows():
        myList =[row.Date, row.tableName, row.attributeName]
        query = "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) VALUES (%s,%s,%s);".format(tableName)
        cursor.execute(query,myList)
        print(myList)
    try:
        mysqlCon.commit()
        cursor.close()
        print("Done")
        return tableName,dataFrame
    except:
        cursor.close()
        print("Fail")
This code succeeded when I inserted a dataframe with 2M rows. But when I inserted a dataframe with 200M rows, I got an error like this:
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 569, in execute
self._handle_result(self._connection.cmd_query(stmt))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 553, in cmd_query
result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 442, in _handle_result
raise errors.get_exception(packet)
ProgrammingError: Unknown column 'nan' in 'field list'
My dataframe doesn't have any 'nan' values. Could someone help me solve this problem?
Thank you so much.
Try these steps:
Drop the rows that contain NaN with dropna:
df.dropna(inplace=True)
Filter out the rows that contain the string 'nan':
df = df[(df['Date']!='nan') & (df['tableName']!='nan') & (df['attributeName']!='nan')]
Replace any remaining NaN values with None:
df1 = df.where((pd.notnull(df)), None)
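The likely cause of the error is that a float NaN reaches the query as a bare `nan`, which MySQL then reads as a column name. A minimal sketch of the NaN-to-None step is shown here, assuming a small made-up sample frame and a hypothetical helper named `to_sql_rows` (neither is from the original post). The cast to `object` is a precaution, since in some pandas versions `None` is coerced back to NaN in float columns.

```python
import numpy as np
import pandas as pd

def to_sql_rows(df):
    """Return the rows of df as plain tuples, with missing values replaced by None."""
    # Cast to object first so None is not coerced back to NaN in float columns.
    clean = df.astype(object).where(df.notna(), None)
    # Plain tuples (no index, no namedtuple) are what DB-API cursors expect.
    return list(clean.itertuples(index=False, name=None))

# Made-up sample data with one missing value (an assumption for illustration).
sample = pd.DataFrame({
    "Date": ["29/03/2019", "30/03/2019"],
    "tableName": ["tableA", np.nan],
    "attributeName": ["attributeA", "attributeB"],
})
rows = to_sql_rows(sample)
print(rows)
```

The resulting list could then be passed to `cursor.executemany(query, rows)`, in which case the `None` entries should be stored as SQL NULL rather than raising the 'nan' error.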
Replace NaN everywhere with the string 'empty':
df = df.replace(np.nan, 'empty')
Remember to:
import numpy as np
df = df.astype(str) solved the problem for me, assuming you've already set up your table schema.
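Since the question states that the dataframe has no 'nan' values, it may be worth confirming where the missing values are before picking one of the fixes above. The following is a rough sketch using made-up sample data and a hypothetical helper named `missing_report`; it counts missing values per column and selects the affected rows.

```python
import numpy as np
import pandas as pd

def missing_report(df):
    """Return (per-column missing counts, rows containing at least one missing value)."""
    counts = df.isna().sum()
    bad_rows = df[df.isna().any(axis=1)]
    return counts, bad_rows

# Made-up sample data (an assumption for illustration).
sample = pd.DataFrame({
    "Date": ["29/03/2019", "30/03/2019", "31/03/2019"],
    "tableName": ["tableA", np.nan, "tableC"],
    "attributeName": ["attributeA", "attributeB", np.nan],
})
counts, bad_rows = missing_report(sample)
print(counts)
print(bad_rows)
```

On a 200M-row frame, a handful of missing values are easy to overlook by eye, so a per-column count like this is usually a quicker check than printing the rows.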