ProgrammingError when inserting into a MySQL database with Python

Question

I have a dataframe with about 200M rows, for example:

Date         tableName    attributeName
29/03/2019   tableA       attributeA
....

I want to save the dataframe to a table in a MySQL database. This is what I've tried in order to insert the dataframe into the table:

import mysql.connector

def insertToTableDB(tableName, dataFrame):
    mysqlCon = mysql.connector.connect(host='localhost', user='root', passwd='')
    cursor = mysqlCon.cursor()
    # The table name is formatted into the statement; the row values are bound to the %s placeholders.
    query = "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) VALUES (%s,%s,%s);".format(tableName)
    for index, row in dataFrame.iterrows():
        myList = [row.Date, row.tableName, row.attributeName]
        cursor.execute(query, myList)
        print(myList)
    try:
        mysqlCon.commit()
        cursor.close()
        print("Done")
        return tableName, dataFrame
    except mysql.connector.Error:
        cursor.close()
        print("Fail")

This code works when I insert a dataframe with 2M rows. But when I insert a dataframe with 200M rows, I get an error like this:

File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 569, in execute
self._handle_result(self._connection.cmd_query(stmt))

File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 553, in cmd_query
result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))

File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 442, in _handle_result
raise errors.get_exception(packet)

ProgrammingError: Unknown column 'nan' in 'field list'

My dataframe doesn't have any 'nan' values. Could someone help me solve this problem?

Thank you so much.

Answer 1

Try these steps:

1. Drop rows containing NaN using dropna.
2. Keep only the rows that do not contain the string 'nan'.
3. Convert NaN into None.

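import pandas as pd

# Step 1: drop rows that contain NaN in any column.
df.dropna(inplace=True)

# Step 2: keep only rows where no column holds the literal string 'nan'.
df = df[(df['Date'] != 'nan') & (df['tableName'] != 'nan') & (df['attributeName'] != 'nan')]

# Step 3: convert any remaining NaN to None (cast to object so None is not coerced back to NaN).
df1 = df.astype(object).where(pd.notnull(df), None)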
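
As a rough illustration of the NaN-to-None step, the sketch below counts the missing values per column and then builds parameter tuples in which every NaN has become None, so that the connector can send NULL instead of the bare token nan. The helper names to_sql_params and insert_in_chunks, the chunk_size default, and the sample rows are hypothetical and not part of the original post; batching with cursor.executemany is one possible approach, not necessarily what the asker needs.

```python
import numpy as np
import pandas as pd


def to_sql_params(df, columns):
    """Return the rows as plain tuples, with every missing value replaced by None."""
    # Cast to object first so that None is kept instead of being coerced back to NaN.
    cleaned = df[columns].astype(object).where(df[columns].notna(), None)
    return list(cleaned.itertuples(index=False, name=None))


def insert_in_chunks(cursor, table_name, params, chunk_size=10000):
    """Insert parameter tuples in batches (hypothetical helper, not from the post)."""
    query = (
        "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) "
        "VALUES (%s, %s, %s)".format(table_name)
    )
    for start in range(0, len(params), chunk_size):
        # executemany binds each tuple to the %s placeholders of the statement.
        cursor.executemany(query, params[start:start + chunk_size])


# Made-up sample data with one missing tableName.
df = pd.DataFrame({
    "Date": ["29/03/2019", "30/03/2019"],
    "tableName": ["tableA", np.nan],
    "attributeName": ["attributeA", "attributeB"],
})

# Number of missing values in each column; a quick check before inserting.
missing_per_column = df.isna().sum()

params = to_sql_params(df, ["Date", "tableName", "attributeName"])
```

With a real connection, insert_in_chunks(cursor, tableName, params) followed by a commit would replace the row-by-row loop; checking missing_per_column first shows whether the dataframe really is free of NaN.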

Answer 2

Replace every NaN with the string 'empty':

df = df.replace(np.nan, 'empty')

Remember to:

import numpy as np
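
A small sketch of what this replacement does, using made-up sample rows: after the call no missing values remain, and the placeholder is an ordinary string. Note that 'empty' would then be stored as literal text in the table, so this is presumably only suitable for text columns.

```python
import numpy as np
import pandas as pd

# Made-up sample data with one missing tableName.
df = pd.DataFrame({
    "tableName": ["tableA", np.nan],
    "attributeName": ["attributeA", "attributeB"],
})

# Replace every NaN with the placeholder string.
filled = df.replace(np.nan, "empty")

# Total number of missing values left after the replacement.
remaining_missing = int(filled.isna().sum().sum())
```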

Answer 3

df = df.astype(str) solved the problem for me, assuming you've already set up your table schema.

Answer 1: 1, accepted, 2019-07-29 06:42:52
Answer 2: 1, 2020-12-15 22:39:43
Answer 3: 0, 2022-08-26 17:54:39