Error while using copy_from in psycopg2 while inserting to a postgresql database
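Whenever I try to insert data from a Pandas DataFrame into a PostgreSQL database, I get this error:
error: extra data after last expected column CONTEXT: COPY recommendations, line 1: "0,4070,"[5963, 8257, 9974, 7546, 11251, 5203, 102888, 8098, 101198, 10950]""
The DataFrame has three columns: the first and second are integers, and the third is a list of integers.
I created the table in PostgreSQL with the function below: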
def create_table(query: str) -> None:
    """
    :param query: A string of the query to create table in the database
    :return: None
    """
    try:
        logger.info("Creating the table in the database")
        conn = psycopg2.connect(host=HOST, dbname=DATABASE_NAME, user=USER, password=PASSWORD, port=PORT)
        cur = conn.cursor()
        cur.execute(query)
        conn.commit()
        logger.info("Successfully created a table in the database using this query {}".format(query))
        return
    except (Exception, psycopg2.Error) as e:
        logger.error("An error occurred while creating a table using the query {} with exception {}".format(query, e))
    finally:
        if conn is not None:
            conn.close()
            logger.info("Connection closed!")
The query passed to this function is:
create_table_query = '''CREATE TABLE Recommendations
    (id INT NOT NULL,
    applicantId INT NOT NULL,
    recommendation INTEGER[],
    PRIMARY KEY(id),
    CONSTRAINT applicantId
        FOREIGN KEY(applicantId)
        REFERENCES public."Applicant"(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
    ); '''
Then I use the function below to copy the DataFrame into the table created in Postgres.
def copy_from_file(df: pd.DataFrame, table: str = "recommendations") -> None:
    """
    Here we are going save the dataframe on disk as
    a csv file, load the csv file
    and use copy_from() to copy it to the table
    """
    conn = psycopg2.connect(host=HOST, dbname=DATABASE_NAME, user=USER, password=PASSWORD, port=PORT)
    # Save the dataframe to disk
    tmp_df = "./tmp_dataframe.csv"
    df.to_csv(tmp_df, index_label='id', header=False)
    f = open(tmp_df, 'r')
    cursor = conn.cursor()
    try:
        cursor.copy_from(f, table, sep=",")
        conn.commit()
    except (Exception, psycopg2.DatabaseError) as error:
        os.remove(tmp_df)
        logger.error("Error: %s" % error)
        conn.rollback()
        cursor.close()
    logger.info("copy_from_file() done")
    cursor.close()
    os.remove(tmp_df)
But I still get this error:
error: extra data after last expected column CONTEXT: COPY recommendations, line 1: "0,4070,"[5963, 8257, 9974, 7546, 11251, 5203, 102888, 8098, 101198, 10950]""
Any suggestions on how to solve this? Thanks.
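copy_from uses the text format, not the csv format. You are telling it to use , as the separator, but that does not change the protection (quoting) method it tries to use. So the commas inside the quotes are not treated as protected; they are treated as field separators, so of course there are too many of them.
I think you need to use copy_expert and tell it to use the csv format.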
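The suggestion above could be sketched roughly as follows. This is a minimal illustration, not the asker's code: the helper names (to_pg_array, rows_to_csv_buffer, copy_rows) and the COPY_SQL constant are made up for the example, and it writes the CSV to an in-memory buffer instead of a temporary file. It also makes one assumption that goes beyond the answer: PostgreSQL array input uses curly braces ({1,2,3}), so the Python list text [1, 2, 3] that to_csv writes would most likely be rejected by the INTEGER[] column even once the CSV quoting is understood. The sketch therefore converts each list to a brace literal first.

```python
import csv
import io

# COPY in CSV mode, so quoted fields may contain commas.
# The table name is hard-coded here to match the question.
COPY_SQL = "COPY recommendations FROM STDIN WITH (FORMAT csv)"


def to_pg_array(values):
    """Render a list of ints as a PostgreSQL array literal, e.g. {1,2,3}."""
    return "{" + ",".join(str(int(v)) for v in values) + "}"


def rows_to_csv_buffer(rows):
    """Write (id, applicant_id, recommendation_list) rows to an in-memory CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row_id, applicant_id, recommendation in rows:
        # csv.writer quotes the array field because it contains commas.
        writer.writerow([row_id, applicant_id, to_pg_array(recommendation)])
    buf.seek(0)  # rewind so COPY reads from the start
    return buf


def copy_rows(conn, rows):
    """Bulk-load rows with copy_expert; conn is an open psycopg2 connection."""
    buf = rows_to_csv_buffer(rows)
    cursor = conn.cursor()
    try:
        cursor.copy_expert(COPY_SQL, buf)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()
```

Assuming the DataFrame index holds the id and the two columns are in table order, it could be called as copy_rows(conn, df.itertuples(name=None)), since itertuples(name=None) yields plain (index, column1, column2) tuples.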