Pandas to_sql replace duplicates

I make a request to an API every second with a since parameter (to return the changes since the last request). I convert the response to a DataFrame and would like to insert it quickly into MySQL, replacing duplicate rows, with something like this:

REPLACE INTO table (column1,column2...) VALUES (val1,val2...) 

I really like the DataFrame.to_sql function, but the problem is that it has no option to replace duplicate rows. The only way I can see with DataFrame.to_sql is to drop and recreate the table on every call with if_exists='replace', but I think that would hurt performance significantly. Can you suggest a better way to insert data from a DataFrame while replacing duplicate rows?

If your DataFrame is not that large, you can iterate over it, generate INSERT ... ON DUPLICATE KEY UPDATE SQL statements, and execute them against your MySQL database.
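A minimal sketch of that idea, in case it helps. The helper names build_upsert_sql and upsert_dataframe are made up for illustration, and connection is assumed to be a DB-API connection from a driver that uses %s placeholders (PyMySQL, for example). The target table is assumed to already have a PRIMARY KEY or UNIQUE index, since that is what MySQL uses to detect a duplicate.

```python
def build_upsert_sql(table, columns):
    """Build a parameterized INSERT ... ON DUPLICATE KEY UPDATE statement.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    cols = ", ".join(f"`{c}`" for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    # VALUES(col) refers to the value that would have been inserted.
    # Newer MySQL 8.0 releases deprecate it in favour of a row alias,
    # but it is still accepted.
    updates = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in columns)
    return (
        f"INSERT INTO `{table}` ({cols}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


def upsert_dataframe(df, table, connection):
    """Send every row of df to MySQL with one executemany() call."""
    sql = build_upsert_sql(table, list(df.columns))
    # One plain tuple per row, in the same order as df.columns.
    rows = list(df.itertuples(index=False, name=None))
    if not rows:
        return 0
    cursor = connection.cursor()
    try:
        cursor.executemany(sql, rows)
    finally:
        cursor.close()
    connection.commit()
    return len(rows)
```

Calling upsert_dataframe(df, "my_table", connection) once per API response would then insert the new rows and overwrite the ones whose key already exists. NaN values may need converting to None first, depending on the driver.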

It seems there is no way to replace duplicates with DataFrame.to_sql in pandas. Hopefully this will be integrated in the future. I did find a post on how to ignore duplicates, but in my case I decided to take another approach: as @MaxU mentioned, iterate through the DataFrame and execute:

REPLACE INTO table (column1,column2...) VALUES (val1,val2...)  
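For what it's worth, newer pandas versions (0.24 and later, as far as I know) let you pass a callable as the method argument of DataFrame.to_sql, so the REPLACE INTO statement can be plugged into to_sql itself. Below is a rough sketch modelled on the custom insert method example in the pandas docs. The name replace_into_method is made up, and the driver behind the SQLAlchemy engine is assumed to use %s placeholders and to support cursors as context managers (PyMySQL does).

```python
def replace_into_method(pd_table, conn, keys, data_iter):
    """Custom insert method for DataFrame.to_sql(method=...).

    pd_table  : pandas' internal table object (its .name is the table name)
    conn      : SQLAlchemy connection handed over by pandas
    keys      : list of column names being written
    data_iter : iterable yielding one tuple of values per row
    """
    cols = ", ".join(f"`{k}`" for k in keys)
    placeholders = ", ".join(["%s"] * len(keys))
    sql = f"REPLACE INTO `{pd_table.name}` ({cols}) VALUES ({placeholders})"

    rows = list(data_iter)
    if not rows:
        return
    # conn.connection is the underlying DB-API connection.
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cursor:
        cursor.executemany(sql, rows)
```

It would be used as df.to_sql("my_table", engine, if_exists="append", index=False, method=replace_into_method). Keep in mind that REPLACE deletes the conflicting row and inserts a new one, so columns missing from the DataFrame fall back to their defaults, and the table still needs a PRIMARY KEY or UNIQUE index for duplicates to be detected.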
