UPSERT in Oracle using Python Pandas

Question

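I want to use cx_Oracle and pandas to read a csv containing a validated dataset and, based on the primary key, insert or update each record from the csv into an Oracle table. If the primary key column in the csv is null, every column in that row should be inserted into Oracle as a new row. If the primary key value is not null (it already exists in the Oracle table), I want to update only the columns whose values in the csv are not null (empty).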

For example, if my csv looks like this (with ID as the primary key):

ID,FIRSTNAME,LASTNAME,AGE,SALARY
null,John,Smith,30,40000
2,James,Johnson,15,null

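I need to insert every column of the first row so that my Oracle table generates a new ID for it, but for the row with ID = 2 in the Oracle table I only need to update FIRSTNAME, LASTNAME, and AGE.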

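How would I go about 1. generating a new ID for new rows (incrementing from the highest existing ID in the table) and 2. choosing which columns to update based on which columns in the csv are not null?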

Note that the data will be imported into a dataframe before updating/inserting into the database.

Answer 1

You can use a MERGE statement in the database, for example:

import cx_Oracle
import numpy as np
import pandas as pd

con = cx_Oracle.connect('uname/pwd@host:port/service_name')
cursor = con.cursor()

dataset = pd.DataFrame({'id': [np.nan, 2],
                        'firstname': ['John', 'James'],
                        'lastname': ['Smith', 'Johnson'],
                        'age': [30, 15],
                        'salary': [40000, np.nan]
                       })

columns = ['id', 'firstname', 'lastname', 'age', 'salary']
# Replace NaN with '' (Oracle treats an empty string as NULL) and build one dict per row
rows = [dict(zip(columns, row)) for row in dataset.fillna('').values.tolist()]

# Named binds: each value is supplied once per row even though it appears in both branches
sql = ('merge into employee using dual on ( id = :id )'
       ' when matched then update set firstname = :firstname, lastname = :lastname,'
       ' age = :age, salary = :salary'
       ' when not matched then insert values( :id, :firstname, :lastname, :age, :salary )')

for row in rows:
    cursor.execute(sql, row)

con.commit()

This assumes the data has already been imported into a dataframe.
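The MERGE above overwrites every column on a match, so an empty csv cell would replace the stored value with NULL. A minimal sketch of one way to keep existing values is to wrap each assignment in NVL, and to let a sequence supply the ID for new rows. The helper name build_merge_sql and the sequence name employee_seq are assumptions made for illustration, not part of the original answer, and a sequence is a substitute for the "highest ID plus one" approach the question mentions.

```python
def build_merge_sql(table, key, columns, sequence):
    """Return a MERGE statement that uses one named bind (:col) per column."""
    # On a match, NVL keeps the stored value whenever the bound value is NULL.
    updates = ", ".join(f"{c} = nvl(:{c}, {c})" for c in columns)
    # On no match, the key comes from the (hypothetical) sequence, not from the csv.
    insert_cols = ", ".join([key] + columns)
    insert_vals = ", ".join([f"{sequence}.nextval"] + [f":{c}" for c in columns])
    return (
        f"merge into {table} using dual on ({key} = :{key}) "
        f"when matched then update set {updates} "
        f"when not matched then insert ({insert_cols}) values ({insert_vals})"
    )


# Identifiers are hard-coded here; they should never come from user input.
sql = build_merge_sql(
    "employee", "id", ["firstname", "lastname", "age", "salary"], "employee_seq"
)
print(sql)
```

Passing this statement to cursor.executemany with one dict per row (keys id, firstname, lastname, age, salary) should update matched rows and insert the rest. Note that a csv row whose ID is set but missing from the table would also be inserted, with a sequence-generated ID rather than the one in the csv.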

Alternatively, read directly from the existing .csv file:

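import cx_Oracle
import pandas as pd

con = cx_Oracle.connect('uname/pwd@host:port/service_name')
cursor = con.cursor()

source_file = 'C:\\path\\To\\yourFile.csv'
# Read every column as text; empty cells and the literal "null" both load as NaN
df = pd.read_csv(source_file, sep=",", dtype=str)

# Assumes the csv columns appear in this order
columns = ['id', 'firstname', 'lastname', 'age', 'salary']
# Replace NaN with '' (Oracle treats an empty string as NULL) and build one dict per row
dataset = [dict(zip(columns, row)) for row in df.fillna('').values.tolist()]

# Named binds: each value is supplied once per row even though it appears in both branches
sql = ('merge into employee using dual on ( id = :id )'
       ' when matched then update set firstname = :firstname, lastname = :lastname,'
       ' age = :age, salary = :salary'
       ' when not matched then insert values( :id, :firstname, :lastname, :age, :salary )')

# Send all rows in a single batch
cursor.executemany(sql, dataset)

con.commit()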
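As a variation on the fillna('') step, missing cells can be turned into None so that they are bound as NULL explicitly. The sketch below uses the sample csv from the question, held in a string so that it runs without a file or a database. The helper name to_bind_rows is an assumption made for illustration.

```python
import io

import pandas as pd

# Sample data from the question; a real run would pass a file path to read_csv instead.
csv_text = """ID,FIRSTNAME,LASTNAME,AGE,SALARY
null,John,Smith,30,40000
2,James,Johnson,15,null
"""

# dtype=str keeps every value as text; "null" is parsed as a missing value by default.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)


def to_bind_rows(frame):
    """Return one dict per row with lower-case keys and None for missing cells."""
    return [
        {str(col).lower(): (None if pd.isna(val) else val) for col, val in rec.items()}
        for rec in frame.to_dict("records")
    ]


rows = to_bind_rows(df)
print(rows)
```

Each dict can then be supplied to a statement that uses named binds such as :id and :salary. Because the values stay as strings here, numeric columns would rely on Oracle's implicit conversion unless they are cast first.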

Solution 1
0 2022-04-27 14:06:53
