How can I join two dataframes with update in some rows, using Pandas?

I'm new to pandas and I would like to know how I can join two files and update existing rows, taking a specific column into account. The files have thousands of lines. For example:

  • Df_1:

     A  B  C  D
     1  2  5  4
     2  2  6  8
     9  2  2  1

Now, my table 2 has exactly the same columns. I want to join the two tables, replacing the rows that appear in both tables but whose column C has changed, and adding the new rows that exist only in the second table (df_2). For example:

  • Df_2:

     A  B  C  D
     2  2  7  8
     9  2  3  1
     3  4  6  7
     1  2  3  4

So the result I want is the union of the two tables, with a few rows updated in a specific column, like this:

  • Df_result:

     A  B  C  D
     1  2  5  4
     2  2  7  8
     9  2  3  1
     3  4  6  7
     1  2  3  4

How can I do this with the merge or concatenate function? Or is there another way to get the result I want?

Thank you!

You need at least one column to act as a reference (a key), so that pandas knows which rows of the first table should be replaced by rows of the second.

Assuming that in your case the key columns are "A" and "B":

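import pandas as pd
# Key columns that identify a row
ref = ['A', 'B']
# Stack df_2 under df_1 so that its rows come last
df_result = pd.concat([df_1, df_2], ignore_index=True)
# For each key keep only the last row, i.e. the one from df_2 when both tables have it
df_result = df_result.drop_duplicates(subset=ref, keep='last')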
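To check how this behaves on the data from the question, here is a minimal sketch. The frames are typed in by hand from the tables as I read them in the question, and `upsert` is a hypothetical helper name, not a pandas function; the `reset_index(drop=True)` call is an addition that only renumbers the rows.

```python
import pandas as pd

# Data copied by hand from the question's Df_1 and Df_2 tables.
df_1 = pd.DataFrame({"A": [1, 2, 9], "B": [2, 2, 2], "C": [5, 6, 2], "D": [4, 8, 1]})
df_2 = pd.DataFrame({"A": [2, 9, 3, 1], "B": [2, 2, 4, 2], "C": [7, 3, 6, 3], "D": [8, 1, 7, 4]})


def upsert(old, new, keys):
    """Hypothetical helper: rows of `new` replace rows of `old` that share the same keys."""
    combined = pd.concat([old, new], ignore_index=True)
    # keep='last' keeps the row from `new`, because it comes later in `combined`.
    deduplicated = combined.drop_duplicates(subset=keys, keep="last")
    return deduplicated.reset_index(drop=True)


df_result = upsert(df_1, df_2, keys=["A", "B"])
rows = list(df_result.itertuples(index=False, name=None))
# rows == [(2, 2, 7, 8), (9, 2, 3, 1), (3, 4, 6, 7), (1, 2, 3, 4)]
```

With "A" and "B" as the key, the row with A=1, B=2 from df_1 is also replaced by the matching row from df_2, so this gives four rows rather than the five listed in the question. If both of those rows should survive, the key columns would have to be chosen differently.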

Here is a real example:

# First table
d = {'col1': [1, 2, 3], 'col2': ["a", "b", "c"], 'col3': ["aa", "bb", "cc"]}
df1 = pd.DataFrame(data=d)
# Second table: the key (1, "a") also exists in df1, with a different col3
d = {'col1': [1, 4, 5], 'col2': ["a", "d", "f"], 'col3': ["dd", "ee", "ff"]}
df2 = pd.DataFrame(data=d)

df_result = pd.concat([df1, df2], ignore_index=True)

# col1 and col2 are the key columns; the row from df2 wins
df_result = df_result.drop_duplicates(subset=['col1', 'col2'], keep='last')
df_result
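For reference, the same example can be rerun as a self-contained script to inspect the rows that remain. This is only a sketch: the `reset_index(drop=True)` call and the `rows` variable are additions for illustration and are not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3], 'col2': ["a", "b", "c"], 'col3': ["aa", "bb", "cc"]})
df2 = pd.DataFrame({'col1': [1, 4, 5], 'col2': ["a", "d", "f"], 'col3': ["dd", "ee", "ff"]})

df_result = pd.concat([df1, df2], ignore_index=True)
df_result = df_result.drop_duplicates(subset=['col1', 'col2'], keep='last')
# Addition for illustration: renumber the rows 0..n-1 after dropping duplicates.
df_result = df_result.reset_index(drop=True)

rows = list(df_result.itertuples(index=False, name=None))
# rows == [(2, 'b', 'bb'), (3, 'c', 'cc'), (1, 'a', 'dd'), (4, 'd', 'ee'), (5, 'f', 'ff')]
```

The row with key (1, "a") now carries "dd" from df2 instead of "aa" from df1, and the rows with keys (4, "d") and (5, "f") are appended as new rows.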

