
Update certain values of a column from another dataframe based on condition of multiple columns

My dataframe1:

id    filler       ent    seg    val     text
1     M,0-10       CP     BEC    20       abc
2     M,10-20      D      BWC    30       abc
3     Y,0-10       CP     CCD    40       abc
4     Y,10-20      D      CFC    50       abc

dataframe2:

id    filler       ent    seg    val     text
1     M,0-10       CP     BEC    20       xyz
4     Y,10-20      D      CFC    50       xyz

I need to create a result dataframe:

id    filler       ent    seg    val     text
1     M,0-10       CP     BEC    20       xyz
2     M,10-20      D      BWC    30       abc
3     Y,0-10       CP     CCD    40       abc
4     Y,10-20      D      CFC    50       xyz

It should check whether all the columns apart from text have the same values in both dataframes and, where they do, replace the text in dataframe1 with the text from dataframe2. My dataframe1 has 100 rows and dataframe2 has 20 rows.

You can perform a left merge of dataframe2 onto dataframe1, and use the indicator column to find the rows that need to be updated in dataframe1.

import pandas as pd

columns = ['id','filler','ent','seg','val','text']

df1 = pd.DataFrame([
    [1, 'M,0-10','CP','BEC',20, 'abc'],
    [2,'M,10-20','D','BWC',30,'abc'],
    [3,'Y,0-10','CP','CCD',40,'abc'],
    [4,'Y,10-20','D','CFC',50,'abc'],
], columns=columns)

df2 = pd.DataFrame([
    [1,'M,0-10','CP','BEC',20,'xyz'],
    [4,'Y,10-20','D','CFC',50,'xyz'],
], columns=columns)

Merge dataframe2 onto dataframe1 on every column except text, with an indicator column:

columns_merge = [x for x in columns if x!='text']
updated = df1.merge(df2, on=columns_merge, how='left', indicator=True)
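To make the role of the indicator column concrete, here is a minimal sketch that rebuilds the two sample frames and inspects the intermediate result. The names `key_columns`, `merged` and `merge_flags` are only illustrative and are not part of the answer above.

```python
import pandas as pd

columns = ["id", "filler", "ent", "seg", "val", "text"]

df1 = pd.DataFrame([
    [1, "M,0-10", "CP", "BEC", 20, "abc"],
    [2, "M,10-20", "D", "BWC", 30, "abc"],
    [3, "Y,0-10", "CP", "CCD", 40, "abc"],
    [4, "Y,10-20", "D", "CFC", 50, "abc"],
], columns=columns)

df2 = pd.DataFrame([
    [1, "M,0-10", "CP", "BEC", 20, "xyz"],
    [4, "Y,10-20", "D", "CFC", 50, "xyz"],
], columns=columns)

# Every column except 'text' acts as a join key.
key_columns = [c for c in columns if c != "text"]

# indicator=True adds a '_merge' column that records where each row came from.
merged = df1.merge(df2, on=key_columns, how="left", indicator=True)

# 'text' exists in both frames, so pandas suffixes it: text_x (left), text_y (right).
print(merged[["id", "text_x", "text_y", "_merge"]])

# '_merge' is categorical; convert to plain strings for easy inspection.
merge_flags = merged["_merge"].astype(str).tolist()
```

Rows 1 and 4 are flagged `both` because all key columns match a row in df2; the other rows are `left_only` and their `text_y` is missing.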

Use the indicator column to select the rows that matched in both dataframes, and overwrite their text with the value from dataframe2:

same = updated['_merge']=='both'
updated.loc[same,'text_x'] = updated.loc[same,'text_y']

Drop the helper columns and rename text_x back to text:

updated.drop(columns=['text_y','_merge'], inplace=True)
updated.rename(columns = {'text_x': 'text'}, inplace=True)
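As a possible variation on the steps above (not the answer's own method), the indicator and rename steps can be avoided by passing `suffixes` to the merge and filling from the new column. This sketch assumes the `text` values in df2 are never missing, since `fillna` would otherwise keep the old value for a matched row. The suffix `_new` and the name `key_columns` are arbitrary choices.

```python
import pandas as pd

columns = ["id", "filler", "ent", "seg", "val", "text"]

df1 = pd.DataFrame([
    [1, "M,0-10", "CP", "BEC", 20, "abc"],
    [2, "M,10-20", "D", "BWC", 30, "abc"],
    [3, "Y,0-10", "CP", "CCD", 40, "abc"],
    [4, "Y,10-20", "D", "CFC", 50, "abc"],
], columns=columns)

df2 = pd.DataFrame([
    [1, "M,0-10", "CP", "BEC", 20, "xyz"],
    [4, "Y,10-20", "D", "CFC", 50, "xyz"],
], columns=columns)

key_columns = [c for c in columns if c != "text"]

# An empty left suffix keeps df1's column named 'text';
# df2's column arrives as 'text_new' (NaN where no row matched).
updated = df1.merge(df2, on=key_columns, how="left", suffixes=("", "_new"))

# Take the new value where one exists, otherwise keep the original.
updated["text"] = updated["text_new"].fillna(updated["text"])
updated = updated.drop(columns="text_new")

print(updated)
```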

updated =

   id   filler ent  seg  val text
0   1   M,0-10  CP  BEC   20  xyz
1   2  M,10-20   D  BWC   30  abc
2   3   Y,0-10  CP  CCD   40  abc
3   4  Y,10-20   D  CFC   50  xyz
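For the larger frames mentioned in the question, the same steps could be wrapped in a helper. The sketch below is one way to do that; the function name `update_column_from` and its parameters are hypothetical. It assumes both frames share the same columns, and it adds `validate="many_to_one"` so that pandas raises a `MergeError` if df2 contains duplicate key rows, which would otherwise silently duplicate rows of df1 in the result.

```python
import pandas as pd


def update_column_from(target, source, value_column):
    """Return a copy of `target` with `value_column` replaced by the value from
    `source` wherever all other columns match. Hypothetical helper."""
    key_columns = [c for c in target.columns if c != value_column]
    merged = target.merge(
        source,
        on=key_columns,
        how="left",
        indicator=True,
        validate="many_to_one",  # fail loudly if keys are duplicated in `source`
    )
    matched = merged["_merge"] == "both"
    left_name, right_name = value_column + "_x", value_column + "_y"
    merged.loc[matched, left_name] = merged.loc[matched, right_name]
    return (
        merged.drop(columns=[right_name, "_merge"])
              .rename(columns={left_name: value_column})
    )


columns = ["id", "filler", "ent", "seg", "val", "text"]
df1 = pd.DataFrame([
    [1, "M,0-10", "CP", "BEC", 20, "abc"],
    [2, "M,10-20", "D", "BWC", 30, "abc"],
    [3, "Y,0-10", "CP", "CCD", 40, "abc"],
    [4, "Y,10-20", "D", "CFC", 50, "abc"],
], columns=columns)
df2 = pd.DataFrame([
    [1, "M,0-10", "CP", "BEC", 20, "xyz"],
    [4, "Y,10-20", "D", "CFC", 50, "xyz"],
], columns=columns)

result = update_column_from(df1, df2, "text")
print(result)
```

Note that the merge returns a new frame with a fresh 0..n-1 index, so df1 itself is left unchanged.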


