Select rows from a DataFrame based on values in another DataFrame and update one of its columns with values from the second DataFrame

I have two DataFrames, df and df1.

The main DataFrame, df, is as follows:

  start end  price
0     A   Z      1
1     B   Y      2
2     C   X      3
3     A   Z      4
4     D   W      5

The second DataFrame, df1:

  start end  price
0     A   Z    100
1     B   Y    200

I want to update the 'price' column of the main DataFrame df based on the start and end values in df1. Every row of df that has the same start and end as a row in df1 should take its price from df1. The expected result is:

  start end  price
0     A   Z    100
1     B   Y    200
2     C   X      3
3     A   Z    100
4     D   W      5

(All (A, Z) and (B, Y) rows in df should get updated.) Is there any way I can get this output? In reality the DataFrames have more columns, but I want to update only one column (e.g. 'price').

First, you can merge. Both frames have a price column, so the result carries price_x (from df) and price_y (from df1):

s = df.merge(df1, on=['start', 'end'], how='left')

Then you can fillna and select your desired columns:

s.assign(price=s.price_y.fillna(s.price_x))[['start', 'end', 'price']]

  start end  price
0     A   Z  100.0
1     B   Y  200.0
2     C   X    3.0
3     A   Z  100.0
4     D   W    5.0
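
For reference, here is a self-contained sketch of the merge-then-fillna approach using the sample data from the question. The suffix names `_old` and `_new`, the `validate="m:1"` check, and the final cast back to int are my own additions, not part of the answer above. The check assumes each (start, end) pair appears at most once in df1, and the cast assumes no price is missing.

```python
import pandas as pd

df = pd.DataFrame({
    "start": ["A", "B", "C", "A", "D"],
    "end": ["Z", "Y", "X", "Z", "W"],
    "price": [1, 2, 3, 4, 5],
})
df1 = pd.DataFrame({
    "start": ["A", "B"],
    "end": ["Z", "Y"],
    "price": [100, 200],
})

# A left merge keeps every row of df. validate="m:1" raises if a
# (start, end) pair is duplicated in df1, which would multiply rows.
merged = df.merge(
    df1,
    on=["start", "end"],
    how="left",
    suffixes=("_old", "_new"),  # illustrative names instead of _x / _y
    validate="m:1",
)

# Take the new price where a match exists, otherwise keep the old one.
# Unmatched rows are NaN after the merge, so the column becomes float;
# casting back to int is safe here only because nothing stays missing.
price = merged["price_new"].fillna(merged["price_old"]).astype(int)
result = merged.assign(price=price)[["start", "end", "price"]]

print(result)
```

If df1 may legitimately contain repeated key pairs, you would need to decide which price wins (for example by dropping duplicates first) before merging.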

Using update:

df = df.set_index(['start', 'end'])
df.update(df1.set_index(['start', 'end']))
df.reset_index()

  start end  price
0     A   Z  100.0
1     B   Y  200.0
2     C   X    3.0
3     A   Z  100.0
4     D   W    5.0

Using merge:

df.drop(columns='price').merge(df1, how='left').fillna(df)

  start end  price
0     A   Z  100.0
1     B   Y  200.0
2     C   X    3.0
3     A   Z  100.0
4     D   W    5.0

  1. I'm going to merge on ['start', 'end'], and that pesky price column is going to get in my way, so I drop it.
  2. I need to preserve df's index because the ('A', 'Z') pair is repeated, so I use a 'left' merge.
  3. Now the missing elements can be filled back in from df.
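
The three steps above can be sketched as a small helper. The function name `update_column`, the extra `qty` column, and the letter index are hypothetical and only there for illustration. The sketch assumes the key pairs in df1 are unique, and it assigns the result by position so that df may carry any index.

```python
import pandas as pd

def update_column(df, df1, keys, column):
    """Hypothetical helper: overwrite `column` in df with df1's value where `keys` match."""
    # Step 1: drop the column so it cannot collide during the merge.
    # Step 2: a left merge keeps all rows of df in their original order
    #         (the merged frame gets a fresh 0..n-1 index).
    merged = df.drop(columns=column).merge(
        df1[keys + [column]], on=keys, how="left"
    )
    # Step 3: fill unmatched rows (NaN) from the original values, aligned
    #         by position, then write the result back by position.
    filled = merged[column].fillna(df[column].reset_index(drop=True))
    out = df.copy()
    out[column] = filled.to_numpy()
    return out

df = pd.DataFrame(
    {
        "start": ["A", "B", "C", "A", "D"],
        "end": ["Z", "Y", "X", "Z", "W"],
        "price": [1, 2, 3, 4, 5],
        "qty": [10, 20, 30, 40, 50],  # extra column that must stay untouched
    },
    index=list("abcde"),
)
df1 = pd.DataFrame({"start": ["A", "B"], "end": ["Z", "Y"], "price": [100, 200]})

out = update_column(df, df1, keys=["start", "end"], column="price")
print(out)
```

Only the named column changes; the other columns and the original index are carried over from df. The updated column comes back as float because unmatched rows pass through NaN during the merge.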
