
Putting values in a column of a pandas DataFrame from another DataFrame based on multiple condition checks

I have two dataframes, df1 and df2. I want to add a column new_id to df1 with values taken from df2.

import numpy as np
import pandas as pd

s = {'id': [4735, 46, 2345, 8768, 807, 7896],
     'st': ['a', 'a', 'd', 'e', 'f', 'a'],
     'rd': ['CU', 'SU', 'NU', 'NU', 'W', 'CU'],
     'cm': ['m', 'm', 'm', 'm', 'm', 'm']}
df1 = pd.DataFrame(s)

df1

     id st  rd cm
0  4735  a  CU  m
1    46  a  SU  m
2  2345  d  NU  m
3  8768  e  NU  m
4   807  f   W  m
5  7896  a  CU  m


s2 = {'id': [1234,4567,1357,2367,8765, 8796, 5687, 4565, 7865],
     'st': ['a', 'a', 'd', 'd', 'f', 'f','e' ,'e','a'], 
     'rd' : ['CU', 'SU', 'NU', 'W', 'W','NU','W','CU','W'], 
     'cm' : ['s', 's', 's', 's', 's','s','s','s','s']}
df2 = pd.DataFrame(s2)

df2

     id st  rd cm
0  1234  a  CU  s
1  4567  a  SU  s
2  1357  d  NU  s
3  2367  d   W  s
4  8765  f   W  s
5  8796  f  NU  s
6  5687  e   W  s
7  4565  e  CU  s
8  7865  a   W  s

I want new_id in df1 to be filled from the id column of df2, picking a row of df2 whose st value is the same and whose rd value is different.

Once a value has been picked from df2, it should not be used again. How can I do this in pandas?

I am expecting this result:

     id st  rd cm  new_id
0  4735  a  CU  m  4567
1    46  a  SU  m  1234
2  2345  d  NU  m  2367
3  8768  e  NU  m  5687
4   807  f   W  m  8796
5  7896  a  CU  m  7865

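Use an np.equal.outer comparison to build the cross-dataframe match matrix, and np.argmax to retrieve the position of the first match in each row. Note that np.argmax picks the first matching df2 row for each df1 row independently, so on its own it does not stop an id from being used twice (the last row of df1, index 5, would get 4567 again).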

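# comp[i, j] is True when df2 row j has the same st as df1 row i and a different rd (to_numpy avoids relying on ufunc.outer support for Series)
comp = np.equal.outer(df1.st.to_numpy(), df2.st.to_numpy()) & ~np.equal.outer(df1.rd.to_numpy(), df2.rd.to_numpy())
df1['new_id'] = df2.id.iloc[np.argmax(comp, axis=1)].tolist()  # position of the first True in each row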

     id st  rd cm  new_id
0  4735  a  CU  m    4567
1    46  a  SU  m    1234
2  2345  d  NU  m    2367
3  8768  e  NU  m    5687
4   807  f   W  m    8796
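
Since the question also asks that an id from df2 is never used twice, here is a minimal sketch of one way to enforce that on top of the same match matrix. It is an illustration, not the answerer's code: it swaps np.equal.outer for plain NumPy broadcasting, and the names available and new_ids are made up for the example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': [4735, 46, 2345, 8768, 807, 7896],
                    'st': ['a', 'a', 'd', 'e', 'f', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'NU', 'W', 'CU'],
                    'cm': ['m'] * 6})
df2 = pd.DataFrame({'id': [1234, 4567, 1357, 2367, 8765, 8796, 5687, 4565, 7865],
                    'st': ['a', 'a', 'd', 'd', 'f', 'f', 'e', 'e', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'W', 'W', 'NU', 'W', 'CU', 'W'],
                    'cm': ['s'] * 9})

# comp[i, j] is True when df2 row j is a valid candidate for df1 row i:
# same st, different rd. Broadcasting builds the full len(df1) x len(df2) grid.
same_st = df1['st'].to_numpy()[:, None] == df2['st'].to_numpy()
diff_rd = df1['rd'].to_numpy()[:, None] != df2['rd'].to_numpy()
comp = same_st & diff_rd

ids = df2['id'].to_numpy()
available = np.ones(len(df2), dtype=bool)  # hypothetical flag array: True while a df2 row is unused
new_ids = []
for row in comp:
    candidates = np.flatnonzero(row & available)
    if candidates.size == 0:
        new_ids.append(None)  # assumption: leave the value missing when nothing is left
        continue
    j = candidates[0]         # greedy choice: first unused match
    available[j] = False      # mark it as used so later rows skip it
    new_ids.append(int(ids[j]))

result = df1.assign(new_id=new_ids)
print(result)
```

The loop is greedy, so it takes the first unused match in df2 order for each row of df1. It does not search for an assignment that maximises the number of matched rows.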

How about this? It works on a copy of df2 and drops each row once its id has been picked, so no id is used twice.

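df3 = df2.copy()  # work on a copy so that df2 itself is left untouched

def pick_id(row):
    # rows still left in df3 with the same st and a different rd
    mask = (df3['st'] == row['st']) & (df3['rd'] != row['rd'])
    tmp = df3.loc[mask, 'id']
    if tmp.empty:
        return None  # no unused match is left for this row
    idx = tmp.index[0]
    val = tmp.loc[idx]
    df3.drop(idx, inplace=True)  # remove the picked row so it cannot be reused
    return val

df1.assign(new_id=df1.apply(pick_id, axis=1))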

     id st  rd cm  new_id
0  4735  a  CU  m    4567
1    46  a  SU  m    1234
2  2345  d  NU  m    2367
3  8768  e  NU  m    5687
4   807  f   W  m    8796
5  7896  a  CU  m    7865
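
The same greedy idea can also be written without mutating a module-level copy, by tracking the df2 index labels that are already used. The following is only a sketch under that assumption. The helper name assign_new_ids and its parameters are hypothetical, and the st, rd and id column names are taken from the question.

```python
import pandas as pd

def assign_new_ids(left, right):
    """Greedy first-match assignment (hypothetical helper, not a pandas API)."""
    used = set()   # index labels of `right` that have already been picked
    picked = []
    for row in left.itertuples(index=False):
        mask = ((right['st'] == row.st)
                & (right['rd'] != row.rd)
                & ~right.index.isin(used))
        matches = right.index[mask.to_numpy()]
        if len(matches) == 0:
            picked.append(None)  # assumption: leave the value missing when nothing is left
            continue
        label = matches[0]
        used.add(label)
        picked.append(right.at[label, 'id'])
    # assign returns a new frame; neither input is modified
    return left.assign(new_id=picked)

df1 = pd.DataFrame({'id': [4735, 46, 2345, 8768, 807, 7896],
                    'st': ['a', 'a', 'd', 'e', 'f', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'NU', 'W', 'CU'],
                    'cm': ['m'] * 6})
df2 = pd.DataFrame({'id': [1234, 4567, 1357, 2367, 8765, 8796, 5687, 4565, 7865],
                    'st': ['a', 'a', 'd', 'd', 'f', 'f', 'e', 'e', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'W', 'W', 'NU', 'W', 'CU', 'W'],
                    'cm': ['s'] * 9})

out = assign_new_ids(df1, df2)
print(out)
```

Because the used labels live inside the function, calling it a second time starts from a clean state, whereas the df3 version above needs df3 to be recreated before each run.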
