Putting values in a column of a pandas dataframe from another dataframe based on multiple condition checks
I have two dataframes, df1 and df2. I want to add a column new_id to df1 with values taken from df2.
import pandas as pd

s = {'id': [4735, 46, 2345, 8768, 807, 7896],
     'st': ['a', 'a', 'd', 'e', 'f', 'a'],
     'rd': ['CU', 'SU', 'NU', 'NU', 'W', 'CU'],
     'cm': ['m', 'm', 'm', 'm', 'm', 'm']}
df1 = pd.DataFrame(s)
df1
     id st  rd cm
0  4735  a  CU  m
1    46  a  SU  m
2  2345  d  NU  m
3  8768  e  NU  m
4   807  f   W  m
5  7896  a  CU  m
s2 = {'id': [1234,4567,1357,2367,8765, 8796, 5687, 4565, 7865],
'st': ['a', 'a', 'd', 'd', 'f', 'f','e' ,'e','a'],
'rd' : ['CU', 'SU', 'NU', 'W', 'W','NU','W','CU','W'],
'cm' : ['s', 's', 's', 's', 's','s','s','s','s']}
df2 = pd.DataFrame(s2)
df2
     id st  rd cm
0  1234  a  CU  s
1  4567  a  SU  s
2  1357  d  NU  s
3  2367  d   W  s
4  8765  f   W  s
5  8796  f  NU  s
6  5687  e   W  s
7  4565  e  CU  s
8  7865  a   W  s
I want new_id in df1 to be filled from the id column of df2, where the st value is the same and the rd value is different. Once a value has been picked from df2, it should not be used again. How can I do this in pandas?
I am expecting this result:
     id st  rd cm  new_id
0  4735  a  CU  m    4567
1    46  a  SU  m    1234
2  2345  d  NU  m    2367
3  8768  e  NU  m    5687
4   807  f   W  m    8796
5  7896  a  CU  m    7865
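Use an np.equal.outer comparison to build the cross-dataframe match, and np.argmax to retrieve the index of the first match for each row of df1. Note that np.argmax treats each row independently, so on its own it does not prevent the same df2 id from being picked twice.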
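import numpy as np
# comp[i, j]: df2 row j has the same st as df1 row i and a different rd (plain arrays, since ufunc.outer may reject a Series)
comp = np.equal.outer(df1.st.to_numpy(), df2.st.to_numpy()) & ~np.equal.outer(df1.rd.to_numpy(), df2.rd.to_numpy())
df1['new_id'] = df2.id.iloc[np.argmax(comp, axis=1)].tolist()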
     id st  rd cm  new_id
0  4735  a  CU  m    4567
1    46  a  SU  m    1234
2  2345  d  NU  m    2367
3  8768  e  NU  m    5687
4   807  f   W  m    8796
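If the "never reuse a df2 value" requirement matters, one option is to keep the same pairwise mask but walk it row by row, remembering which df2 rows are already taken. The following is only a sketch under that assumption: the names used, new_ids and result are illustrative and not from the original answer, and rows with no free candidate are filled with pd.NA as an arbitrary choice.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': [4735, 46, 2345, 8768, 807, 7896],
                    'st': ['a', 'a', 'd', 'e', 'f', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'NU', 'W', 'CU'],
                    'cm': ['m'] * 6})
df2 = pd.DataFrame({'id': [1234, 4567, 1357, 2367, 8765, 8796, 5687, 4565, 7865],
                    'st': ['a', 'a', 'd', 'd', 'f', 'f', 'e', 'e', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'W', 'W', 'NU', 'W', 'CU', 'W'],
                    'cm': ['s'] * 9})

# comp[i, j] is True when df2 row j has the same st as df1 row i and a different rd
st1, st2 = df1['st'].to_numpy(), df2['st'].to_numpy()
rd1, rd2 = df1['rd'].to_numpy(), df2['rd'].to_numpy()
comp = (st1[:, None] == st2[None, :]) & (rd1[:, None] != rd2[None, :])

used = np.zeros(len(df2), dtype=bool)    # df2 rows that are already assigned
new_ids = []
for row in comp:
    free = np.flatnonzero(row & ~used)   # candidate positions still available
    if free.size:
        j = free[0]                      # greedy: take the first free candidate
        used[j] = True
        new_ids.append(df2['id'].iloc[j])
    else:
        new_ids.append(pd.NA)            # assumption: mark rows without a match

result = df1.assign(new_id=new_ids)
```

The choice is greedy in df1 row order, so an earlier row can take a candidate that a later row would have needed; for the sample data this reproduces the expected result from the question.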
How about this?
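df3 = df2.copy()  # working copy, so rows can be dropped without touching df2

def cond(row):
    # candidates still in df3: same st, different rd
    mask = (df3['st'] == row['st']) & (df3['rd'] != row['rd'])
    tmp = df3.loc[mask, 'id']
    val = tmp.iloc[0]            # take the first remaining candidate
    idx = tmp.index[0]
    df3.drop(idx, inplace=True)  # remove it so it cannot be picked again
    return val

df1.assign(new_id=df1.apply(cond, axis=1))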
     id st  rd cm  new_id
0  4735  a  CU  m    4567
1    46  a  SU  m    1234
2  2345  d  NU  m    2367
3  8768  e  NU  m    5687
4   807  f   W  m    8796
5  7896  a  CU  m    7865
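Since the question has three conditions (same st, different rd, no df2 id used twice), it can help to check a candidate result against all of them. Below is a minimal sketch of such a check; the helper name is_valid_assignment and the _src suffix are made up for illustration and are not part of pandas or of the answers above.

```python
import pandas as pd

def is_valid_assignment(result: pd.DataFrame, source: pd.DataFrame) -> bool:
    """Return True if every new_id is unique and points at a source row
    with the same st and a different rd."""
    assigned = result.dropna(subset=['new_id'])
    if not assigned['new_id'].is_unique:
        return False                     # some source id was used twice
    # attach the source row behind each new_id; source columns get the _src suffix
    merged = assigned.merge(source, left_on='new_id', right_on='id',
                            suffixes=('', '_src'))
    if len(merged) != len(assigned):
        return False                     # a new_id does not exist in source
    same_st = (merged['st'] == merged['st_src']).all()
    diff_rd = (merged['rd'] != merged['rd_src']).all()
    return bool(same_st and diff_rd)

df1 = pd.DataFrame({'id': [4735, 46, 2345, 8768, 807, 7896],
                    'st': ['a', 'a', 'd', 'e', 'f', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'NU', 'W', 'CU'],
                    'cm': ['m'] * 6})
df2 = pd.DataFrame({'id': [1234, 4567, 1357, 2367, 8765, 8796, 5687, 4565, 7865],
                    'st': ['a', 'a', 'd', 'd', 'f', 'f', 'e', 'e', 'a'],
                    'rd': ['CU', 'SU', 'NU', 'W', 'W', 'NU', 'W', 'CU', 'W'],
                    'cm': ['s'] * 9})

expected = df1.assign(new_id=[4567, 1234, 2367, 5687, 8796, 7865])
reused = df1.assign(new_id=[4567, 1234, 2367, 5687, 8796, 4567])  # 4567 twice

ok_expected = is_valid_assignment(expected, df2)
ok_reused = is_valid_assignment(reused, df2)
```

Here ok_expected is True for the result requested in the question, while ok_reused is False because 4567 appears twice.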