Conditional update of pandas dataframe from another dataframe
I have a master dataframe with two sets of values:
import pandas as pd

df1 = pd.DataFrame({'id1': [1, 1, 2, 2],
                    'dir1': [True, False, True, False],
                    'value1': [55, 40, 84, 31],
                    'id2': [3, 3, 4, 4],
                    'dir2': [True, False, False, True],
                    'value2': [60, 30, 7, 15]})
   id1   dir1  value1  id2   dir2  value2
0    1   True      55    3   True      60
1    1  False      40    3  False      30
2    2   True      84    4  False       7
3    2  False      31    4   True      15
I then have an update dataframe that looks like this:
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'value': [21, 22, 23, 24]})
   id  value
0   1     21
1   2     22
2   3     23
3   4     24
I want to update df1 with the new values from df2, but only where dirX is True. The data should then look like this (updated values are marked with *):
   id1   dir1  value1  id2   dir2  value2
0    1   True     *21    3   True     *23
1    1  False      40    3  False      30
2    2   True     *22    4  False       7
3    2  False      31    4   True     *24
Any idea if something like this is even possible? I tried looking at .update but I could not get it to work. I'm fairly new to Python and only coding at 23:00, so maybe I'm just not as sharp as I need to be.
I agree with Thales' answer. First, you merge df2 with df1 based on id1:
df = df1.merge(df2, left_on='id1', right_on='id')
Then, you replace value1 with value wherever dir1 is True:
import numpy as np
df['value1'] = np.where(df['dir1'], df['value'], df['value1'])
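In case np.where is new to you, here is a tiny sketch of what it does on its own: it picks element by element from the second argument where the condition is True and from the third argument otherwise. The arrays below are just the dir1, value and value1 columns after the first merge, written out by hand for illustration.

```python
import numpy as np

cond = np.array([True, False, True, False])   # dir1
new = np.array([21, 21, 22, 22])              # value, after merging on id1
old = np.array([55, 40, 84, 31])              # value1

# Take `new` where cond is True, otherwise keep `old`.
picked = np.where(cond, new, old)
print(picked)  # [21 40 22 31]
```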
Then, you drop the extra columns:
df = df.drop(['id', 'value'], axis=1)
Then, you merge df2 with df1 based on id2:
df = df.merge(df2, left_on='id2', right_on='id')
Do the same replacement, but for value2:
df['value2'] = np.where(df['dir2'], df['value'], df['value2'])
Then, drop the extra columns:
df = df.drop(['id', 'value'], axis=1)
The resulting dataframe will look like:
   id1   dir1  value1  id2   dir2  value2
0    1   True      21    3   True      23
1    1  False      40    3  False      30
2    2   True      22    4  False       7
3    2  False      31    4   True      24
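For convenience, the steps above can be gathered into one runnable script. This is only a sketch: update_pair is a hypothetical helper name of mine, and I have assumed how='left' plus a notna() check so that rows whose id is missing from df2 are kept unchanged rather than dropped (the default inner merge would drop them).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id1': [1, 1, 2, 2],
                    'dir1': [True, False, True, False],
                    'value1': [55, 40, 84, 31],
                    'id2': [3, 3, 4, 4],
                    'dir2': [True, False, False, True],
                    'value2': [60, 30, 7, 15]})
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'value': [21, 22, 23, 24]})


def update_pair(df, updates, suffix):
    """Hypothetical helper: update value<suffix> where dir<suffix> is True."""
    id_col, dir_col, val_col = f'id{suffix}', f'dir{suffix}', f'value{suffix}'
    # Left merge keeps every row of df, in its original order.
    merged = df.merge(updates, left_on=id_col, right_on='id', how='left')
    # Only overwrite where the flag is set AND a new value was found.
    cond = merged[dir_col] & merged['value'].notna()
    merged[val_col] = np.where(cond, merged['value'], merged[val_col])
    # Remove the helper columns brought in by the merge.
    return merged.drop(columns=['id', 'value'])


result = df1
for suffix in ('1', '2'):
    result = update_pair(result, df2, suffix)

print(result)
```

Looping over the suffixes avoids repeating the merge/replace/drop sequence by hand for each idX/dirX/valueX group.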
Try using the np.where function from numpy.
Maybe something like this:
df1['value1'] = np.where(df1['dir1'], df2['value'], df1['value1'])
You will probably need some adjustments or a merge first, since df2['value'] is paired with df1 by row position here rather than matched on id, but I think this will help you find a solution.
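One possible adjustment, sketched under the assumption that the ids in df2 are unique: turn df2 into a lookup Series indexed by id, map each idX column through it, and assign only where dirX is True. The names lookup, out, new_vals and cond are mine, not from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'id1': [1, 1, 2, 2],
                    'dir1': [True, False, True, False],
                    'value1': [55, 40, 84, 31],
                    'id2': [3, 3, 4, 4],
                    'dir2': [True, False, False, True],
                    'value2': [60, 30, 7, 15]})
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'value': [21, 22, 23, 24]})

# Lookup table: id -> new value (assumes ids in df2 are unique).
lookup = df2.set_index('id')['value']

out = df1.copy()  # leave df1 itself untouched
for n in (1, 2):
    id_col, dir_col, val_col = f'id{n}', f'dir{n}', f'value{n}'
    # New value for each row, matched on id; NaN if the id is not in df2.
    new_vals = out[id_col].map(lookup)
    # Update only flagged rows that actually have a new value.
    cond = out[dir_col] & new_vals.notna()
    out.loc[cond, val_col] = new_vals[cond].astype(out[val_col].dtype)

print(out)
```

This avoids the temporary id and value columns that a merge adds, at the cost of requiring unique ids in df2.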