Set df's column value based on several conditions from another df
I want to set the values of a column in one dataframe based on values from another dataframe.
Example:
df1
A | B | C |
100 20.1
100 21.3
100 22.0
100 23.6
100 24.0
100 25.8
df2
A | B | D
100 20 AC1
100 22 AC2
100 23 AC3
100 25 AC4
100 29 AC5
200 20 AC1
200 34 AC2
200 37 AC3
I want df1['C'] to look like this:
AC1
AC1
AC2
AC3
AC3
AC4
I.e., something like:
df1['C'] = df2['D'].where((df2['A'] == df1['A']) & (df2['B'] < df1['B']))
You could use pd.merge and ffill to fill the missing values:
df1['C'] = pd.merge(df1, df2, how='left', on = ['A', 'B']).fillna(method='ffill')['D']
Output:
+---+-----+----+-----+
| | A | B | C |
+---+-----+----+-----+
| 0 | 100 | 20 | AC1 |
| 1 | 100 | 21 | AC1 |
| 2 | 100 | 22 | AC2 |
| 3 | 100 | 23 | AC3 |
| 4 | 100 | 24 | AC3 |
| 5 | 100 | 25 | AC4 |
+---+-----+----+-----+
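For reference, here is a self-contained sketch of the approach above. It assumes df1['B'] already holds whole numbers (20, 21, ...), as in the output table, because the merge matches on exact equality; the float values from the question (20.1, 21.3, ...) would not match any row of df2. It also uses .ffill(), which is equivalent to fillna(method='ffill'); recent pandas releases warn that the method argument is deprecated.

```python
import pandas as pd

# Example frames from the question. Assumption: B in df1 is already
# truncated to whole numbers, as in the output table above.
df1 = pd.DataFrame({"A": [100] * 6, "B": [20, 21, 22, 23, 24, 25]})
df2 = pd.DataFrame({
    "A": [100, 100, 100, 100, 100, 200, 200, 200],
    "B": [20, 22, 23, 25, 29, 20, 34, 37],
    "D": ["AC1", "AC2", "AC3", "AC4", "AC5", "AC1", "AC2", "AC3"],
})

# Left merge keeps every row of df1; rows without an exact (A, B) match
# in df2 get a missing value in D.
merged = pd.merge(df1, df2, how="left", on=["A", "B"])

# Forward-fill the gaps in D and store the result as column C.
df1["C"] = merged["D"].ffill()
print(df1)
```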
EDIT: explanation
First, we merge df1 and df2 on the A and B columns:
pd.merge(df1, df2, how='left', on = ['A', 'B'])
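#output : rows of df1 without a match in df2 have a missing value in D
+---+-----+----+------+-----+
| | A | B | C | D |
+---+-----+----+------+-----+
| 0 | 100 | 20 | None | AC1 |
| 1 | 100 | 21 | None | NaN |
| 2 | 100 | 22 | None | AC2 |
| 3 | 100 | 23 | None | AC3 |
| 4 | 100 | 24 | None | NaN |
| 5 | 100 | 25 | None | AC4 |
+---+-----+----+------+-----+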
To fill the missing values, we use the ffill method (see the docs):
method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None
Method to use for filling holes in reindexed Series
pad / ffill: propagate last valid observation forward to next valid
backfill / bfill: use NEXT valid observation to fill gap
pd.merge(df1, df2, how='left', on = ['A', 'B']).fillna(method='ffill')
#output : missing values are filled as expected
+---+-----+----+------+-----+
| | A | B | C | D |
+---+-----+----+------+-----+
| 0 | 100 | 20 | None | AC1 |
| 1 | 100 | 21 | None | AC1 |
| 2 | 100 | 22 | None | AC2 |
| 3 | 100 | 23 | None | AC3 |
| 4 | 100 | 24 | None | AC3 |
| 5 | 100 | 25 | None | AC4 |
+---+-----+----+------+-----+
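To see what the forward fill does in isolation, here is a small sketch on a standalone Series (the values mirror the D column before filling; the variable names are only for illustration). It also shows bfill for contrast, as described in the docs excerpt above.

```python
import pandas as pd

# D column as it looks right after the left merge: gaps where df1 had
# no exact match in df2.
s = pd.Series(["AC1", None, "AC2", "AC3", None, "AC4"])

# ffill: each gap takes the last valid value that came before it.
forward = s.ffill()

# bfill: each gap takes the next valid value that comes after it.
backward = s.bfill()

print(forward.tolist())
print(backward.tolist())
```

Note that a gap in the very first row has nothing before it, so ffill would leave it missing.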
df1['C'] is then just the D column of the merged and filled dataframe, which is what we wanted:
df1['C'] = pd.merge(df1, df2, how='left', on = ['A', 'B']).fillna(method='ffill')['D']
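If B in df1 really holds floats, as in the question, one alternative is pd.merge_asof. This is not part of the answer above, only a sketch under the assumption that "the last df2 row with the same A and B less than or equal to df1's B" is the intended rule. For each row of df1 it picks that row directly, and by="A" keeps values from leaking between different A groups, which a plain forward fill does not guarantee.

```python
import pandas as pd

# Frames as given in the question, with float B in df1.
df1 = pd.DataFrame({"A": [100] * 6, "B": [20.1, 21.3, 22.0, 23.6, 24.0, 25.8]})
df2 = pd.DataFrame({
    "A": [100, 100, 100, 100, 100, 200, 200, 200],
    "B": [20, 22, 23, 25, 29, 20, 34, 37],
    "D": ["AC1", "AC2", "AC3", "AC4", "AC5", "AC1", "AC2", "AC3"],
})

# merge_asof needs both frames sorted by the "on" key, and the key
# dtypes must match, so B in df2 is cast to float first.
# reset_index() keeps df1's original row labels in an "index" column.
left = df1.reset_index().sort_values("B")
right = df2.astype({"B": "float64"}).sort_values("B")

# direction="backward": take the last df2 row with B <= df1's B,
# restricted to rows with the same A.
result = pd.merge_asof(left, right, on="B", by="A", direction="backward")

# Align back to df1 through the preserved row labels.
df1["C"] = result.set_index("index")["D"]
print(df1)
```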