
Set df's column value based on several conditions from another df

I want to set the values in one dataframe based on values from another dataframe.

Example:

df1

A   |  B  |  C  |               
100   20.1        
100   21.3
100   22.0
100   23.6
100   24.0
100   25.8

df2

A   |  B  |  D

100   20     AC1
100   22     AC2 
100   23     AC3
100   25     AC4
100   29     AC5
200   20     AC1
200   34     AC2
200   37     AC3

I want df1['C'] to have something like:

AC1
AC1
AC2
AC3
AC3
AC4

I.e. df1['C'] = df2['D'].where((df2['A'] == df1['A']) & (df2['B'] < df1['B']))

You could use pd.merge and then ffill to fill the missing values:

# left-merge on A and B, then forward-fill the rows that found no match
df1['C'] = pd.merge(df1, df2, how='left', on=['A', 'B']).ffill()['D']

Output

+---+-----+----+-----+
|   |  A  | B  |  C  |
+---+-----+----+-----+
| 0 | 100 | 20 | AC1 |
| 1 | 100 | 21 | AC1 |
| 2 | 100 | 22 | AC2 |
| 3 | 100 | 23 | AC3 |
| 4 | 100 | 24 | AC3 |
| 5 | 100 | 25 | AC4 |
+---+-----+----+-----+
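
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes that B in df1 holds the integers shown in the output above, and it builds df1 without the empty C column; the variable name merged is only for illustration.

```python
import pandas as pd

# df1 as in the output table above (integer B values are an assumption)
df1 = pd.DataFrame({"A": [100] * 6, "B": [20, 21, 22, 23, 24, 25]})

# df2 as given in the question
df2 = pd.DataFrame({
    "A": [100] * 5 + [200] * 3,
    "B": [20, 22, 23, 25, 29, 20, 34, 37],
    "D": ["AC1", "AC2", "AC3", "AC4", "AC5", "AC1", "AC2", "AC3"],
})

# Left merge keeps every row of df1; D is NaN where (A, B) has no match in df2
merged = pd.merge(df1, df2, how="left", on=["A", "B"])

# Forward-fill the gaps so each row takes the last matched D value
df1["C"] = merged["D"].ffill()

print(df1)
```

Because (A, B) is unique in df2, the merged frame has the same number of rows and the same order as df1, so the assignment lines up row by row.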

EDIT: explanation

First, we merge df1 and df2 on columns A and B:

pd.merge(df1, df2, how='left', on = ['A', 'B'])
#output : D is missing where df2 has no matching (A, B) row
+---+-----+----+------+-----+
|   |  A  | B  |  C   |  D  |
+---+-----+----+------+-----+
| 0 | 100 | 20 | None | AC1 |
| 1 | 100 | 21 | None | NaN |
| 2 | 100 | 22 | None | AC2 |
| 3 | 100 | 23 | None | AC3 |
| 4 | 100 | 24 | None | NaN |
| 5 | 100 | 25 | None | AC4 |
+---+-----+----+------+-----+

To fill the missing values, we forward-fill them. The fillna docs describe the ffill option as follows:

method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None
    Method to use for filling holes in reindexed Series
    pad / ffill: propagate last valid observation forward to next valid
    backfill / bfill: use NEXT valid observation to fill gap

pd.merge(df1, df2, how='left', on=['A', 'B']).ffill()
#output : missing values are filled as expected 
+---+-----+----+------+-----+
|   |  A  | B  |  C   |  D  |
+---+-----+----+------+-----+
| 0 | 100 | 20 | None | AC1 |
| 1 | 100 | 21 | None | AC1 |
| 2 | 100 | 22 | None | AC2 |
| 3 | 100 | 23 | None | AC3 |
| 4 | 100 | 24 | None | AC3 |
| 5 | 100 | 25 | None | AC4 |
+---+-----+----+------+-----+
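
One caveat worth keeping in mind: a plain forward fill does not know about the A groups, so if df1 contains more than one value of A, a D value can be carried over from the previous group. A possible way around this, sketched here with a made-up df1 that is not part of the question, is to forward-fill within each A group:

```python
import pandas as pd

# Hypothetical df1 with two A groups, chosen to show the effect
df1 = pd.DataFrame({"A": [100, 100, 200, 200], "B": [25, 26, 19, 20]})

# df2 as given in the question
df2 = pd.DataFrame({
    "A": [100] * 5 + [200] * 3,
    "B": [20, 22, 23, 25, 29, 20, 34, 37],
    "D": ["AC1", "AC2", "AC3", "AC4", "AC5", "AC1", "AC2", "AC3"],
})

merged = pd.merge(df1, df2, how="left", on=["A", "B"])

# Plain ffill: the (200, 19) row inherits AC4 from the A == 100 group
plain = merged["D"].ffill()

# Group-wise ffill: values are only propagated within the same A
grouped = merged.groupby("A")["D"].ffill()

print(pd.DataFrame({"A": merged["A"], "B": merged["B"],
                    "plain": plain, "grouped": grouped}))
```

With the group-wise version, the (200, 19) row stays NaN because no earlier row in the A == 200 group has a match.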

df1['C'] is then just the D column of the merged and filled dataframe, which is what we wanted:

df1['C'] = pd.merge(df1, df2, how='left', on=['A', 'B']).ffill()['D']
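
Note that merging on B needs exact matches, which fits the integer B values in the output tables. With the float values from the question (20.1, 21.3, ...), nothing except 22.0, 24.0-style whole numbers could match exactly, so an alternative worth considering is pd.merge_asof, which for each row picks the last df2 row whose B is less than or equal to the df1 value within the same A. This is a sketch under a few assumptions: df1 is already sorted by B, df2's B is cast to float so both keys share a dtype, and the names right and result are illustrative.

```python
import pandas as pd

# df1 with the float B values from the question (already sorted by B)
df1 = pd.DataFrame({"A": [100] * 6,
                    "B": [20.1, 21.3, 22.0, 23.6, 24.0, 25.8]})

# df2 as given in the question
df2 = pd.DataFrame({
    "A": [100] * 5 + [200] * 3,
    "B": [20, 22, 23, 25, 29, 20, 34, 37],
    "D": ["AC1", "AC2", "AC3", "AC4", "AC5", "AC1", "AC2", "AC3"],
})

# merge_asof needs matching key dtypes and both frames sorted by the "on" key
right = df2.astype({"B": "float64"}).sort_values("B")

# For each df1 row, take the nearest df2 row with B <= df1's B and the same A
result = pd.merge_asof(df1, right, on="B", by="A", direction="backward")

# Row order follows df1, so assign positionally
df1["C"] = result["D"].to_numpy()

print(df1)
```

This reproduces the column requested in the question (AC1, AC1, AC2, AC3, AC3, AC4) without a separate fill step, and the by="A" argument keeps matches from crossing A groups.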
