Pandas: fill null values with the last available value and a flag

Question

I am looking for logic that updates the value column based on the flag column, producing the new_val column shown in the table. Notice the second N row (bold in the original post): we don't fill the two Y rows that follow it, because the last N row's value is null. If an N row has a value, we can forward-fill the Y rows that follow it.

I have tried using

df_latest.loc[(df_latest['flag'] == 'Y'), 'value'] = df_latest['value'].fillna(method='ffill')

This logic doesn't cover the scenario where an N row is null: it forward-fills the Y rows after the null N row with the last value from before it.

flag    value   new_val
Y       1       1
Y       2       2
Y       NaN     2
N       3       3
Y       NaN     3
Y       5       5
N       NaN     NaN
Y       NaN     NaN
Y       NaN     NaN
N       6       6
Y       NaN     6
Y       NaN     6
Y       NaN     6
Y       NaN     6
Y       NaN     6
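To make the problem reproducible, here is a minimal sketch that rebuilds the table above and applies a masked forward fill equivalent to the attempt described. The construction of df_latest and the name naive are assumptions for illustration, and .ffill() is used in place of fillna(method='ffill'), which recent pandas releases deprecate.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (15 rows)
df_latest = pd.DataFrame({
    "flag": list("YYYNYYNYYNYYYYY"),
    "value": [1, 2, np.nan, 3, np.nan, 5, np.nan, np.nan, np.nan, 6,
              np.nan, np.nan, np.nan, np.nan, np.nan],
})

is_y = df_latest["flag"].eq("Y")

# Naive approach: take the plain forward fill for every Y row
naive = df_latest["value"].mask(is_y, df_latest["value"].ffill())

# Rows 7 and 8 follow the null N row (row 6) but still receive 5.0
print(pd.DataFrame({"flag": df_latest["flag"], "naive": naive}).iloc[5:9])
```

The plain forward fill has no notion of the null N row acting as a barrier, so the value 5 from row 5 leaks into rows 7 and 8.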

Answer 1

We can use GroupBy.ffill to fill within groups. A new group starts at every row where flag == N and value is null, so the fill never carries a value across such a row, and the rows after it stay null until the next non-null value appears. To fill only the rows where flag is Y, you can use the commented-out code.
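As a rough illustration of how the grouping key behaves, the sketch below runs the same idea on a small made-up frame. The frame demo and the names is_barrier, group_id and filled are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame, not the asker's data
demo = pd.DataFrame({
    "flag": ["Y", "N", "Y", "N", "Y"],
    "value": [1.0, np.nan, np.nan, 4.0, np.nan],
})

# True only where an N row has no value; such a row must block the fill
is_barrier = demo["flag"].eq("N") & demo["value"].isna()

# The cumulative sum makes each barrier row the first row of a new group
group_id = is_barrier.cumsum()

# Forward fill restarts in every group, so nothing crosses a barrier
filled = demo["value"].groupby(group_id).ffill()

print(pd.DataFrame({"barrier": is_barrier, "group": group_id, "filled": filled}))
```

Because a barrier row is always the first row of its group, there is nothing earlier in the group to copy from, so it and the null rows after it stay null until a real value shows up.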

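# Start a new block at every row where flag is N and value is null
blocks = (df['flag'].eq('N') & df['value'].isnull()).cumsum()
# Forward-fill within each block, so a fill never crosses such a row
df['new_val'] = df['value'].groupby(blocks).ffill()

# If you want to fill only where flag is Y:
# df['new_val'] = df['value'].fillna(
#     df['value'].groupby(blocks).ffill().where(df['flag'].eq('Y'))
# )

print(df)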

Output

   flag  value  new_val
0     Y    1.0      1.0
1     Y    2.0      2.0
2     Y    NaN      2.0
3     N    3.0      3.0
4     Y    NaN      3.0
5     Y    5.0      5.0
6     N    NaN      NaN
7     Y    NaN      NaN
8     Y    NaN      NaN
9     N    6.0      6.0
10    Y    NaN      6.0
11    Y    NaN      6.0
12    Y    NaN      6.0
13    Y    NaN      6.0
14    Y    NaN      6.0
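For anyone who wants to check this end to end, here is a self-contained sketch that rebuilds the question's data, runs both variants from the answer, and compares them with the expected new_val column. The names plain, only_y and expected are assumptions introduced for this check.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "flag": list("YYYNYYNYYNYYYYY"),
    "value": [1, 2, nan, 3, nan, 5, nan, nan, nan, 6, nan, nan, nan, nan, nan],
})

# new_val column as given in the question
expected = pd.Series([1, 2, 2, 3, 3, 5, nan, nan, nan, 6, 6, 6, 6, 6, 6],
                     dtype="float64")

# Grouping key: a null N row opens a new block
blocks = (df["flag"].eq("N") & df["value"].isna()).cumsum()

# Variant 1: forward fill within each block
plain = df["value"].groupby(blocks).ffill()

# Variant 2: the commented-out code, filling only where flag is Y
only_y = df["value"].fillna(plain.where(df["flag"].eq("Y")))

print(plain.equals(only_y))
print(np.array_equal(plain.to_numpy(), expected.to_numpy(), equal_nan=True))
```

On this data the two variants agree. That follows from the way the key is built: a null N row is always the first row of its block, so the grouped forward fill can never fill it, and every row it does fill is a Y row.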

Solution 1 (accepted 2020-08-02 16:07:25)
