Dataframe forward-fill until the column-specific last valid index

Question

How do I go from:

[In]:   df = pd.DataFrame({
            'col1': [100, np.nan, np.nan, 100, np.nan, np.nan],
            'col2': [np.nan, 100, np.nan, np.nan, 100, np.nan]
        })
        df

[Out]:        col1    col2
        0      100     NaN
        1      NaN     100
        2      NaN     NaN
        3      100     NaN
        4      NaN     100
        5      NaN     NaN

To:

[Out]:        col1    col2
        0      100     NaN
        1      100     100
        2      100     100
        3      100     100
        4      NaN     100
        5      NaN     NaN

My current approach is to apply a custom function that works on one column at a time:

[In]:   def ffill_last_valid(s):
            last_valid = s.last_valid_index()
            s = s.ffill()
            s[s.index > last_valid] = np.nan
            return s

        df.apply(ffill_last_valid)

But this seems like overkill to me. Is there a one-liner that works on the dataframe directly?

Note on accepted answer:

See the accepted answer from mozway below.

I know it's a tiny dataframe but:

Answer 1

You can ffill, then keep only the values before each column's trailing stretch of NaN, using where combined with notna and a reversed cummax:

out = df.ffill().where(df[::-1].notna().cummax())

Variant:

out = df.ffill().mask(df[::-1].isna().cummin())

Output:

    col1   col2
0  100.0    NaN
1  100.0  100.0
2  100.0  100.0
3  100.0  100.0
4    NaN  100.0
5    NaN    NaN
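
To make the mask easier to follow, here is a minimal, self-contained sketch that rebuilds the question's dataframe and prints the intermediate boolean frame before applying it. The name keep is only an illustrative choice and does not appear in the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, 100, np.nan],
})

# Walk each column from the bottom up: cummax() turns True at the last
# valid value and stays True for every row above it.
keep = df[::-1].notna().cummax()

# where() aligns on the index, so the reversed row order of `keep`
# does not matter here.
out = df.ffill().where(keep)

print(keep.sort_index())
print(out)
```

Rows where keep is False lie after the column's last valid value, so where() resets them to NaN after the forward fill.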

`interpolate`:

In theory, df.interpolate(method='ffill', limit_area='inside') should work, but while both options work as expected separately, for some reason they don't when combined (pandas 1.5.2). df.interpolate(method='zero', limit_area='inside') does work, though.
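
As a possible alternative, and as far as I know, newer pandas releases (2.2 and later) accept limit_area directly in ffill. The sketch below assumes that and falls back to the where/cummax mask on older versions, where the keyword raises a TypeError. The name inside is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [100, np.nan, np.nan, 100, np.nan, np.nan],
    'col2': [np.nan, 100, np.nan, np.nan, 100, np.nan],
})

try:
    # Assumption: pandas >= 2.2, where ffill() takes limit_area.
    # 'inside' fills only NaNs that are surrounded by valid values.
    inside = df.ffill(limit_area='inside')
except TypeError:
    # Older pandas rejects the keyword: use the where/cummax mask instead.
    inside = df.ffill().where(df[::-1].notna().cummax())

print(inside)
```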

Solution 1 | 3 | accepted | 2023-01-16 14:39:23