Keeping rows after the first non-NaN in pandas

Question

I have a pandas DataFrame with two columns and a date index. I would like to keep only the rows from the first row in which both columns are non-NaN onward. For example, initially I have:

                    A      B        
        Index
        1/1/1950    NaN    5
        2/1/1950    7      NaN
        3/1/1950    9      NaN
        4/1/1950    NaN    6
        5/1/1950    4      15
        6/1/1950    2      21
        7/1/1950    NaN    5
        7/1/1950    12     5
        7/1/1950    5      NaN

and I would like to get:

                    A      B        
        Index
        5/1/1950    4      15
        6/1/1950    2      21
        7/1/1950    NaN    5
        7/1/1950    12     5
        7/1/1950    5      NaN

since 5/1/1950 is the first date on which both A and B are non-NaN, and I would like to keep all data from that row onward. Thank you for the help.

Answer 1

You can call notnull on the df and test whether all values in a row are True using all(axis=1). You can then call idxmax to get the index label of the first True row and slice the df from that label using loc. (The original answer used argmax, which returned the label in older pandas versions; in current versions argmax returns an integer position, so idxmax is the label-based call.)

In [37]:
df.loc[df.notnull().all(axis=1).idxmax():]

Out[37]:
             A     B
Index               
5/1/1950   4.0  15.0
6/1/1950   2.0  21.0
7/1/1950   NaN   5.0
7/1/1950  12.0   5.0
7/1/1950   5.0   NaN
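
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the example frame from the question with the dates kept as plain strings (an assumption, since the question does not say how the index is stored); the names `dates`, `complete`, `first_label` and `trimmed` are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (dates kept as plain strings).
dates = ["1/1/1950", "2/1/1950", "3/1/1950", "4/1/1950", "5/1/1950",
         "6/1/1950", "7/1/1950", "7/1/1950", "7/1/1950"]
df = pd.DataFrame(
    {"A": [np.nan, 7, 9, np.nan, 4, 2, np.nan, 12, 5],
     "B": [5, np.nan, np.nan, 6, 15, 21, 5, 5, np.nan]},
    index=pd.Index(dates, name="Index"),
)

# True for rows where every column holds a value.
complete = df.notnull().all(axis=1)

# Label of the first complete row, then a label-based slice from it.
first_label = complete.idxmax()
trimmed = df.loc[first_label:]
print(trimmed)
```

Note that if no row is complete, idxmax simply returns the first label, so the whole frame would be kept.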

Here is a breakdown:

In [38]:
df.notnull()

Out[38]:
              A      B
Index                 
1/1/1950  False   True
2/1/1950   True  False
3/1/1950   True  False
4/1/1950  False   True
5/1/1950   True   True
6/1/1950   True   True
7/1/1950  False   True
7/1/1950   True   True
7/1/1950   True  False

In [39]:
df.notnull().all(axis=1)

Out[39]:
Index
1/1/1950    False
2/1/1950    False
3/1/1950    False
4/1/1950    False
5/1/1950     True
6/1/1950     True
7/1/1950    False
7/1/1950     True
7/1/1950    False
dtype: bool

In [40]:
df.notnull().all(axis=1).idxmax()  # idxmax returns the label; in current pandas argmax returns an integer position

Out[40]:
'5/1/1950'
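
If you would rather not depend on index labels at all, a positional variant is possible. This is a sketch rather than part of the original answer: it takes the position of the first complete row from the underlying NumPy array and slices with iloc. The frame is rebuilt from the question, and the names `pos` and `trimmed` are illustrative.

```python
import numpy as np
import pandas as pd

dates = ["1/1/1950", "2/1/1950", "3/1/1950", "4/1/1950", "5/1/1950",
         "6/1/1950", "7/1/1950", "7/1/1950", "7/1/1950"]
df = pd.DataFrame(
    {"A": [np.nan, 7, 9, np.nan, 4, 2, np.nan, 12, 5],
     "B": [5, np.nan, np.nan, 6, 15, 21, 5, 5, np.nan]},
    index=pd.Index(dates, name="Index"),
)

complete = df.notnull().all(axis=1)

# Integer position of the first True value (0 if there is no True at all).
pos = int(complete.to_numpy().argmax())

# Guard the "no complete row" case explicitly, returning an empty frame.
trimmed = df.iloc[pos:] if complete.any() else df.iloc[0:0]
print(trimmed)
```

Because iloc works on positions, duplicate or unsorted index labels do not affect the result.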

EDIT

As pointed out by @DSM, it is more robust to use df.loc[df.notnull().all(axis=1).cummax()], as this handles duplicate index values.
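
A short sketch of why the cummax version works: the cumulative maximum of a boolean Series stays False until the first True and is True from then on, so it can be used directly as a row mask. The frame is rebuilt from the question; `keep`, `trimmed`, `no_complete` and `empty` are illustrative names.

```python
import numpy as np
import pandas as pd

dates = ["1/1/1950", "2/1/1950", "3/1/1950", "4/1/1950", "5/1/1950",
         "6/1/1950", "7/1/1950", "7/1/1950", "7/1/1950"]
df = pd.DataFrame(
    {"A": [np.nan, 7, 9, np.nan, 4, 2, np.nan, 12, 5],
     "B": [5, np.nan, np.nan, 6, 15, 21, 5, 5, np.nan]},
    index=pd.Index(dates, name="Index"),
)

# False up to the first complete row, True from that row onward.
keep = df.notnull().all(axis=1).cummax()
trimmed = df.loc[keep]
print(trimmed)

# Edge case: the first four rows contain no complete row,
# so the mask is all False and the result is empty.
no_complete = df.iloc[:4]
empty = no_complete.loc[no_complete.notnull().all(axis=1).cummax()]
print(len(empty))
```

Unlike the label-based slice, this mask yields an empty frame when no row is complete, which is usually the less surprising outcome.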

Solution 1
4, accepted, 2016-04-11 13:04:57