
Drop all rows after first occurrence of NaN in a specific column (pandas)

I am trying to use the dropna function in pandas. I would like to use it for a specific column.

So far I have only figured out how to use it to drop rows in which ALL values are NaN.

I have a dataframe (see below) from which I would like to drop all rows after the first occurrence of a NaN in a specific column, column "A".

My current code only drops a row if all of its values are NaN:

data.dropna(axis = 0, how = 'all')
data

Original Dataframe

    data = pd.DataFrame({"A": (1,2,3,4,5,6,7,"NaN","NaN","NaN"),"B": (1,2,3,4,5,6,7,"NaN","9","10"),"C": range(10)})
    data


    A   B   C
0   1   1   0
1   2   2   1
2   3   3   2
3   4   4   3
4   5   5   4
5   6   6   5
6   7   7   6
7   NaN NaN 7
8   NaN 9   8
9   NaN 10  9

What I would like the output to look like:

    A   B   C
0   1   1   0
1   2   2   1
2   3   3   2
3   4   4   3
4   5   5   4
5   6   6   5
6   7   7   6

Any help on this is appreciated. Ideally I would like to do it in the cleanest, most efficient way possible.

Thanks!

Use iloc + argmax:

data.iloc[:data.A.isnull().values.argmax()]

     A  B  C
0  1.0  1  0
1  2.0  2  1
2  3.0  3  2
3  4.0  4  3
4  5.0  5  4
5  6.0  6  5
6  7.0  7  6

Or, with a different syntax:

top_data = data[:data['A'].isnull().argmax()]
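
One caveat worth checking before using either form: both rely on isnull(), which only flags real missing values. The constructor in the question passes the string "NaN", which pandas treats as ordinary text. A minimal sketch of converting those strings first, assuming that coercing columns A and B to numbers with pd.to_numeric is acceptable for your data (the name clean is only illustrative):

```python
import pandas as pd

# Same frame as in the question, with "NaN" given as a string.
data = pd.DataFrame({
    "A": (1, 2, 3, 4, 5, 6, 7, "NaN", "NaN", "NaN"),
    "B": (1, 2, 3, 4, 5, 6, 7, "NaN", "9", "10"),
    "C": range(10),
})

# The string "NaN" is not a missing value, so isnull() flags nothing here.
strings_flagged = data["A"].isnull().sum()

# Coerce A and B to numeric; anything unparseable becomes a real NaN.
clean = data.copy()
for col in ("A", "B"):
    clean[col] = pd.to_numeric(clean[col], errors="coerce")

# Now the slice from the answer finds the first missing value in A.
top_data = clean.iloc[:clean["A"].isnull().values.argmax()]
print(top_data)
```

If the frame was built with np.nan or read from a file with proper missing values, this conversion step should not be needed.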

Re: the accepted answer. If the column in question has no NaNs, argmax returns 0, so df[:argmax] returns an empty dataframe.

Here's my workaround:

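mask = data.A.isnull()
# argmax is 0 both when row 0 is NaN and when there is no NaN at all,
# so check mask.any() instead of testing the result against 0
max_ = mask.values.argmax() if mask.any() else len(data)
top_data = data[:max_]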
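
Another way to sidestep the argmax edge cases is a boolean mask built with cummax(), which turns True at the first NaN and stays True afterwards. This is only a sketch: the helper name rows_before_first_nan and the small example frames are hypothetical, not from the question.

```python
import numpy as np
import pandas as pd

def rows_before_first_nan(df, column):
    """Return the rows of df that come before the first NaN in `column`."""
    # isna() marks missing values; cummax() keeps the mask True from the
    # first missing value onward, so ~mask selects only the leading rows.
    seen_nan = df[column].isna().cummax()
    return df[~seen_nan]

# NaN in the middle: rows 0..2 are kept, even though row 4 is not NaN.
data = pd.DataFrame({"A": [1, 2, 3, np.nan, 5, np.nan], "C": range(6)})
print(rows_before_first_nan(data, "A"))

# No NaN at all: every row is kept.
no_nan = pd.DataFrame({"A": [1.0, 2.0]})
print(rows_before_first_nan(no_nan, "A"))

# NaN in the first row: nothing is kept.
leading = pd.DataFrame({"A": [np.nan, 1.0]})
print(rows_before_first_nan(leading, "A"))
```

Because the mask is positional along the row order rather than a single index, the same expression covers the "no NaN" and "NaN in row 0" cases without a special branch.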
