Remove pandas row that is based on previous row

I have the following dataframe, whose values are supposed to be increasing. The original dataframe has some unknown values.

index value
0 1
1
2
3 2
4
5
6
7 4
8
9
10 3
11 3
12
13
14
15 5

Based on the assumption that the values should be increasing, I want to remove the values at index 10 and 11. This would be the desired dataframe:

index value
0 1
1
2
3 2
4
5
6
7 4
8
9
12
13
14
15 5

Thanks a lot.

Assuming the empty cells contain NaN (if they do not, temporarily replace them with NaN), use boolean indexing:

# if the empty cells are not NaN, uncomment the line below
# and use s in place of df['value'] afterwards
# s = pd.to_numeric(df['value'], errors='coerce')

# is the cell empty?
m1 = df['value'].isna()

# is the value at least as large as the running maximum?
m2 = df['value'].ge(df['value'].cummax())

# keep the empty cells and the values that do not fall below an earlier one
out = df[m1|m2]

Output:

    index  value
0       0    1.0
1       1    NaN
2       2    NaN
3       3    2.0
4       4    NaN
5       5    NaN
6       6    NaN
7       7    4.0
8       8    NaN
9       9    NaN
12     12    NaN
13     13    NaN
14     14    NaN
15     15    5.0
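
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the approach above. The way the example dataframe is rebuilt (the `known` dictionary and the column names `index` and `value`) is an assumption based on the table in the question, not something given by the asker.

```python
import numpy as np
import pandas as pd

# Rebuild the example from the question; empty cells become NaN.
# `known` is a hypothetical helper mapping: row position -> known value.
known = {0: 1, 3: 2, 7: 4, 10: 3, 11: 3, 15: 5}
df = pd.DataFrame({
    "index": range(16),
    "value": [known.get(i, np.nan) for i in range(16)],
})

# Unknown values are always kept.
m1 = df["value"].isna()

# cummax() skips NaN by default, so each known value is compared
# with the largest known value seen so far (itself included).
m2 = df["value"].ge(df["value"].cummax())

out = df[m1 | m2]
print(out.index.tolist())
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15]
```

Rows 10 and 11 are dropped because 3 is below the running maximum of 4, while every NaN row survives through `m1`.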

def del_df(df):
    # Only rows with a known value take part in the comparison.
    # The original index labels are kept so the rows can be dropped later.
    known = df['value'].dropna()

    running_max = known.iloc[0]   # first value that is not NaN
    labels_to_drop = []           # index labels of the rows to delete

    for label, value in known.iloc[1:].items():
        if value > running_max:   # increasing: keep it and update the maximum
            running_max = value
        else:                     # not increasing (same or decreasing)
            labels_to_drop.append(label)

    return df.drop(labels_to_drop)

Output:

    index  value
0       0    1.0
1       1    NaN
2       2    NaN
3       3    2.0
4       4    NaN
5       5    NaN
6       6    NaN
7       7    4.0
8       8    NaN
9       9    NaN
12     12    NaN
13     13    NaN
14     14    NaN
15     15    5.0

I hope this answers your question.
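
One difference between the two answers is worth noting: the `ge`/`cummax` mask keeps a value that merely equals the running maximum, while the loop in `del_df` drops it. If the values must be strictly increasing, a vectorized variant could look like the sketch below. The helper name `keep_strictly_increasing` and the small test frame are hypothetical and only serve as illustration.

```python
import numpy as np
import pandas as pd

def keep_strictly_increasing(df, col="value"):
    """Hypothetical helper: keep NaN rows and rows whose value
    exceeds every earlier known value in `col`."""
    s = pd.to_numeric(df[col], errors="coerce")
    # Largest known value seen *before* each row; -inf where there is none yet.
    prev_max = s.cummax().ffill().shift().fillna(-np.inf)
    return df[s.isna() | s.gt(prev_max)]

# Small made-up frame with a repeated value (2, 2) and a drop (1).
df = pd.DataFrame({"value": [1, np.nan, 2, 2, np.nan, 1, 3]})

strict = keep_strictly_increasing(df)
loose = df[df["value"].isna() | df["value"].ge(df["value"].cummax())]

print(strict.index.tolist())  # [0, 1, 2, 4, 6]
print(loose.index.tolist())   # [0, 1, 2, 3, 4, 6]
```

The only row that differs is index 3, the repeated 2. Which behaviour is right depends on whether equal consecutive readings count as valid in your data.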
