
Python - Find first and last index of consecutive NaN groups in series

I am trying to get a list of tuples with the first and last index of grouped NaNs. An example input could look like

import pandas as pd
import numpy as np
series = pd.Series([1,2,3,np.nan,np.nan,4,5,6,7,np.nan,8,9,np.nan,np.nan,np.nan])
get_nan_inds(series)

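and the output should be the following, where each tuple holds the first index of a NaN group and the index just past its last NaN: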

[(3, 5), (9, 10), (12, 15)]
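
One way to read these tuples is as positional slice bounds. The short check below is only a sketch, assuming the series has the default integer index; the name `expected` is introduced here for illustration and is not part of the question.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 3, np.nan, np.nan, 4, 5, 6, 7, np.nan, 8, 9, np.nan, np.nan, np.nan])

# Expected output from the question, treated as (start, stop) with stop exclusive.
expected = [(3, 5), (9, 10), (12, 15)]

for start, stop in expected:
    group = series.iloc[start:stop]
    # Every value inside the slice should be NaN, and the slice length is stop - start.
    print((start, stop), len(group), bool(group.isna().all()))
```

Under this reading, the lengths of the slices add up to the total number of NaN values in the series.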

The only similar question I could find doesn't solve my problem.

Alternative solution:

import pandas as pd
import numpy as np

series = pd.Series([1,2,3,np.nan,np.nan,4,5,6,7,np.nan,8,9,np.nan,np.nan,np.nan])

def get_nan_inds(series):
    # Pad with False on both ends so that a NaN group at the very start or very end still yields a transition.
    is_null = np.array([False] + list(pd.isnull(series)) + [False])
    # Positions (in the original series) where the null flag flips.
    res = np.flatnonzero(is_null[1:] != is_null[:-1]).tolist()
    # Flips alternate start, end, start, end, ...; pair them up (end is exclusive).
    return list(zip(res[::2], res[1::2]))

get_nan_inds(series)

While writing this question, I came up with the following function, in case someone else has a similar problem.

def get_nan_inds(series):
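    '''Return (start, stop) positions of each consecutive NaN group; stop is exclusive.'''
    # Work with positions 0..n-1 regardless of the original index labels.
    series = series.reset_index(drop=True)
    # Positions of all NaN values.
    index = series[series.isna()].index.to_numpy()
    if len(index) == 0:
        return []
    # Split wherever neighbouring NaN positions are more than one apart.
    indices = np.split(index, np.where(np.diff(index) > 1)[0] + 1)
    # Cast to plain int so the tuples print as (3, 5) rather than as NumPy scalars.
    return [(int(ind[0]), int(ind[-1]) + 1) for ind in indices]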
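
For comparison, the same result can probably also be reached with the common pandas "label the runs with cumsum, then group" idiom. This is a minimal sketch rather than either poster's method; `nan_runs` is a hypothetical name, and it assumes positions (not index labels) are wanted, with the stop position exclusive as in the expected output.

```python
import numpy as np
import pandas as pd

def nan_runs(series):
    """Hypothetical helper: (start, stop) positions of each NaN run, stop exclusive."""
    # Boolean NaN mask on a fresh 0..n-1 index so results are positional.
    mask = series.isna().reset_index(drop=True)
    if not mask.any():
        return []
    # A new run starts wherever the mask changes value; cumsum turns changes into run ids.
    run_id = mask.astype(int).diff().ne(0).cumsum()
    runs = []
    # Keep only NaN positions and group them by the run they belong to.
    for _, run in mask[mask].groupby(run_id[mask]):
        runs.append((int(run.index[0]), int(run.index[-1]) + 1))
    return runs

series = pd.Series([1, 2, 3, np.nan, np.nan, 4, 5, 6, 7, np.nan, 8, 9, np.nan, np.nan, np.nan])
print(nan_runs(series))  # [(3, 5), (9, 10), (12, 15)]
```

It is also worth trying any of these functions on a series that starts with NaN, contains no NaN, or is empty, since those are the cases where the run boundaries are easiest to get wrong.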
