
Pandas and NumPy consecutive non-NaN values

I'm trying to use np.where to flag runs of consecutive non-NaN values that reach a certain length, as shown below:

For example, if a run contains at least 3 consecutive non-NaN values, return True for every row in that run.

Would appreciate any help!

value  consecutive
nan    False
nan    False
1      False
1      False
nan    False
4      True
2      True
3      True
nan    False
nan    False
1      True
3      True
3      True
5      True

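The idea is to create groups by testing for missing values: the cumulative sum of the NaN mask m only increases on NaN rows, so every row of an uninterrupted non-NaN run shares one group id. Series.value_counts, applied to the rows selected by the inverted mask ~m, gives the length of each run, and Series.map sends those lengths back to the rows. Comparing with N and combining with ~m keeps only non-NaN rows in runs of at least N values: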

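# convert values to numeric so that missing entries are real NaN
df['value'] = df['value'].astype(float)

# m is True where the value is missing
m = df['value'].isna()
# s increases at every NaN, so each non-NaN run shares one group id
s = m.cumsum()

N = 3
# count non-NaN rows per group, map the counts back, keep runs of at least N
df['new'] = s.map(s[~m].value_counts()).ge(N) & ~m
print(df)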
    value  consecutive    new
0     NaN        False  False
1     NaN        False  False
2     1.0        False  False
3     1.0        False  False
4     NaN        False  False
5     4.0         True   True
6     2.0         True   True
7     3.0         True   True
8     NaN        False  False
9     NaN        False  False
10    1.0         True   True
11    3.0         True   True
12    3.0         True   True
13    5.0         True   True
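
For anyone who wants to run this end to end, the snippet below is a minimal self-contained sketch of the same idea. The function name flag_long_runs and its parameter n are made up for illustration and are not part of pandas; the sample data is the value column from the question.

```python
import numpy as np
import pandas as pd


def flag_long_runs(values: pd.Series, n: int) -> pd.Series:
    """Return True for rows that belong to a run of at least n non-NaN values.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    is_nan = values.isna()
    # The cumulative sum only increases on NaN rows, so every row of one
    # uninterrupted non-NaN run carries the same group id.
    group_id = is_nan.cumsum()
    # Length of each run, counted over the non-NaN rows only.
    run_length = group_id[~is_nan].value_counts()
    # Send the lengths back to the rows; NaN rows are forced to False.
    return group_id.map(run_length).ge(n) & ~is_nan


nan = np.nan
df = pd.DataFrame({"value": [nan, nan, 1, 1, nan, 4, 2, 3, nan, nan, 1, 3, 3, 5]})
df["new"] = flag_long_runs(df["value"], n=3)
print(df)
```

Group ids that contain no non-NaN rows map to NaN, and comparing NaN with ge yields False, so leading or repeated NaNs need no special handling.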
