
Counter for consecutive negative values in a DataFrame

I need to implement a counter that counts as shown in the output below. For each row, it checks the past values of the "data" column for consecutive negative values.

    data    output
0   -1      NaN       // no past values for data: count=NaN
1   -2       1        // -1: count=1
2    4       2        // -2,-1: count=2
3    12      0        // count=0
4   -22      0        // count=0
5   -12      1        // -22: count=1
6   -7       2        // -12,-22: count=2
7   -5       3        // -7,-12,-22: count=3
8   -33      4        // -5,-7,-12,-22: count=4
9    2       5        // -33,-5,-7,-12,-22: count=5
10   2       0        // count=0

My code:

import pandas as pd
import talib
import numpy as np     

df=pd.DataFrame()
df["data"]=[-1,-2,4,12,-22,-12,-7,-5,-33,2,2]
print(df)


c=0
for y in [0,len(ff)-1] : 
    for z in [1,10]:
        if (ff["data"].shift(-z)).any()<=0:c=c+1
        else:c
        if (ff["data"].shift(-z)).any()>0:break
    count["dd"]=c

Output needed:

[image of the expected output, not preserved]

I am not sure how to write the "NaN" for the first row, but here is some code that seems to do what you asked for:

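import pandas as pd

df = pd.DataFrame()
df["data"] = [-1, -2, 4, 12, -22, -12, -7, -5, -22, 2, 2]

def generateOutput(df):
    # a[i] holds the number of consecutive negative values just before row i;
    # the first row has no past values, so it starts at 0 here rather than NaN.
    a = [0]
    for i in range(len(df) - 1):
        if df["data"][i] < 0:
            a.append(a[-1] + 1)  # extend the current run of negatives
        else:
            a.append(0)  # a non-negative value resets the count
    df["output"] = a
    return df

print(df)
df = generateOutput(df)
print(df)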

And here is the output when I run the program:

    data
0     -1
1     -2
2      4
3     12
4    -22
5    -12
6     -7
7     -5
8    -22
9      2
10     2
    data  output
0     -1       0
1     -2       1
2      4       2
3     12       0
4    -22       0
5    -12       1
6     -7       2
7     -5       3
8    -22       4
9      2       5
10     2       0
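
If the first row should be NaN as in the question, one possible variation is sketched below. It assumes a float column is acceptable (NaN forces that in pandas), and the helper name `count_previous_negatives` is hypothetical, not part of the answer above.

```python
import math
import pandas as pd

def count_previous_negatives(values):
    """Hypothetical helper: for each position, count the run of
    consecutive negative values that ends just before it."""
    if not values:
        return []
    counts = [math.nan]  # the first row has no past values
    run = 0
    for v in values[:-1]:
        # a negative value extends the run, anything else resets it
        run = run + 1 if v < 0 else 0
        counts.append(run)
    return counts

df = pd.DataFrame({"data": [-1, -2, 4, 12, -22, -12, -7, -5, -22, 2, 2]})
df["output"] = count_previous_negatives(df["data"].tolist())
print(df)
```

Keeping the run length in a separate variable means the NaN in the first slot never takes part in any arithmetic.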

One-liner:

df.data.lt(0).groupby(df.data.lt(0).diff().ne(0).cumsum()).cumsum().shift()

Expanded version:

import pandas as pd

df = pd.DataFrame()
df["data"] = [-1, -2, 4, 12, -22, -12, -7, -5, -33, 2, 2]

subzero = df.data < 0  # == df.data.lt(0)
# 0      True
# 1      True
# 2     False
# 3     False
# 4      True
# 5      True
# 6      True
# 7      True
# 8      True
# 9     False
# 10    False
# Name: data, dtype: bool

# We need the `cumsum` of subzero,
# but it should be calculated for each True group separately.
# The following array can be used to group consecutive boolean elements.
by = subzero.diff().cumsum()
# 0     NaN
# 1     0.0
# 2     1.0
# 3     1.0
# 4     2.0
# 5     2.0
# 6     2.0
# 7     2.0
# 8     2.0
# 9     3.0
# 10    3.0
# Name: data, dtype: object

# Decide about the group of the first element.
# (The `.ne(0)` in the one-liner does the same job)
by[0] = 0.0 if by[1] == 0.0 else -1.0

result = subzero.groupby(by).cumsum().shift(1)
# 0     NaN
# 1     1.0
# 2     2.0
# 3     0.0
# 4     0.0
# 5     1.0
# 6     2.0
# 7     3.0
# 8     4.0
# 9     5.0
# 10    0.0
# Name: data, dtype: float64
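
The same grouping idea can be wrapped in a small function, which makes it easier to check against the expected output from the question. This is only a sketch: the name `previous_negative_run` is hypothetical, and it marks group boundaries with `ne(shift())` instead of `diff()`, a swap made here so the result does not depend on how `diff` treats boolean data.

```python
import pandas as pd

def previous_negative_run(data: pd.Series) -> pd.Series:
    """Hypothetical helper: count of consecutive negative values
    immediately preceding each row (NaN for the first row)."""
    subzero = data.lt(0)
    # A new group starts whenever the negative/non-negative flag changes.
    groups = subzero.ne(subzero.shift()).cumsum()
    # Running count of negatives inside each group, moved down one row
    # so that each row only sees past values.
    return subzero.astype(int).groupby(groups).cumsum().shift()

df = pd.DataFrame({"data": [-1, -2, 4, 12, -22, -12, -7, -5, -33, 2, 2]})
df["output"] = previous_negative_run(df["data"])
print(df)
```

Because the count is shifted by one row, the result is a float column with NaN in the first position, matching the layout requested in the question.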
