
python dataframe counter on a column

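Column x in my dataframe contains only 0s and 1s. I want to create a variable y that starts counting zeros and resets when a 1 appears in x. I am getting the error "The truth value of a Series is ambiguous."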

count=1                   
countList=[0]

for x in df['x']:
    if df['x'] == 0:
        count = count + 1
        df['y'] = count
    else:
        df['y'] = 1
        count = 1

First, avoid looping in pandas when a vectorized solution exists, because loops are slow.
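For reference, the error in the question comes from `df['x'] == 0`: it compares the whole column and returns a boolean Series, while `if` needs a single True or False. Below is a minimal sketch of a corrected loop. It assumes y should be 0 on rows where x is 1 and the length of the current run of zeros otherwise; the names `counts`, `run` and `error_message` are illustrative and not from the question.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1]})

# Comparing the whole column gives a boolean Series,
# which `if` cannot reduce to a single value.
try:
    if df['x'] == 0:
        pass
except ValueError as err:
    error_message = str(err)

# Corrected loop: test each scalar value and collect the results in a list.
counts = []
run = 0
for value in df['x']:
    run = run + 1 if value == 0 else 0
    counts.append(run)

# Assign the whole column once, instead of overwriting it inside the loop.
df['y'] = counts
print(df)
```

A loop like this is mainly useful for understanding the error and for checking results on small frames.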

I think you need to count consecutive 0 values:

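import pandas as pd

df = pd.DataFrame({'x':[1,0,0,1,1,0,1,0,0,0,1,1,0,0,0,0,1]})

# True where x is 0
a = df['x'].eq(0)
# running total of zeros seen so far
b = a.cumsum()
# subtract the running total as of the last 1, so the count restarts after each 1
df['y'] = b - b.mask(a).ffill().fillna(0).astype(int)
print (df)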

    x  y
0   1  0
1   0  1
2   0  2
3   1  0
4   1  0
5   0  1
6   1  0
7   0  1
8   0  2
9   0  3
10  1  0
11  1  0
12  0  1
13  0  2
14  0  3
15  0  4
16  1  0
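If the counter is needed for more than one column, the same expression can be wrapped in a small helper. This is only a sketch: the function name `zero_run_counter` and the variable names inside it are hypothetical and do not come from the answer.

```python
import pandas as pd


def zero_run_counter(s: pd.Series) -> pd.Series:
    """For each row, return the length of the current run of zeros (0 where s is not 0)."""
    is_zero = s.eq(0)
    total = is_zero.cumsum()
    # running total at the most recent non-zero row; 0 if none has been seen yet
    baseline = total.mask(is_zero).ffill().fillna(0).astype(int)
    return total - baseline


df = pd.DataFrame({'x': [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1]})
df['y'] = zero_run_counter(df['x'])
print(df['y'].tolist())
```

The `fillna(0)` step matters when the column starts with zeros, because there is no earlier non-zero row to forward fill from.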

Details and explanation

# sample data with leading zeros, matching the output below
df = pd.DataFrame({'x':[0,0,0,1,1,0,1,0,0,0,1,1,0,0,0,0,1]})
# True where x is 0
a = df['x'].eq(0)
# cumulative sum of the mask
b = a.cumsum()
# replace values at the True positions with NaN
c = b.mask(a)
# forward fill the NaNs
d = b.mask(a).ffill()
# fill the leading NaNs with 0 and cast to integers
e = b.mask(a).ffill().fillna(0).astype(int)
# subtract from the cumulative sum
y = b - e
df = pd.concat([df['x'], a, b, c, d, e, y], axis=1, keys=('x','a','b','c','d','e', 'y'))
print (df)

    x      a   b     c     d   e  y
0   0   True   1   NaN   NaN   0  1
1   0   True   2   NaN   NaN   0  2
2   0   True   3   NaN   NaN   0  3
3   1  False   3   3.0   3.0   3  0
4   1  False   3   3.0   3.0   3  0
5   0   True   4   NaN   3.0   3  1
6   1  False   4   4.0   4.0   4  0
7   0   True   5   NaN   4.0   4  1
8   0   True   6   NaN   4.0   4  2
9   0   True   7   NaN   4.0   4  3
10  1  False   7   7.0   7.0   7  0
11  1  False   7   7.0   7.0   7  0
12  0   True   8   NaN   7.0   7  1
13  0   True   9   NaN   7.0   7  2
14  0   True  10   NaN   7.0   7  3
15  0   True  11   NaN   7.0   7  4
16  1  False  11  11.0  11.0  11  0
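An alternative way to get the same column, offered as a sketch and not as part of the original answer: start a new group at every 1 and take a cumulative sum of the zero flags inside each group. The names `is_zero` and `group_id` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1]})

# 1 where x is 0, otherwise 0
is_zero = df['x'].eq(0).astype(int)
# the group id increases by one at every 1, so each run of zeros stays in one group
group_id = df['x'].ne(0).cumsum()
# running count of zeros inside each group; rows where x is 1 get 0
df['y'] = is_zero.groupby(group_id).cumsum()
print(df)
```

Both approaches are vectorized; which one reads better is largely a matter of taste.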

