Numbering Sequences in pandas

I have the following dataframe:

   Steps
0  False
1  False
2  True
3  True
4  True
5  False
6  True
7  True
8  False
9  False 
10 False
11 True

I would like to number the True sequences in an additional column:

   Steps  Numbered
0  False  0
1  False  0
2  True   1
3  True   1
4  True   1  
5  False  0
6  True   2
7  True   2
8  False  0
9  False  0
10 False  0
11 True   3

Filling the rows containing False is secondary. Do you have any ideas?

Flag the first True of each run by combining the column with its negated, shifted copy (Series.shift) using & for bitwise AND, build a counter from those flags with cumsum, then set the False rows to 0 with Series.where:

df['Numbered'] = ((df['Steps'] & ~df['Steps'].shift(fill_value=False)).cumsum()
                       .where(df['Steps'], 0))
print (df)
    Steps  Numbered
0   False         0
1   False         0
2    True         1
3    True         1
4    True         1
5   False         0
6    True         2
7    True         2
8   False         0
9   False         0
10  False         0
11   True         3

The solution also works if the first value is True:

df['Numbered'] = ((df['Steps'] & ~df['Steps'].shift(fill_value=False)).cumsum()
                       .where(df['Steps'], 0))
print (df)
    Steps  Numbered
0    True         1
1   False         0
2    True         2
3    True         2
4    True         2
5   False         0
6    True         3
7    True         3
8   False         0
9   False         0
10  False         0
11   True         4
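
To make the chained expression above easier to follow, here is a minimal sketch that splits it into named intermediate steps. The variable names (prev, starts, counter) are introduced here for illustration only and do not appear in the original answer; the sample data is the dataframe from the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"Steps": [False, False, True, True, True, False,
                             True, True, False, False, False, True]})

steps = df["Steps"]
prev = steps.shift(fill_value=False)  # value of the previous row; False before the first row
starts = steps & ~prev                # True only on the first row of each run of True
counter = starts.cumsum()             # running count of runs started so far
df["Numbered"] = counter.where(steps, 0)  # keep the counter on True rows, 0 elsewhere
print(df)
```

Because shift is called with fill_value=False, a True in the very first row is treated as the start of a run, which is why the same expression handles both sample inputs.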

You can flag the values that are both True and different from the previous one (using diff) and use those flags to drive a cumsum:

df['Numbered'] = (df['Steps']&df['Steps'].diff()).cumsum().where(df['Steps'], 0)

Output:

    Steps  Numbered
0   False         0
1   False         0
2    True         1
3    True         1
4    True         1
5   False         0
6    True         2
7    True         2
8   False         0
9   False         0
10  False         0
11   True         3
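
One thing to keep in mind with diff is that the first row has no predecessor, so its result is a missing value; how a leading True is then numbered depends on how that missing value is handled. The sketch below is a variant, not the answer's own code: it expresses "different from the previous row" with shift and ne, treating the row before the first one as False. The sample series with a leading True is a hypothetical input.

```python
import pandas as pd

# Hypothetical sample whose first value is True
steps = pd.Series([True, False, True, True, True, False,
                   True, True, False, False, False, True], name="Steps")

# "Different from the previous row", with the row before the first treated as False
changed = steps.ne(steps.shift(fill_value=False))

# Count only the changes that land on a True row, then zero out the False rows
numbered = (steps & changed).cumsum().where(steps, 0)
print(numbered.tolist())  # [1, 0, 2, 2, 2, 0, 3, 3, 0, 0, 0, 4]
```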

import pandas as pd

steps = ['false', 'false', 'true', 'true', 'true',
         'false', 'true', 'true', 'false', 'false', 'true']

numbered = []
c = 0            # number of 'true' runs seen so far
prev = 'false'   # treat the position before the first row as 'false'
for value in steps:
    if value == 'true':
        if prev == 'false':
            c += 1   # a new run of 'true' values starts here
        numbered.append(c)
    else:
        numbered.append(0)
    prev = value

# Build the dataframe once, after the loop has filled the whole list
data = pd.DataFrame({"steps": steps, "numbered": numbered})
print(data)


