Pandas DataFrame count occurrences for each element after each appearance in column

Given a pandas dataframe like this one

import pandas as pd
df = pd.DataFrame(data={"codes": [1,1,1,0,0,1,1,0,0,1,2,2]})

time    codes
0       1
1       1
2       1
3       0
4       0
5       1
6       1
7       0
8       0
9       1
10      2
11      2

I would like to count how many times each element in codes makes a new appearance, that is, how many separate runs of consecutive identical values it starts. Note that I do not want to compute .value_counts() over all rows. In the example, value 1 appears in 3 separate runs, value 0 in 2, and value 2 in 1. The task is analogous to counting user sessions.

Expected output:

codes   count_occurences
1       3
0       2
2       1

With pandas you could do something like

df.codes.loc[df.codes!=df.codes.shift()].value_counts()

This keeps only the rows where codes differs from the previous row (the first row of each run) and then counts the values among those rows.
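For reference, here is the same one-liner split into named steps as a self-contained sketch. The variable names run_starts and run_counts are illustrative and do not come from the original answer; the sample frame is the one from the question.

```python
import pandas as pd

df = pd.DataFrame(data={"codes": [1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 2, 2]})

# True on the first row of every run of identical consecutive values.
# shift() leaves NaN in the first position, so row 0 always compares unequal.
run_starts = df["codes"] != df["codes"].shift()

# Keep only the run-start rows, then count how often each value starts a run.
run_counts = df.loc[run_starts, "codes"].value_counts()
print(run_counts.to_dict())
```

For the sample frame this gives 3 runs for value 1, 2 for value 0, and 1 for value 2, which matches the expected output in the question.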

This can also be done in plain Python:

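myList = [1,2,6,2,2,4,3,3,4,4,6,1,1,2,3]

# Start every distinct value at zero runs.
count = {k: 0 for k in set(myList)}

# A run ends wherever an element differs from its successor.
for k in range(len(myList) - 1):
    if myList[k] != myList[k+1]:
        count[myList[k]] += 1

# The final run has no successor, so count it separately (skipped for an empty list).
if myList:
    count[myList[-1]] += 1
print(count)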

This gives:

{1: 2, 2: 3, 3: 2, 4: 2, 6: 2}
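A shorter standard-library variant of the same idea, offered as a sketch rather than as part of the original answer: itertools.groupby yields one group per run of consecutive equal items, so counting the group keys counts the runs. The helper name count_runs is hypothetical.

```python
from collections import Counter
from itertools import groupby


def count_runs(values):
    """Count, per distinct value, how many runs of consecutive equal items it has."""
    # groupby yields one (key, group) pair per run of consecutive equal items,
    # so tallying the keys tallies the runs.
    return Counter(key for key, _group in groupby(values))


print(count_runs([1, 2, 6, 2, 2, 4, 3, 3, 4, 4, 6, 1, 1, 2, 3]))
```

This also handles empty and single-element inputs without special cases, and it accepts any iterable, so a pandas column can be passed directly, for example count_runs(df.codes).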
