Pandas\Python: How to count the number of last identical values in a column

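This is a pandas DataFrame. The "Direction" column contains only 3 possible values: Down, Flat, or Up. Only the last run of identical values matters. So, the question is in the title.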

     Time     Direction
id
0    16:59    Up
1    17:00    Flat
2    17:01    Up
3    17:02    Down
4    17:03    Down
5    17:04    Up
6    17:05    Up
7    17:06    Up

Assume the DataFrame is named panda. The result should look like this (this form is preferred):

result = 0
result = panda.tail(?)['Direction'].count_last_values(#as the most last value[Up <- in this case])[0]
print(result)
3

Or like this:

     Time     Direction     Series
id
0    16:59    Up            1
1    17:00    Flat          0
2    17:01    Up            1
3    17:02    Down          1
4    17:03    Down          2
5    17:04    Up            1
6    17:05    Up            2
7    17:06    Up            3

I can do this myself (but I would like something simpler):

import pandas as pd

panda = pd.DataFrame({'Time':['16:59','17:00','17:01','17:02','17:03','17:04','17:05','17:06'], 'Direction':['Up','Flat','Up','Down','Down','Up','Up','Up']})

    Time    Direction
0   16:59   Up
1   17:00   Flat
2   17:01   Up
3   17:02   Down
4   17:03   Down
5   17:04   Up
6   17:05   Up
7   17:06   Up

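# Value in the last row of the column
tail = panda.tail(1)['Direction'].iloc[0]
counter = 0
i = len(panda) - 1
if tail != 'Flat':
    # Walk backwards while the value matches; stop at row 0 so that
    # a negative index does not wrap around to the end of the frame
    while i >= 0 and tail == panda.iloc[i]['Direction']:
        i -= 1
        counter += 1
print(counter)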

3

Compare each value with the previous one using shift, and use cumsum() on the result to create "groups". Then create the new column with groupby and cumcount.

s = (df['Direction'] != df['Direction'].shift()).cumsum()
df['Series'] = df.groupby(s).cumcount()+1

#output:
    Time    Direction   Series
id          
0   16:59   Up          1
1   17:00   Flat        1
2   17:01   Up          1
3   17:02   Down        1
4   17:03   Down        2
5   17:04   Up          1
6   17:05   Up          2
7   17:06   Up          3
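
To make the shift/cumsum step easier to follow, here is a minimal sketch that prints the intermediate group ids and then derives the single number asked for in the question. The names `starts_new_run`, `run_id`, and `last_run_length` are illustrative and not part of the answer above; the frame is rebuilt from the question's data so the block runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'Time': ['16:59', '17:00', '17:01', '17:02', '17:03', '17:04', '17:05', '17:06'],
    'Direction': ['Up', 'Flat', 'Up', 'Down', 'Down', 'Up', 'Up', 'Up'],
})

# True wherever a row starts a new run (its value differs from the row above).
starts_new_run = df['Direction'] != df['Direction'].shift()

# Running total of run starts: every row in the same run shares one id.
run_id = starts_new_run.cumsum()
print(run_id.tolist())  # [1, 2, 3, 4, 4, 5, 5, 5]

# Length of the final run = number of rows that carry the last run id.
last_run_length = int((run_id == run_id.iloc[-1]).sum())
print(last_run_length)  # 3
```

If the Series column has already been built, its last entry (`df['Series'].iloc[-1]`) should give the same number, since cumcount()+1 numbers the rows inside each run.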

If the count needs to start from zero when Direction is "Flat", use .loc:

df.loc[df['Direction'] == 'Flat', 'Series'] = df['Series'].subtract(1)

#output
    Time    Direction   Series
id          
0   16:59   Up          1     
1   17:00   Flat        0
2   17:01   Up          1
3   17:02   Down        1
4   17:03   Down        2
5   17:04   Up          1
6   17:05   Up          2
7   17:06   Up          3

import pandas as pd

panda = pd.DataFrame({'Time':['16:59','17:00','17:01','17:02','17:03','17:04','17:05','17:06'], 'Direction':['Up','Flat','Up','Down','Down','Up','Up','Up']})
print(panda)

counter = 0
tail = panda.iloc[-1]['Direction']
# Walk backwards from the last row; the stop value is -1 so that row 0 is included
for i in range(len(panda) - 1, -1, -1):
    if panda.iloc[i]['Direction'] == tail:
        counter += 1
    else:
        break
print(counter)
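
The backward loop can also be written without an explicit Python loop. The following is only a sketch under stated assumptions: `count_trailing_run` is a hypothetical helper name (it does not exist in pandas), and the column is assumed to contain no missing values, because NaN never compares equal to itself.

```python
import pandas as pd

def count_trailing_run(values: pd.Series) -> int:
    """Length of the run of identical values at the end of `values`."""
    arr = values.to_numpy()
    if arr.size == 0:
        return 0
    # Reverse the column so position 0 is the last row, then flag mismatches.
    differs = arr[::-1] != arr[-1]
    # argmax returns the position of the first True, which is the run length;
    # if nothing differs, the whole column is a single run.
    return int(differs.argmax()) if differs.any() else int(arr.size)

panda = pd.DataFrame({
    'Time': ['16:59', '17:00', '17:01', '17:02', '17:03', '17:04', '17:05', '17:06'],
    'Direction': ['Up', 'Flat', 'Up', 'Down', 'Down', 'Up', 'Up', 'Up'],
})
print(count_trailing_run(panda['Direction']))  # 3
```

Unlike the code in the question, this sketch does not special-case "Flat"; a caller who wants 0 for a trailing "Flat" run would need to check the last value separately.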

