
Count occurrences within a specific range

I have a DataFrame that looks like this:

               Tag
0           skip_1
1              run
2           skip_1
3              run
4           skip_1
5              run
6           skip_2
7              run
8           skip_1
9              run
10          skip_2
11            jump
12          skip_1
13             run
14          skip_2
15            jump
16          skip_1
17             run
18          skip_2
19    cleanup_jump
20          skip_1
21             run
22          skip_2
23             run
24          skip_2
25            jump
26          skip_1
27             run
28          skip_2
29            jump
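
For anyone who wants to reproduce the problem, the frame above can be rebuilt as follows. This is only a convenience sketch: the variable names `tags` and `df` are assumptions, and the default 0..29 integer index is assumed to match the row labels shown.

```python
import pandas as pd

# Tag values copied row by row from the frame shown in the question.
tags = [
    "skip_1", "run", "skip_1", "run", "skip_1", "run",      # rows 0-5
    "skip_2", "run", "skip_1", "run", "skip_2", "jump",     # rows 6-11
    "skip_1", "run", "skip_2", "jump",                      # rows 12-15
    "skip_1", "run", "skip_2", "cleanup_jump",              # rows 16-19
    "skip_1", "run", "skip_2", "run", "skip_2", "jump",     # rows 20-25
    "skip_1", "run", "skip_2", "jump",                      # rows 26-29
]

# A single-column frame with the default RangeIndex (0..29).
df = pd.DataFrame({"Tag": tags})
```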

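First, I want to count the number of RUN occurrences between two JUMP events, and then enumerate those occurrences within that range from most recent to earliest. The expected result would be: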

             Tag  Jump_Run_Count  Run_Order
0         skip_1               0          0
1            run               0          5
2         skip_1               0          0
3            run               0          4
4         skip_1               0          0
5            run               0          3
6         skip_2               0          0
7            run               0          2
8         skip_1               0          0
9            run               0          1
10        skip_2               0          0
11          jump               5          0
12        skip_1               0          0
13           run               0          1
14        skip_2               0          0
15          jump               1          0
16        skip_1               0          0
17           run               0          0
18        skip_2               0          0
19  cleanup_jump               0          0
20        skip_1               0          0
21           run               0          2
22        skip_2               0          0
23           run               0          1
24        skip_2               0          0
25          jump               2          0
26        skip_1               0          0
27           run               0          1
28        skip_2               0          0
29          jump               1          0

One catch here is that the first RUN occurrences do not fall between two JUMPs, but between the start of the column and the first JUMP.
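
One possible way to handle this edge case is to label every row with the segment that ends at the next JUMP, so that the rows before the first JUMP simply become segment 0. The snippet below is a minimal sketch on a made-up six-row series; the names `is_boundary`, `segment` and `runs_per_segment` are hypothetical.

```python
import pandas as pd

# Tiny made-up example: two runs before the first jump, one run before the second.
tags = pd.Series(["run", "skip_1", "run", "jump", "run", "jump"], name="Tag")

is_boundary = tags.eq("jump")

# Shift by one row so that a jump closes its own segment instead of opening
# the next one; the rows before the first jump then form segment 0.
segment = is_boundary.shift(fill_value=False).cumsum()

# Number of run rows in each segment.
runs_per_segment = tags.eq("run").groupby(segment).sum()
```

With this labelling there is no special case for the start of the column: the leading rows are grouped exactly like the rows between two JUMPs.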

Second, I want to do the same counting and enumeration for the range between a JUMP and a CLEANUP_JUMP, and store the results in separate columns.

             Tag  Jump_Run_Count  Run_Order  Cleanup_Jump_Run_Count  Run_Order2
0         skip_1               0          0                       0           0
1            run               0          5                       0           0
2         skip_1               0          0                       0           0
3            run               0          4                       0           0
4         skip_1               0          0                       0           0
5            run               0          3                       0           0
6         skip_2               0          0                       0           0
7            run               0          2                       0           0
8         skip_1               0          0                       0           0
9            run               0          1                       0           0
10        skip_2               0          0                       0           0
11          jump               5          0                       0           0
12        skip_1               0          0                       0           0
13           run               0          1                       0           0
14        skip_2               0          0                       0           0
15          jump               1          0                       0           0
16        skip_1               0          0                       0           0
17           run               0          0                       0           1
18        skip_2               0          0                       0           0
19  cleanup_jump               0          0                       1           0
20        skip_1               0          0                       0           0
21           run               0          2                       0           0
22        skip_2               0          0                       0           0
23           run               0          1                       0           0
24        skip_2               0          0                       0           0
25          jump               2          0                       0           0
26        skip_1               0          0                       0           0
27           run               0          1                       0           0
28        skip_2               0          0                       0           0
29          jump               1          0                       0           0
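
Both requirements can be read as "group the rows into segments, then route each segment's numbers to a column set chosen by the tag that closes it". The following is a sketch of that idea under stated assumptions, not a definitive implementation: the function name `count_runs` and the output column names are hypothetical, and every `jump` or `cleanup_jump` row is assumed to close the segment that precedes it. The demo data is a shortened, made-up series rather than the full frame.

```python
import pandas as pd


def count_runs(df: pd.DataFrame) -> pd.DataFrame:
    """Count and enumerate 'run' rows per segment, split by closing tag."""
    tag = df["Tag"]
    is_run = tag.eq("run")
    is_boundary = tag.isin(["jump", "cleanup_jump"])

    # Segment id: a boundary row closes its own segment.
    segment = is_boundary.shift(fill_value=False).cumsum()
    # Tag of the boundary that closes each row's segment (NaN if none follows).
    closer = tag.where(is_boundary).bfill()

    # Runs per segment, broadcast back to every row of the segment.
    run_count = is_run.astype(int).groupby(segment).transform("sum")

    # Enumerate runs inside each segment, 1 = closest to the closing boundary.
    runs = segment[is_run]
    run_order = (runs.groupby(runs).cumcount(ascending=False) + 1).reindex(
        df.index, fill_value=0
    )

    out = df[["Tag"]].copy()
    for name, closing_tag in [("Jump", "jump"), ("Cleanup_Jump", "cleanup_jump")]:
        in_range = closer.eq(closing_tag)
        # The count is shown only on the boundary row itself.
        out[f"{name}_Run_Count"] = run_count.where(in_range & is_boundary, 0)
        out[f"{name}_Run_Order"] = run_order.where(in_range, 0)
    return out


# Shortened, made-up data covering both kinds of boundary.
demo = pd.DataFrame(
    {"Tag": ["run", "run", "jump", "run", "cleanup_jump", "skip_1", "run", "jump"]}
)
result = count_runs(demo)
```

Because the segments are built with `cumsum` and the routing is done with boolean masks, this sketch needs no Python-level loop over the rows.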

I have added some images that may explain this better:

[Image: Scenario 1]

[Image: Scenario 2]

Any help on how to code this, or even a different approach to the problem, would be greatly appreciated.

Thanks!

Here is a solution using pandas:

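import numpy as np
import pandas as pd

# df is the DataFrame from the question, with its default 0..n-1 index.

# Flag the run rows, and treat both jump variants as range boundaries.
df['run'] = df['Tag'] == 'run'
val_mask = df['Tag'].replace({'cleanup_jump': 'jump'}) == 'jump'
# Rows share a tag_id until the next boundary row, which starts a new id.
df['tag_id'] = val_mask.cumsum()
# The runs in group k precede the (k+1)-th boundary, so each boundary row gets
# the count of the group before it; the group after the last boundary is dropped.
df.loc[val_mask, 'Jump_Count'] = df.groupby('tag_id')['run'].sum().to_numpy()[:-1]
# Number the runs within each group (1 = earliest), then rank them in
# descending order so that 1 = most recent.
df.loc[df['run'], 'run_per_jump'] = df.loc[df['run']].groupby('tag_id')['run'].cumsum()
df['Jump_Run_Order'] = df.groupby('tag_id')['run_per_jump'].rank(method='dense', ascending=False)

# Positions of the plain jumps, of the cleanup jumps, and of the last plain
# jump before each cleanup jump.
jumps_idx = np.flatnonzero(df['Tag'] == 'jump')
cj_idxs = np.flatnonzero(df['Tag'] == 'cleanup_jump')
cj_help_idxs = np.asarray([np.max(jumps_idx[jumps_idx < cj_idx]) for cj_idx in cj_idxs])

# Move the values of each jump -> cleanup_jump range into their own columns
# and reset them in the jump columns.
for start, end in zip(cj_help_idxs+1, cj_idxs):
    df.loc[start:end, 'Cleanup_Jump_Run_Order'] = df.loc[start:end, 'Jump_Run_Order']
    df.loc[start:end, 'Cleanup_Jump_Count'] = df.loc[start:end, 'Jump_Count']
    df.loc[start:end, 'Jump_Run_Order'] = 0
    df.loc[start:end, 'Jump_Count'] = 0

# Drop the helper columns, fill the gaps with 0 and convert to integer dtypes.
df = df.drop(columns=['tag_id', 'run', 'run_per_jump']).fillna(0).convert_dtypes(convert_integer=True)

print(df)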
             Tag  Jump_Count  Jump_Run_Order  Cleanup_Jump_Run_Order  Cleanup_Jump_Count
0         skip_1           0               0                       0                   0
1            run           0               5                       0                   0
2         skip_1           0               0                       0                   0
3            run           0               4                       0                   0
4         skip_1           0               0                       0                   0
5            run           0               3                       0                   0
6         skip_2           0               0                       0                   0
7            run           0               2                       0                   0
8         skip_1           0               0                       0                   0
9            run           0               1                       0                   0
10        skip_2           0               0                       0                   0
11          jump           5               0                       0                   0
12        skip_1           0               0                       0                   0
13           run           0               1                       0                   0
14        skip_2           0               0                       0                   0
15          jump           1               0                       0                   0
16        skip_1           0               0                       0                   0
17           run           0               0                       1                   0
18        skip_2           0               0                       0                   0
19  cleanup_jump           0               0                       0                   1
20        skip_1           0               0                       0                   0
21           run           0               2                       0                   0
22        skip_2           0               0                       0                   0
23           run           0               1                       0                   0
24        skip_2           0               0                       0                   0
25          jump           2               0                       0                   0
26        skip_1           0               0                       0                   0
27           run           0               1                       0                   0
28        skip_2           0               0                       0                   0
29          jump           1               0                       0                   0
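
One caveat worth noting: the loop above feeds row positions from `np.flatnonzero` into the label-based `df.loc`, which lines up only when the index is the default 0..n-1 range. The sketch below illustrates the mismatch on a hypothetical three-row frame with a different index, and shows `reset_index(drop=True)` as one way to restore the assumption; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index labels are not 0..n-1 (e.g. after filtering).
df = pd.DataFrame(
    {"Tag": ["jump", "run", "cleanup_jump"]},
    index=[10, 20, 30],
)

# np.flatnonzero returns row positions, not index labels.
positions = np.flatnonzero(df["Tag"] == "cleanup_jump")

# Position 2 is not the label of that row (its label is 30), so label-based
# slicing with these numbers would not address the intended rows.
labels_match = bool(df.index[positions[0]] == positions[0])

# Resetting the index makes positions and labels coincide again.
df = df.reset_index(drop=True)
labels_match_after = bool(df.index[positions[0]] == positions[0])
```

If the original index has to be kept, it can be saved before the reset and assigned back to `df.index` afterwards.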

