How to count occurrences of values in a comma-separated column in Python Pandas

Question

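How can I count, for each id, how many times it occurs among the comma-separated values of an entire column?

The DataFrame looks like this: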

id column
1   
2   1
3   1
4   1,2
5   1,2
6   1,2,4
7   1,2,4
8   1,2,4,6
9   1,2,4,6
10  1,2,4,6,8
11  1,2,4,6,8

The desired output is:

id  column     count
1              10
2   1          8
3   1          0
4   1,2        6
5   1,2        0
6   1,2,4      4
7   1,2,4      0
8   1,2,4,6    2
9   1,2,4,6    0
10  1,2,4,6,8  0
11  1,2,4,6,8  0

I tried this:

df = pd.read_csv('parentsplit/parentlist.csv')
df['count'] = df['parent_list'].str.split(',', expand=True).stack().value_counts()

It does not work.

Answer 1

You can do the following:

df['count'] = df['id'].apply(lambda x: df['column'].fillna('X').str.contains(str(x)).sum())

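This counts, for each id, the number of rows in column that contain it. Note that str.contains matches substrings (and treats the pattern as a regular expression by default), so an id of 1 would also match a value such as 10. With the data above this does not occur, because column only holds 1, 2, 4, 6 and 8.

Output: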

    id     column  count
0    1       None   10
1    2          1    8
2    3          1    0
3    4        1,2    6
4    5        1,2    0
5    6      1,2,4    4
6    7      1,2,4    0
7    8    1,2,4,6    2
8    9    1,2,4,6    0
9   10  1,2,4,6,8    0
10  11  1,2,4,6,8    0
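
If the column could ever hold multi-digit values, a variant that compares whole tokens instead of substrings may be safer. The following is only a sketch of that idea, not the answer's original code: the sample frame is rebuilt by hand from the question, and the intermediate name tokens is introduced here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame).
df = pd.DataFrame({
    "id": range(1, 12),
    "column": [None, "1", "1", "1,2", "1,2", "1,2,4", "1,2,4",
               "1,2,4,6", "1,2,4,6", "1,2,4,6,8", "1,2,4,6,8"],
})

# Split every row into a list of tokens; a missing row becomes [""].
tokens = df["column"].fillna("").str.split(",")

# For each id, count the rows whose token list contains that id exactly.
df["count"] = df["id"].apply(
    lambda x: tokens.apply(lambda parts: str(x) in parts).sum()
)
print(df)
```

Like the original one-liner, this scans the whole column once per id, so it is quadratic in the number of rows. It trades speed for exact matching.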

Answer 2

Split and explode the column, use value_counts to count the occurrences, then use map to attach the counts to the id column:

s = df['column'].str.split(',').explode().value_counts()
df['count'] = df['id'].astype(str).map(s).fillna(0)

    id     column  count
0    1       None   10.0
1    2          1    8.0
2    3          1    0.0
3    4        1,2    6.0
4    5        1,2    0.0
5    6      1,2,4    4.0
6    7      1,2,4    0.0
7    8    1,2,4,6    2.0
8    9    1,2,4,6    0.0
9   10  1,2,4,6,8    0.0
10  11  1,2,4,6,8    0.0
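
The counts above come out as floats because map leaves NaN for ids that never appear, and fillna(0) keeps the float dtype. A self-contained sketch of the same steps with a final cast to integers could look like this. The sample frame is reconstructed from the question and is an assumption about the real data.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame).
df = pd.DataFrame({
    "id": range(1, 12),
    "column": [None, "1", "1", "1,2", "1,2", "1,2,4", "1,2,4",
               "1,2,4,6", "1,2,4,6", "1,2,4,6,8", "1,2,4,6,8"],
})

# One row per token, then count how often each token appears.
# value_counts drops missing values by default, so the empty row is ignored.
s = df["column"].str.split(",").explode().value_counts()

# Look up each id (as a string) in the counts; ids never seen become 0.
df["count"] = df["id"].astype(str).map(s).fillna(0).astype(int)
print(df)
```

Because the tokens are compared as whole strings, an id of 1 is not confused with a value such as 10.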

Answer 3

A fast approach is to skip the pandas methods and use pure Python, with itertools.chain and collections.Counter:

from itertools import chain
from collections import Counter
c = Counter(chain(*df['column'].str.split(',').values))
df['count'] = df['id'].astype(str).map(c)

Output:

    id     column  count
0    1                10
1    2          1      8
2    3          1      0
3    4        1,2      6
4    5        1,2      0
5    6      1,2,4      4
6    7      1,2,4      0
7    8    1,2,4,6      2
8    9    1,2,4,6      0
9   10  1,2,4,6,8      0
10  11  1,2,4,6,8      0
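
The snippet above appears to assume that the first row holds an empty string, as the printed output suggests. If that row is None or NaN instead, str.split leaves a missing value there and chain cannot iterate over it. A hedged variant that drops missing rows first might look like this. The sample frame is rebuilt from the question, and the name lists is introduced here for illustration.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame).
df = pd.DataFrame({
    "id": range(1, 12),
    "column": [None, "1", "1", "1,2", "1,2", "1,2,4", "1,2,4",
               "1,2,4,6", "1,2,4,6", "1,2,4,6,8", "1,2,4,6,8"],
})

# Drop missing rows before splitting so every element is a list of tokens.
lists = df["column"].dropna().str.split(",")

# Flatten the lists lazily and tally each token.
c = Counter(chain.from_iterable(lists))

# Counter defines __missing__, so Series.map uses 0 for ids not in the tally.
df["count"] = df["id"].astype(str).map(c)
print(df)
```

chain.from_iterable is used here instead of chain(*...) so that the lists are not unpacked into one large argument tuple.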
