Counting occurrences of elements in Python lists within pandas DataFrame rows

Question

I am trying to count the occurrences of each string in the list stored in each row.

+----+---------------------------+
| Id |           Col1            |
+----+---------------------------+
| N1 | ['a', 'b', 'c', 'a']      |
| N2 | ['b', 'b', 'b']           |
| N3 | []                        |
| N4 | ['a', 'b', 'c', 'a', 'c'] | 
| N5 | []                        |
+----+---------------------------+

As a result, I would like to get something like this:

+----+---------------------------+-----------------------+
| Id |           Col1            |         Col2          |
+----+---------------------------+-----------------------+
| N1 | ['a', 'b', 'c', 'a']      | {'a':2, 'b':1, 'c':1} |
| N2 | ['b', 'b', 'b']           | {'b':3}               |
| N3 | []                        | {} or None            |
| N4 | ['a', 'b', 'c', 'a', 'c'] | {'a':2, 'b':1, 'c':2} |
| N5 | []                        | {} or None            |
+----+---------------------------+-----------------------+

I have already tried using Counter from the collections library on the DataFrame in several different ways, but nothing seems to work.

import pandas as pd

d = {'Id': ['N1', 'N2', 'N3', 'N4', 'N5'],
     'Col1': [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], [], ['a', 'b', 'c', 'a', 'c'], []]}
df = pd.DataFrame(data=d)

Answer 1

Use Counter; see the code below:

import pandas as pd
from collections import Counter

# axis=1 passes each row to the lambda, which counts the items in that row's list
df['Col2'] = df.apply(lambda x: Counter(x['Col1']), axis=1)

df
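For reference, here is a self-contained sketch of the row-wise approach above. It rebuilds the sample frame from the question so that it runs on its own. With axis=1 the lambda receives each row as a Series, so row['Col1'] is that row's list.

```python
from collections import Counter

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Id': ['N1', 'N2', 'N3', 'N4', 'N5'],
    'Col1': [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], [],
             ['a', 'b', 'c', 'a', 'c'], []],
})

# axis=1 hands each row to the lambda; Counter tallies that row's list
df['Col2'] = df.apply(lambda row: Counter(row['Col1']), axis=1)

print(df)
```

Empty lists produce an empty Counter, which satisfies the "{} or None" requirement in the question.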


Answer 2

How about a one-liner:

df['Col2'] = df['Col1'].apply(lambda x: pd.Series(x).value_counts().to_dict())

Output:

   Id             Col1                      Col2
0  N1     [a, b, c, a]  {'a': 2, 'b': 1, 'c': 1}
1  N2        [b, b, b]                  {'b': 3}
2  N3               []                        {}
3  N4  [a, b, c, a, c]  {'a': 2, 'c': 2, 'b': 1}
4  N5               []                        {}
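In the output above, row N4 lists 'c' before 'b'. That is because value_counts() sorts by descending count by default, so the key order of the resulting dict follows the counts rather than the order of first appearance. The dict is still equal to what Counter would give, as this small sketch checks for the N4 list only:

```python
from collections import Counter

import pandas as pd

items = ['a', 'b', 'c', 'a', 'c']  # the list from row N4 in the question

via_value_counts = pd.Series(items).value_counts().to_dict()
via_counter = dict(Counter(items))

# Key order may differ between the two, but dict equality ignores order
print(via_value_counts == via_counter)  # True
```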

Answer 3

Quite simple:

from collections import Counter

df['col_2'] = df.Col1.map(Counter)

>>> df
   Id             Col1                     col_2
0  N1     [a, b, c, a]  {'a': 2, 'b': 1, 'c': 1}
1  N2        [b, b, b]                  {'b': 3}
2  N3               []                        {}
3  N4  [a, b, c, a, c]  {'a': 2, 'b': 1, 'c': 2}
4  N5               []                        {}
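The question accepts either {} or None for empty lists. map(Counter) gives an empty Counter for those rows. If None is preferred instead, one possible variation is sketched below; the helper name count_or_none is made up for illustration and the frame is a shortened copy of the sample data.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'Id': ['N1', 'N2', 'N3'],
    'Col1': [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], []],
})

def count_or_none(items):
    # Hypothetical helper: a plain dict of counts, or None for an empty list
    counts = dict(Counter(items))
    return counts or None

df['col_2'] = df['Col1'].map(count_or_none)

print(df)
```

Converting to a plain dict is optional; it only matters if later code should not depend on Counter-specific behaviour such as returning 0 for missing keys.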

Answer 4

from numpy import nan
import pandas as pd

Solution:

col = []
for i, row in df.iterrows():
    col.append(
        {elem : row['Col1'].count(elem) for elem in set(row['Col1'])} # set removes duplicates
    )

df = df.join(pd.Series(col, name='Col2'))

Alternative solution:

def add_value_counts(col):
    col['Col2'] = pd.Series(col['Col1']).value_counts().to_dict()
    return col

df = df.T.apply(add_value_counts, axis=0).T # transposes df and iterates over rows
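Both variants above walk the frame row by row. Since only Col1 is needed, a list comprehension over that column is one possible simplification. Assigning the resulting list directly also sidesteps the index alignment that df.join performs, which matters if the frame does not have the default 0..n-1 index. A minimal sketch, using a shortened copy of the sample data with a non-default index chosen for illustration:

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame(
    {
        'Id': ['N1', 'N2', 'N3'],
        'Col1': [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], []],
    },
    index=[10, 20, 30],  # illustrative non-default index
)

# One dict of counts per row, assigned positionally to the new column
df['Col2'] = [dict(Counter(items)) for items in df['Col1']]

print(df)
```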

4 answers

Answer 1
2 2022-08-04 12:18:33

Answer 2
2 2022-08-04 12:19:45

Answer 3
2 accepted 2022-08-04 13:00:34

Answer 4
1 2022-08-04 12:03:17
