Count occurrences of elements in Python list inside pandas dataframe rows
I am trying to count the occurrences of each string in the list stored in each row.
+----+---------------------------+
| Id | Col1                      |
+----+---------------------------+
| N1 | ['a', 'b', 'c', 'a']      |
| N2 | ['b', 'b', 'b']           |
| N3 | []                        |
| N4 | ['a', 'b', 'c', 'a', 'c'] |
| N5 | []                        |
+----+---------------------------+
As a result, I would like to get something like this:
+----+---------------------------+-----------------------+
| Id | Col1                      | Col2                  |
+----+---------------------------+-----------------------+
| N1 | ['a', 'b', 'c', 'a']      | {'a':2, 'b':1, 'c':1} |
| N2 | ['b', 'b', 'b']           | {'b':3}               |
| N3 | []                        | {} or None            |
| N4 | ['a', 'b', 'c', 'a', 'c'] | {'a':2, 'b':1, 'c':2} |
| N5 | []                        | {} or None            |
+----+---------------------------+-----------------------+
I have already tried several ways of using Counter from the collections library inside the DataFrame, but nothing seems to work.
import pandas as pd

d = {'Id': ['N1', 'N2', 'N3', 'N4', 'N5'],
     'Col1': [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], [], ['a', 'b', 'c', 'a', 'c'], []]}
df = pd.DataFrame(data=d)
How about a one-liner:
df['Col2'] = df['Col1'].apply(lambda x: pd.Series(x).value_counts().to_dict())
Output:
   Id             Col1                      Col2
0  N1     [a, b, c, a]  {'a': 2, 'b': 1, 'c': 1}
1  N2        [b, b, b]                  {'b': 3}
2  N3               []                        {}
3  N4  [a, b, c, a, c]  {'a': 2, 'c': 2, 'b': 1}
4  N5               []                        {}
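A side note on the N4 row above: its keys come out ordered by count, because value_counts() sorts by frequency before to_dict() is called, whereas a Counter keeps the order in which elements were first seen. The sketch below compares the two on that one list; the names items, via_value_counts and via_counter are made up for illustration.

```python
from collections import Counter

import pandas as pd

items = ['a', 'b', 'c', 'a', 'c']  # the N4 row from the question

# value_counts() sorts by count (descending) before to_dict() is called
via_value_counts = pd.Series(items).value_counts().to_dict()

# Counter keeps keys in the order they were first seen (Python 3.7+)
via_counter = dict(Counter(items))

print(via_value_counts)
print(via_counter)                      # {'a': 2, 'b': 1, 'c': 2}
print(via_value_counts == via_counter)  # True: dict equality ignores key order
```

So the two approaches should give equal dictionaries; only the display order differs.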
It's simple:
from collections import Counter
df['col_2'] = df.Col1.map(Counter)
>>> df
'''
   Id             Col1                     col_2
0  N1     [a, b, c, a]  {'a': 2, 'b': 1, 'c': 1}
1  N2        [b, b, b]                  {'b': 3}
2  N3               []                        {}
3  N4  [a, b, c, a, c]  {'a': 2, 'b': 1, 'c': 2}
4  N5               []                        {}
'''
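The question also allows None for empty lists, and mapping Counter directly stores Counter objects rather than plain dicts. If either of those matters, something along these lines might do; count_or_none is a hypothetical helper name, and the column is called Col2 here as in the question.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'Id': ['N1', 'N2', 'N3', 'N4', 'N5'],
    'Col1': [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], [], ['a', 'b', 'c', 'a', 'c'], []],
})

def count_or_none(items):
    """Return a plain dict of counts, or None when the list is empty."""
    counts = dict(Counter(items))
    return counts or None  # an empty dict is falsy, so this yields None

# A list comprehension avoids any special handling pandas applies in map/apply
df['Col2'] = [count_or_none(items) for items in df['Col1']]
print(df)
```

Drop the `or None` part if an empty dict is preferred for the empty rows.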
import pandas as pd

# Option 1: loop over the rows and build the new column by hand
col = []
for i, row in df.iterrows():
    col.append(
        {elem: row['Col1'].count(elem) for elem in set(row['Col1'])}  # set removes duplicates
    )
df = df.join(pd.Series(col, name='Col2'))
# Option 2 (alternative to the loop): apply a function to each row
def add_value_counts(col):
    col['Col2'] = pd.Series(col['Col1']).value_counts().to_dict()
    return col

df = df.T.apply(add_value_counts, axis=0).T  # transposes df and iterates over rows
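One thing to keep in mind about the loop version: the set-and-count comprehension rescans the list once per distinct element, while a Counter makes a single pass. For lists as short as these it hardly matters. The sketch below only checks that both give the same counts on the question's data; the function names and the rows list are made up for illustration.

```python
from collections import Counter

def counts_via_list_count(items):
    # One full scan of the list per distinct element
    return {elem: items.count(elem) for elem in set(items)}

def counts_via_counter(items):
    # A single pass over the list
    return dict(Counter(items))

rows = [['a', 'b', 'c', 'a'], ['b', 'b', 'b'], [], ['a', 'b', 'c', 'a', 'c'], []]

# Dict equality ignores key order, so the unordered set is not a problem here
results_match = all(
    counts_via_list_count(items) == counts_via_counter(items) for items in rows
)
print(results_match)  # True
```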