
Is there a short way to count dictionary values in a pandas column?

I'm trying to count how many times the values of a dictionary appear in a DataFrame column that contains stemmed text.

I flattened the dictionary values into a list and then used Counter to count each of those values in every row:

dictionary = {'c-1' : ['x', 'y', 'z'], 'c-2' : ['a', 'b']}

words_list = list()
for key in dictionary.keys():
    words_list.append(dictionary[key])
test = [val for sublist in words_list for val in sublist]

from collections import Counter
text['Counter'] = text['Text'].apply(lambda x: Counter([word for word in x if word in test]))

text = {'text': ['some text', 'some text'], 'Counter': [Counter({'a': 1, 'x': 2}), Counter({'b': 2, 'y': 4, 'z': 3})]}

I'd like to show a column with the result for each row. Maybe I chose a long way to do that. I think there is a more direct way that works on the dictionary itself, but I don't know exactly how.
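For reference, the flattening loop and the counting lambda above can be written more compactly. The following is only a sketch: the helper name count_terms and the sample row are made up for illustration, and it assumes each row is already a list of stemmed tokens.

```python
from collections import Counter
from itertools import chain

dictionary = {'c-1': ['x', 'y', 'z'], 'c-2': ['a', 'b']}

# Flatten all value lists into one set for fast membership tests.
terms = set(chain.from_iterable(dictionary.values()))


def count_terms(tokens):
    """Count only the tokens that appear among the dictionary values."""
    return Counter(token for token in tokens if token in terms)


# Hypothetical tokenized row (assumption: a list of stemmed words).
example_row = ['x', 'a', 'x', 'foo']
print(count_terms(example_row))
```

With a column of token lists this could be applied as text['Text'].apply(count_terms). If the column holds plain strings, they would need to be split first, for example with text['Text'].str.split().apply(count_terms).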

If I understand correctly, use collections.Counter with itertools.chain:

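from itertools import chain
from collections import Counter

import pandas as pd

d = {'c-1' : ['x', 'y', 'z'], 'c-2' : ['a', 'b']}
s = pd.Series(['abc', 'xyz', 'abda'])
# Join all dictionary values into the pattern 'x|y|z|a|b', find every match in each row, then count the matches
new_s = s.str.findall('|'.join(chain(*d.values()))).apply(Counter)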
print(new_s)

Output:

0            {'b': 1, 'a': 1}
1    {'z': 1, 'x': 1, 'y': 1}
2            {'b': 1, 'a': 2}
dtype: object
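
Note that the pattern above matches the terms anywhere in the string, including inside longer words. If the dictionary values are meant to be whole words in the stemmed text, a variant with escaped terms and word boundaries may be closer to the intent. This is a sketch under that assumption; the sample rows are invented for illustration.

```python
import re
from collections import Counter
from itertools import chain

import pandas as pd

d = {'c-1': ['x', 'y', 'z'], 'c-2': ['a', 'b']}

# Escape each term and wrap the alternation in word boundaries so that
# only whole words match, not characters inside longer words.
terms = chain.from_iterable(d.values())
pattern = r'\b(?:' + '|'.join(re.escape(t) for t in terms) + r')\b'

# Hypothetical sample rows, made up for illustration.
s = pd.Series(['a x x cab', 'b y y z taxi'])
word_counts = s.str.findall(pattern).apply(Counter)
print(word_counts)
```

Here 'cab' and 'taxi' contribute nothing to the counts, because the letters inside them are not standalone words. re.escape also keeps terms that contain regex metacharacters (such as '.' or '+') from being interpreted as pattern syntax.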
