How to find the frequency of a list of words in a DataFrame using Pandas
My df looks like this:
category text_list
-------- ---------
soccer [soccer, game, is, good, soccer, game]
basketball [game, basketball, game]
volleyball [sport, volleyball, sport]
What I want to do is groupby category, and then list the words by frequency:
category text_list frequency
-------- --------- ---------
soccer soccer 2
game 2
is 1
good 1
basketball game 2
basketball 1
volleyball sport 2
volleyball 1
What have I tried? I was able to compute the frequency, but I could not arrange it in a DataFrame the way I want.
Can someone help me? If possible, using NLTK.
Try explode, then groupby:
(df.explode('text_list')                          # one row per word
   .groupby(['category','text_list']).size()      # count each (category, word) pair
   .to_frame(name='frequency')                    # Series -> one-column DataFrame
)
Output:
frequency
category text_list
basketball basketball 1
game 2
soccer game 2
good 1
is 1
soccer 2
volleyball sport 2
volleyball 1