I'm trying to output every word that appears in my tokens more than 1000 times (> 1000) and save it to freq1000.
freq1000 = []
newtokens = []
for words in tokens:
    newtokens += words
FreqDist(newtokens)
fd_1 = FreqDist(newtokens)
for i in set(fd_1):
    if fd_1.count(i) == >1000:
        print(i)
This is my current code. I'm completely stuck after this, and I'm not sure if there is a FreqDist function I can use to help. I have saved the FreqDist to fd_1 successfully. I'm just unsure how to get the words that appear more than 1000 times and save them to freq1000.
I would appreciate any help you can provide.
You can filter the words based on the frequency count using fd_1.items(), which yields (word, count) pairs, like below:
list(filter(lambda x: x[1] > 1000, fd_1.items()))
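If you only want the words themselves in freq1000 (not the counts), a list comprehension may be easier to read. Here is a minimal sketch, not a definitive implementation. It assumes tokens is a list of token lists, as the += loop in the question suggests. The sample data and the helper name words_above are made up for illustration. It uses collections.Counter so it runs without NLTK; FreqDist is a Counter subclass, so it should behave the same way here.

```python
from collections import Counter  # nltk's FreqDist subclasses Counter


def words_above(freq, threshold):
    """Return the words whose count is strictly greater than threshold."""
    return [word for word, count in freq.items() if count > threshold]


# Hypothetical sample data: a list of token lists.
tokens = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog"]]

# Flatten the nested lists into a single list of words.
newtokens = [word for sentence in tokens for word in sentence]

# With NLTK installed this would be: fd_1 = FreqDist(newtokens)
fd_1 = Counter(newtokens)

# Threshold lowered to 1 only because the sample is tiny; use 1000 on real data.
freq1000 = words_above(fd_1, 1)
print(freq1000)
```

On your real data you would call words_above(fd_1, 1000) to get the words that appear more than 1000 times.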
Hope it helps :)