
Count of occurrences of a word in a column using Python

This is what my text file looks like:

000000005|19670905|M|20060201|20070131|6709055223085|01|PRINCIPLE|000021629633|ONYX
000000005|19740423|F|20060201|20070131|7404230424084|01|WIFE|000021629633|ONYX
000000005|19991028|F|20060201|20070131|9910280147084|01|DAUGHTER|000021629633|ONYX

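I want to find the word PRINCIPLE and then report the counts of WIFE and DAUGHTER. Here the count of WIFE is 1 and the count of DAUGHTER is also 1. The columns and rows have headers, and there are multiple entries such as 000004 and 000008.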

counts = data['gender'].value_counts().to_dict()

I did this to get the number of males and females; I was just experimenting. I need some help with how to solve this problem using Python.

I would like something like this:

PRINCIPLE WIFE DAUGHTER
and below the counts

You can try this:

import pandas as pd

# Select the 'gender' column (data is the DataFrame loaded from the text file)
gender = data[['gender']]
# Group by value and collect the group sizes into a new DataFrame
counts = pd.DataFrame({'count': gender.groupby(['gender']).size()}).reset_index()
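
The snippet above assumes `data` already exists as a DataFrame with a `gender` column. Here is a minimal sketch of how it might be loaded from the pipe-delimited file. It assumes the file has no header row, as in the sample shown in the question, and the column names are hypothetical apart from `entries` and `gender`, which follow this answer's naming. Reading everything as strings keeps the leading zeros in values such as 000000005.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice, pass the file path
# to read_csv instead of this in-memory buffer.
sample = """000000005|19670905|M|20060201|20070131|6709055223085|01|PRINCIPLE|000021629633|ONYX
000000005|19740423|F|20060201|20070131|7404230424084|01|WIFE|000021629633|ONYX
000000005|19991028|F|20060201|20070131|9910280147084|01|DAUGHTER|000021629633|ONYX
"""

# Hypothetical column names (assumption); only 'entries' and 'gender'
# are used by the rest of this answer.
columns = ["entries", "birth_date", "sex", "start", "end",
           "id_number", "code", "gender", "policy", "scheme"]

# header=None because the sample has no header row; dtype=str preserves leading zeros.
data = pd.read_csv(io.StringIO(sample), sep="|", header=None,
                   names=columns, dtype=str)

# Same count as above, written with reset_index(name=...).
counts = data.groupby("gender").size().reset_index(name="count")
print(counts)
```

If the real file does have a header row, dropping `header=None` and `names=columns` should let pandas pick up the existing column names.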

If you also want to group by the first column, 'entries':

gender = data[['gender', 'entries']]
# Group by entry and value, then collect the group sizes into a new DataFrame
counts = pd.DataFrame({'count': gender.groupby(['entries', 'gender']).size()}).reset_index()

Example:

>>> print(d)
   entries     gender
0        5  PRINCIPLE
1        5       WIFE
2        5   DAUGHTER
3        6  PRINCIPLE
4        6  PRINCIPLE
5        6   DAUGHTER
6        7       WIFE
7        7   DAUGHTER
8        7       WIFE

>>> count = pd.DataFrame({'count' : d.groupby(['entries','gender']).size()}).reset_index()

>>> print(count)
   entries     gender  count
0        5   DAUGHTER      1
1        5  PRINCIPLE      1
2        5       WIFE      1
3        6   DAUGHTER      1
4        6  PRINCIPLE      2
5        7   DAUGHTER      1
6        7       WIFE      2
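
The question asks for PRINCIPLE, WIFE and DAUGHTER as headings with the counts underneath. One possible way to get that layout is `pd.crosstab`, sketched below on the same example data. This is an alternative to the groupby approach above, not part of the original answer, and the fixed column order is an assumption taken from the question.

```python
import pandas as pd

# Same example data as printed above.
d = pd.DataFrame({
    "entries": [5, 5, 5, 6, 6, 6, 7, 7, 7],
    "gender": ["PRINCIPLE", "WIFE", "DAUGHTER",
               "PRINCIPLE", "PRINCIPLE", "DAUGHTER",
               "WIFE", "DAUGHTER", "WIFE"],
})

# One row per entry, one column per word; missing combinations become 0.
wide = pd.crosstab(d["entries"], d["gender"])

# Order the columns as requested in the question.
wide = wide.reindex(columns=["PRINCIPLE", "WIFE", "DAUGHTER"], fill_value=0)
print(wide)
```

Unlike the long format above, this table shows a zero where a word does not occur for an entry, for example WIFE for entry 6.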
