Counting occurrences of a specific word in a CSV using PyCharm
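I have three columns in a CSV file. I want to iterate over the "Title" column and count the occurrences of a specific word, so I started coding, but I get an error. The code is: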
import csv
import collections

Title = collections.Counter()
with open('Green Occupations.csv') as input_file:
    for row in csv.reader(input_file, delimiter=';'):
        Title[row[1]] += 1
print 'Number of word "..": %s' % Title['wind']
print Title.most_common()
I get this error:
Title[row[1]] += 1
IndexError: list index out of range
Here is a sample of my data:
+------------+---------------------------------+-------------------------+
| SOC Code | Title | Occupational Category |
+------------+---------------------------------+-------------------------+
| 11-1011.03 | Chief Sustainability Officers | New & Emerging |
| 11-1021.00 | General and Operations Managers | Enhanced Skills |
+------------+---------------------------------+-------------------------+
Any ideas? :)
Try the code below:
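def get_count(title, x=1):
    # x is the zero-based index of the column that holds the titles
    count = 0
    title = title.lower()
    with open('Green Occupations.csv') as f:
        l3 = [[s.strip() for s in line.split(',')] for line in f]
    # skip rows that are too short to have column x
    l4 = [item[x] for item in l3 if len(item) > x]
    for item in l4:
        # compare the first word of each title, ignoring quotes and case
        if item.split(' ')[0].strip('"').lower() == title:
            count += 1
    return count

print(get_count('Industrial'))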
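If your titles are in a different column, replace x with that column's index, e.g. 3. Keep in mind that list indices start at 0, so the second column, where the sample data has Title, is index 1.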
occurrence = get_count(title='Industrial')
# returns the number of titles whose first word is the given title
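A possible variation on the above, offered as a minimal sketch rather than a definitive fix: it lets the csv module split each row, skips rows that are too short to have a title column (rows like that are one plausible cause of the IndexError in the question), and counts every word in the titles with collections.Counter. The function name count_title_words, its parameters, and the inline sample data are hypothetical; the ';' delimiter is taken from the question's code, and the first row is assumed to be a header.

```python
import csv
import io
import re
from collections import Counter


def count_title_words(lines, column=1, delimiter=';'):
    """Count how often each word occurs in one column of delimited text.

    `lines` is any iterable of text lines, e.g. an open file.
    """
    counts = Counter()
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # assume the first row is a header and skip it
    for row in reader:
        # Blank or short rows have no such column; skip them instead of
        # indexing past the end of the list.
        if len(row) <= column:
            continue
        # Lower-case the cell and pull out runs of letters as words.
        counts.update(re.findall(r"[a-z]+", row[column].lower()))
    return counts


# Hypothetical sample mirroring the question's data, including a blank line.
sample = io.StringIO(
    "SOC Code;Title;Occupational Category\n"
    "11-1011.03;Chief Sustainability Officers;New & Emerging\n"
    "\n"
    "11-1021.00;General and Operations Managers;Enhanced Skills\n"
)
word_counts = count_title_words(sample)
print(word_counts["managers"])
print(word_counts.most_common(3))
```

With the real file, passing the handle from open('Green Occupations.csv', newline='') in place of sample should work the same way, assuming the file really is semicolon-delimited.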
Can you use pandas? That would make the job very easy:
import pandas as pd

# Import data from the CSV (the path is taken from the question)
input_file = 'Green Occupations.csv'
df = pd.read_csv(input_file, delimiter=';')
search_word = 'Officer'  # example
# Check whether each title contains the specified word, then count the matching rows
counts = df['Title'].str.contains(search_word).sum()
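One thing to watch with str.contains: it matches substrings and counts rows, not words, so 'Officer' also matches 'Officers', and a title that contains the word twice still counts once. Below is a minimal sketch of the alternatives, assuming pandas is installed. The inline sample data (including one made-up row) and the variable names are purely illustrative.

```python
import io
import re

import pandas as pd

# Hypothetical semicolon-delimited sample standing in for the real file.
# The last row is made up to show a whole-word match.
sample = io.StringIO(
    "SOC Code;Title;Occupational Category\n"
    "11-1011.03;Chief Sustainability Officers;New & Emerging\n"
    "11-1021.00;General and Operations Managers;Enhanced Skills\n"
    "00-0000.00;Example Officer;Made-up row\n"
)
df = pd.read_csv(sample, delimiter=";")

search_word = "officer"

# Rows whose title contains the text anywhere, ignoring case.
rows_with_substring = df["Title"].str.contains(
    search_word, case=False, regex=False, na=False
).sum()

# Rows whose title contains it as a whole word (\b marks a word boundary).
whole_word = r"\b" + re.escape(search_word) + r"\b"
rows_with_word = df["Title"].str.contains(
    whole_word, case=False, regex=True, na=False
).sum()

# Total number of whole-word occurrences across all titles.
total_occurrences = df["Title"].str.count(whole_word, flags=re.IGNORECASE).sum()

print(rows_with_substring, rows_with_word, total_occurrences)
```

On this sample the three numbers come out as 2, 1 and 1: "Officers" matches as a substring but not as a whole word.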