Counting occurrences of a specific word in a CSV using PyCharm

Question

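I have a CSV file with three columns. I want to iterate over the "Title" column and count the occurrences of a specific word, so I started coding, but I get an error. The code is: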

import csv
import collections

Title = collections.Counter()
with open('Green Occupations.csv') as input_file:
    for row in csv.reader(input_file, delimiter=';'):
        Title[row[1]] += 1

print 'Number of word "..": %s' % Title['wind']
print Title.most_common()

I get this error:

Title[row[1]] += 1
IndexError: list index out of range

Here is a sample of the data:

+------------+---------------------------------+-------------------------+
|  SOC Code  |              Title              |  Occupational Category  |
+------------+---------------------------------+-------------------------+
| 11-1011.03 | Chief Sustainability Officers   | New & Emerging          |
| 11-1021.00 | General and Operations Managers | Enhanced Skills         |
+------------+---------------------------------+-------------------------+

Any ideas? :)

Answer 1

Try the code below:

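def get_count(title, x=1):
    # x is the zero-based index of the column that holds the titles
    count = 0
    title = title.lower()
    # Read every line and split it into stripped fields
    with open('Green Occupations.csv') as f:
        l3 = [[s.strip() for s in line.split(',')] for line in f]
    # Keep the title field of each row, skipping rows that are too short
    l4 = [item[x] for item in l3 if len(item) > x]
    for item in l4:
        # Compare the first word of the title, ignoring case and quotes
        if item.split(' ')[0].strip('"').lower() == title:
            count += 1
    return count

print(get_count('Industrial'))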

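Here x is the index of the column that holds the titles, and the list comprehension above builds the list of titles from that column.

If your titles are in the column at index 3, set x to 3. Indices are zero-based, so the Title column in the sample data is index 1.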

occurrence = get_count(title='Industrial')
# returns the number of titles whose first word is the given word
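
As a variation on the approach above, here is a minimal sketch that uses the csv module together with collections.Counter, as the question does, to count every word in the title column rather than only the first one. The IndexError in the question is raised whenever a row has fewer than two fields; a blank line or a delimiter that does not match the file are possible causes, though the question does not confirm either. The function name count_title_words, its parameters, and the in-memory sample are assumptions made for illustration, and the comma delimiter is a guess about the real file.

```python
import collections
import csv
import io


def count_title_words(lines, column=1, delimiter=','):
    """Count every word in one column of CSV input, ignoring case."""
    counts = collections.Counter()
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # skip the header row
    for row in reader:
        # Blank or short rows would raise IndexError on row[column]
        if len(row) <= column:
            continue
        counts.update(row[column].lower().split())
    return counts


# Hypothetical sample standing in for 'Green Occupations.csv'
sample = io.StringIO(
    "SOC Code,Title,Occupational Category\n"
    "11-1011.03,Chief Sustainability Officers,New & Emerging\n"
    "\n"
    "11-1021.00,General and Operations Managers,Enhanced Skills\n"
)
word_counts = count_title_words(sample)
print(word_counts['officers'])
print(word_counts.most_common(3))
```

To run it against the real file, pass an open file object instead of the sample, for example `with open('Green Occupations.csv', newline='') as f: count_title_words(f)`, and adjust delimiter if the file uses ';'.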

Answer 2

Can you use pandas? It would make this very easy:

import pandas as pd

# Path to the CSV file from the question
input_file = 'Green Occupations.csv'

# Import data from the CSV
df = pd.read_csv(input_file, delimiter=';')

search_word = 'Officer'  # example

# Check whether each title contains the specified word, then count the matching rows
counts = df['Title'].str.contains(search_word).sum()
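
Note that str.contains matches substrings and is case-sensitive by default, so 'Officer' also matches 'Officers', and the sum is the number of matching rows rather than the number of occurrences. The sketch below contrasts a substring match with a whole-word match; the in-memory sample and the variable names are assumptions for illustration, not part of the original answer.

```python
import io
import re

import pandas as pd

# Hypothetical sample standing in for 'Green Occupations.csv'
sample = io.StringIO(
    "SOC Code;Title;Occupational Category\n"
    "11-1011.03;Chief Sustainability Officers;New & Emerging\n"
    "11-1021.00;General and Operations Managers;Enhanced Skills\n"
)
df = pd.read_csv(sample, delimiter=';')

search_word = 'officer'

# Substring match, ignoring case: "Officers" counts as a hit
substring_rows = df['Title'].str.contains(
    search_word, case=False, regex=False
).sum()

# Whole-word match: \b word boundaries exclude "Officers"
pattern = r'\b' + re.escape(search_word) + r'\b'
whole_word_rows = df['Title'].str.contains(
    pattern, case=False, regex=True
).sum()

print(substring_rows, whole_word_rows)
```

Which of the two is appropriate depends on whether plural forms such as "Officers" should count as the same word.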
