Python counter for a CSV file

I'm new to Python, and I need some help getting the results of a survey. I have a CSV file that looks like this:

Person, Gender, Q1, Q2, Q3
professor, male, agree, not agree, agree
professor, male, agree, agree, agree
professor, female, neutral, not agree, agree
Professor, female, agree, agree, agree
student, female, agree, not agree, not agree
student, female, no answer, not agree, agree
student, male, no answer, no answer, agree

I want to count how often each answer occurs per person and gender. For example, for Q1: (professor, male: agree 2), (professor, female: agree 1; neutral 1), and so on. I have tried this so far:

import csv
from collections import Counter
with open('survey.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', dialect = csv.excel_tab)
    counts = Counter(map(tuple,reader))
    print [row for row in reader if row]
    print list(csv.reader(csvfile))

But I think that because I have only strings, I do not get any result. Moreover, I still don't know how to get the data by person/gender. Thanks a lot in advance!

Using pandas, you could do something like:

import pandas as pd

# skipinitialspace strips the blank after each comma, so the column is 'Gender', not ' Gender'
my_data = pd.read_csv('survey.csv', skipinitialspace=True)

# To summarize the dataframe for everything together:
print(my_data.describe())

# To group by gender, etc.
print(my_data.groupby('Gender').count())

If you don't want to switch to pandas, you need to do a bit of analysis on the rows after you read them. Something like the following (untested). This uses Counter objects, which behave a lot like ordinary dicts except that looking up a key that does not (yet) exist returns 0 rather than raising KeyError, so counter[key] += 1 works even the first time a key is seen.

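import csv
from collections import Counter

counters = []  # one Counter per column
with open('survey.csv', newline='') as csvfile:
    # skipinitialspace drops the blank that follows each comma in the sample file
    reader = csv.reader(csvfile, skipinitialspace=True)
    for row in reader:
        for colno, datum in enumerate(row):
            if colno >= len(counters):  # do we have a counter for this column yet?
                counters.append(Counter())  # if not, add another Counter
            counters[colno][datum] += 1

for counter in counters:
    print(counter)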
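
As a quick check of how Counter treats missing keys, here is a minimal sketch (the variable name answers is just for illustration). Reading a missing key gives 0 without storing anything; the entry only appears once you assign to it or increment it.

```python
from collections import Counter

answers = Counter()
print(answers['agree'])    # prints 0: a missing key reads as zero, no KeyError
print('agree' in answers)  # prints False: the lookup alone did not store the key
answers['agree'] += 1      # the increment is what creates the entry
print(answers)             # prints Counter({'agree': 1})
```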

If the first row of your CSV file holds column headers, you can read it in advance and then use it to annotate the list of counters. I'll leave formatting the contents of the counters prettily to you as an exercise, should raw dumps of Counter objects be deemed too ugly.
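
One possible way to combine the header row with a per-person, per-gender breakdown is sketched below. It is only a sketch under a few assumptions that are not in the original posts: the first two columns are Person and Gender, every remaining column is a question, and 'Professor' and 'professor' should land in the same group (hence the lower-casing). The function name count_answers and the inline SAMPLE text are made up for the example; with a real file you would pass open('survey.csv', newline='') instead of the io.StringIO object.

```python
import csv
import io
from collections import Counter, defaultdict

# Sample rows copied from the question, kept inline so the sketch runs on its own.
SAMPLE = """\
Person, Gender, Q1, Q2, Q3
professor, male, agree, not agree, agree
professor, male, agree, agree, agree
professor, female, neutral, not agree, agree
Professor, female, agree, agree, agree
student, female, agree, not agree, not agree
student, female, no answer, not agree, agree
student, male, no answer, no answer, agree
"""

def count_answers(csvfile):
    """Return {question: {(person, gender): Counter of answers}}."""
    reader = csv.reader(csvfile, skipinitialspace=True)
    header = next(reader)      # first row holds the column names
    questions = header[2:]     # assumption: columns 0 and 1 are Person and Gender
    counts = {q: defaultdict(Counter) for q in questions}
    for row in reader:
        if not row:            # skip blank lines
            continue
        # Assumption: group names are case-insensitive ('Professor' == 'professor').
        group = (row[0].strip().lower(), row[1].strip().lower())
        for question, answer in zip(questions, row[2:]):
            counts[question][group][answer.strip()] += 1
    return counts

counts = count_answers(io.StringIO(SAMPLE))
for question, groups in counts.items():
    for (person, gender), counter in sorted(groups.items()):
        print(question, person, gender, dict(counter))
```

Keying the inner dictionary on a (person, gender) tuple keeps one Counter per group, so each question can be reported separately without a second pass over the file.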
