
Count the number of occurrences in a CSV file

I want to make a bar chart showing the number of occurrences of each room stored in a CSV file. The number of rooms is not known in advance.

This is the data stored in the CSV file:

[image: CSV data]

This is the type of graph I want to show, but with the code I currently have, it doesn't show the number of occurrences.

[image: graph]

What can I do to solve this problem?

This is my code so far:

with open('calls.csv') as File:
    plots = csv.reader(File, delimiter = ',')

    for row in plots:
        if (row[0] not in x):
            x.append(row[0])
    numberOfRooms = len(x)

    for i in range(numberOfRooms):
        occurence = 0
        for row in plots:
            if(x[i] == row[0]):
                occurence += 1
        y.append(occurence)

plt.bar(x,y, width = 0.7, label="Number of calls")
plt.xlabel('room')
plt.ylabel('number of calls')
plt.title('Number of calls per room')
plt.legend()
plt.show()

Since you tagged Pandas, you can do:

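import pandas as pd
import matplotlib.pyplot as plt

# read dataframe from csv
# (read_csv treats the first row as a header by default; pass header=None if the file has none)
df = pd.read_csv('calls.csv')

# count the values in the first column
counts = df.iloc[:, 0].value_counts()

# plot the counts
counts.plot.bar()
plt.show()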

That said, you can certainly use the csv package, but you should probably use a different data structure, e.g. a dictionary:

import csv
import matplotlib.pyplot as plt

counts = {}

with open('calls.csv') as File:
    data = csv.reader(File, delimiter=',')

    # loop through the rows once, counting as we go
    for row in data:
        room = row[0]

        # initialize the count the first time a room is seen
        if room not in counts:
            counts[room] = 0

        # increase the count
        counts[room] += 1

# plot the counts
plt.bar(list(counts.keys()), list(counts.values()))
plt.show()
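
As a side note on why the nested loops in the question appear to produce no counts: `csv.reader` returns a one-pass iterator, so once the first `for row in plots` loop has consumed every row, later loops over the same reader see nothing. A minimal sketch of that behavior, using `io.StringIO` and made-up room names in place of `calls.csv` (both are assumptions for illustration only):

```python
import csv
import io

# Hypothetical sample data standing in for calls.csv
# (one call per line, room name in the first column).
sample = "kitchen\nbedroom\nkitchen\n"

reader = csv.reader(io.StringIO(sample))

first_pass = [row[0] for row in reader]   # consumes every row
second_pass = [row[0] for row in reader]  # the reader is already exhausted

print(first_pass)
print(second_pass)

# Materializing the rows in a list allows repeated iteration.
rows = list(csv.reader(io.StringIO(sample)))
kitchen_calls = sum(1 for row in rows if row[0] == "kitchen")
print(kitchen_calls)
```

Reading the rows into a list once, or counting in a single pass as in the dictionary version above, avoids the problem.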

For a task like this, using only the built-ins, I'd use a dictionary to accumulate the data:

import csv
import matplotlib.pyplot as plt

data = {}
with open('calls.csv') as File:
    plots = csv.reader(File, delimiter=',')
    for row in plots:
        # increment the counter of occurrences;
        # if the key is not in the dictionary yet, .get() returns 0 as a default
        data[row[0]] = data.get(row[0], 0) + 1
# at this point we have processed the whole file, so let's prepare the data for plotting
# we take the keys of the dictionary and sort them for the x axis
x = list(data.keys())
x.sort()
# then we take the values, the counts of occurrences, in the same order
y = [data[i] for i in x]
# then just plot it
plt.bar(x, y, width=0.7, label="Number of calls")
plt.xlabel('room')
plt.ylabel('number of calls')
plt.title('Number of calls per room')
plt.legend()
plt.show()
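
The same accumulation can also be written with `collections.Counter` from the standard library, which is a dict subclass built for tallying. A minimal sketch, where the in-memory sample data and the room names are hypothetical stand-ins for `calls.csv`:

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for calls.csv
# (one call per line, room name in the first column).
sample = "kitchen\nbedroom\nkitchen\nhall\n"

with io.StringIO(sample) as infile:
    # Count the first column of every non-empty row in a single pass.
    counts = Counter(row[0] for row in csv.reader(infile) if row)

# Sorted room names for the x axis, matching counts for the y axis.
x = sorted(counts)
y = [counts[room] for room in x]

print(x)
print(y)
```

`x` and `y` can then be passed to `plt.bar(x, y)` just as in the code above. The `if row` guard skips any blank lines in the file.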

Without pandas, you can save the number of calls per room in a dict and use it to plot:

import csv
import matplotlib.pyplot as plt

rooms = dict()
with open("calls.csv") as infile:
    csvreader = csv.reader(infile)
    
    for row in csvreader:
        if row[0] not in rooms:
            rooms[row[0]] = 0
        rooms[row[0]] += 1

plt.bar(rooms.keys(), rooms.values(), width=0.7)
plt.xlabel("room")
plt.ylabel("number of calls")
plt.title("Number of calls per room")
plt.show()
