Using simple code to get the average (in Python) of an entire column in a csv file

I've seen similar questions, but never one that gives a simple, straightforward, Pythonic answer.

I'm simply trying to get the average of the "high" column in a csv file.

import csv
import numpy as np    


with open('2010-Jan-June.csv', 'r', encoding='utf8', newline='') as f:
    highs = []
    for row in csv.DictReader(f, delimiter=','):
        highs.append(int(row['high']))  # collect each value so the list is not empty
print(sum(highs)/len(highs))

My csv looks like this:

date,high,low,precip
1-Jan,43,41,0
2-Jan,50,25,0
3-Jan,51,25,0
4-Jan,44,25,0
5-Jan,36,21,0
6-Jan,39,20,0
7-Jan,47,21,0.04
8-Jan,30,14,0
9-Jan,30,12,0

Using pandas:

import pandas as pd

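# read only the 'high' column, then select it as a Series and average it
# (the squeeze=True argument of read_csv was removed in pandas 2.0)
avg = pd.read_csv(r'/path/to/2010-Jan-June.csv', usecols=['high'])['high'].mean()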

Note that this is entirely possible in plain Python:

import csv
import statistics as stats

with open('2010-Jan-June.csv') as f:
    # csv yields strings, so convert each value to a number before averaging
    avg = stats.mean(float(row['high']) for row in csv.DictReader(f, delimiter=','))

print(avg)
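
The same idea can be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: `column_mean` is a hypothetical name, and the sample rows from the question are read from an in-memory string instead of the file so that the snippet runs on its own.

```python
import csv
import io
import statistics

# Sample rows copied from the question, held in memory for illustration.
SAMPLE = """date,high,low,precip
1-Jan,43,41,0
2-Jan,50,25,0
3-Jan,51,25,0
4-Jan,44,25,0
5-Jan,36,21,0
6-Jan,39,20,0
7-Jan,47,21,0.04
8-Jan,30,14,0
9-Jan,30,12,0
"""


def column_mean(lines, column):
    """Return the mean of one named column (hypothetical helper).

    `lines` is any iterable of CSV lines, such as an open file object.
    """
    reader = csv.DictReader(lines)
    # csv yields strings, so convert each cell to a number before averaging.
    return statistics.mean(float(row[column]) for row in reader)


avg_high = column_mean(io.StringIO(SAMPLE), "high")
print(avg_high)  # roughly 41.11 for the sample above
```

Passing an open file (for example the result of `open('2010-Jan-June.csv', newline='')`) in place of the `io.StringIO` object should work the same way, since both are iterables of lines.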

Since you imported numpy, you can use that, almost as easily as pandas:

Reading from a pasted copy of your sample:

In [36]: txt="""date,high,low,precip
    ...: 1-Jan,43,41,0
    ...: 2-Jan,50,25,0
    ...: 3-Jan,51,25,0
    ...: 4-Jan,44,25,0
    ...: 5-Jan,36,21,0
    ...: 6-Jan,39,20,0
    ...: 7-Jan,47,21,0.04
    ...: 8-Jan,30,14,0
    ...: 9-Jan,30,12,0"""

Python 3 with numpy 1.14 likes to have the encoding parameter:

In [38]: data = np.genfromtxt(txt.splitlines(),delimiter=',',dtype=None,names=True,
    ...: encoding=None)
In [39]: data
Out[39]: 
array([('1-Jan', 43, 41, 0.  ), ('2-Jan', 50, 25, 0.  ),
       ('3-Jan', 51, 25, 0.  ), ('4-Jan', 44, 25, 0.  ),
       ('5-Jan', 36, 21, 0.  ), ('6-Jan', 39, 20, 0.  ),
       ('7-Jan', 47, 21, 0.04), ('8-Jan', 30, 14, 0.  ),
       ('9-Jan', 30, 12, 0.  )],
      dtype=[('date', '<U5'), ('high', '<i8'), ('low', '<i8'), ('precip', '<f8')])

The result is a structured array, from which it is easy to pick the high field:

In [40]: data['high']
Out[40]: array([43, 50, 51, 44, 36, 39, 47, 30, 30])
In [41]: data['high'].mean()
Out[41]: 41.111111111111114

Or in one line, loading just one column:

In [44]: np.genfromtxt(txt.splitlines(),delimiter=',',skip_header=1,usecols=[1]).mean()
Out[44]: 41.111111111111114

Here is my attempt at a Pythonic answer using just the csv library:

import csv
with open('names.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    # line_num counts the header line too, so subtract 1 to get the number of data rows
    print(sum(float(d['high']) for d in reader) / (reader.line_num - 1))

This raises a ZeroDivisionError if the file contains a header line but no data rows.
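
One way to avoid that failure is to count the rows explicitly and return a sentinel when there are none. This is a minimal sketch under that assumption: `safe_column_mean` and the choice of `None` for an empty column are illustrative and not from the answer above.

```python
import csv
import io


def safe_column_mean(lines, column):
    """Mean of a named CSV column, or None when there are no data rows."""
    total = 0.0
    count = 0
    for row in csv.DictReader(lines):
        total += float(row[column])
        count += 1
    # Counting rows directly avoids relying on reader.line_num, which counts
    # physical lines (including the header) rather than records.
    return total / count if count else None


header_only = io.StringIO("date,high,low,precip\n")
two_rows = io.StringIO("date,high,low,precip\n1-Jan,43,41,0\n2-Jan,50,25,0\n")

print(safe_column_mean(header_only, "high"))  # None
print(safe_column_mean(two_rows, "high"))     # 46.5
```

Returning `None` leaves it to the caller to decide how to report an empty column; raising a descriptive exception would be an equally reasonable choice.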

