Finding average of every column from CSV file using Python?

I have a CSV file with several columns and several rows. Please see the picture [1]. The picture shows only the first two baskets, but the original CSV file has hundreds of them. [1]: https://i.stack.imgur.com/R2ZTo.png

I would like to calculate the average for every fruit in every basket using Python. Here is my code, but it doesn't seem to work as it should. Are there better ideas? I have also tried to fix this by importing and using numpy, but I didn't succeed.

I would appreciate any help or suggestions. I'm totally new to this.

import csv
from operator import itemgetter


fileLineList = []
averageFruitsDict = {} # Creating an empty dictionary here.

with open('Fruits.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        fileLineList.append(row)

for column in fileLineList:
    highest = 0
    lowest = 0
    total = 0
    average = 0
    for column in row:
        if column.isdigit():
            column = int(column)
            if column > highest:
                highest = column
            if column < lowest or lowest == 0:
                lowest = column
            total += column    
    average = total / 3
  
    averageFruitsDict[row[0]] = [highest, lowest, round(average)]

averageFruitsList = []


for key, value in averageFruitsDict.items():
    averageFruitsList.append([key, value[2]])


print('\nFruits in Baskets\n')
print(averageFruitsList)

So now I'm trying this code:

import pandas as pd

fruits = pd.read_csv('fruits.csv', sep=';')
print(list(fruits.columns))
fruits['Unnamed: 0'].fillna(method='ffill', inplace = True)
fruits.groupby('Unnamed: 0').mean()
fruits.groupby('Bananas').mean()
fruits.groupby('Apples').mean()
fruits.groupby('Oranges').mean()
fruits.to_csv('results.csv', index=False)

It creates a new CSV file for me and it looks correct. I don't get any errors, but I can't make it calculate the mean of every fruit for every basket. Thankful for all help!

Using the image you posted, I created an identical test CSV called fruit and was able to put together this quick solution using pandas.

import pandas as pd
fruit = pd.read_csv('fruit.csv')

The unnamed column contains the basket numbers with NaNs in between, so we fill each NaN with the preceding value. We can then group by the basket number (using the 'Unnamed: 0' column) and apply the mean to all other columns.

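# Forward-fill the basket numbers; assigning the result back avoids the
# deprecated fillna(method=...) form and in-place edits on a selected column.
fruit['Unnamed: 0'] = fruit['Unnamed: 0'].ffill()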

fruit.groupby('Unnamed: 0').mean()
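
To try the two steps above without the original file, here is a minimal, self-contained sketch. The CSV text is a hypothetical stand-in for the file in the question: the layout (an unnamed first column holding the basket number only on the first row of each basket) is assumed from the description, and all the numbers are invented.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV in the question: the basket number appears
# only on the first row of each basket, and all values are invented.
csv_text = """,Bananas,Apples,Oranges
1,10,20,30
,20,30,40
,30,40,50
2,1,2,3
,3,4,5
"""

fruit = pd.read_csv(io.StringIO(csv_text))

# The header of the first column is empty, so pandas names it 'Unnamed: 0'.
# Forward-fill copies each basket number down into the empty cells below it.
fruit['Unnamed: 0'] = fruit['Unnamed: 0'].ffill()

# One row per basket, one mean per fruit column.
basket_means = fruit.groupby('Unnamed: 0').mean()
print(basket_means)
```

With this sample data, basket 1 averages 20, 30 and 40 for Bananas, Apples and Oranges, and basket 2 averages 2, 3 and 4.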

This gets you your desired output of a fruit average for each basket (please note that I made up values for basket 3).
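
If the averages should also end up in a file, as the second attempt in the question tries with to_csv, the grouped result probably needs to be stored in a variable first, because groupby(...).mean() returns a new DataFrame and leaves the original unchanged. The sketch below assumes a semicolon-separated file like the one read in the question. The sample values, the 'Basket' index name and the temporary output path are all hypothetical.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical semicolon-separated data; all values are invented.
csv_text = """;Bananas;Apples;Oranges
1;10;20;30
;20;30;40
2;1;2;3
;3;4;5
"""

fruits = pd.read_csv(io.StringIO(csv_text), sep=';')
fruits['Unnamed: 0'] = fruits['Unnamed: 0'].ffill()

# groupby(...).mean() returns a new DataFrame and leaves `fruits` unchanged,
# so the result has to be kept in a variable before it can be saved.
basket_means = fruits.groupby('Unnamed: 0').mean()
basket_means.index.name = 'Basket'  # hypothetical, friendlier column header

# Write the averages, not the original table. The index holds the basket
# numbers, so it is kept in the output (no index=False here).
out_path = os.path.join(tempfile.mkdtemp(), 'results.csv')
basket_means.to_csv(out_path, sep=';')

# Read the file back to check what was written.
saved = pd.read_csv(out_path, sep=';')
print(saved)
```

Calling fruits.to_csv(...) directly, as in the question, writes the forward-filled input table instead, which would explain why the output file looks plausible but contains no averages.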
