How do I create a function to find average above median?

I want to find the average of all scores above the median and the average of all scores below it (not including the median itself), but I have no idea how to go about doing this.

import collections

def main():
    names = ["gymnastics_school", "participant_name", "all_around_points_earned"]
    Data = collections.namedtuple("Data", names)
    data = []
    values =[]

    with open('state_meet.txt','r') as f:   
        for line in f:
            line = line.strip()
            items = line.split(',')
            items[2] = float(items[2])
            data.append(Data(*items))        
            values.append(items[2])
    print("summary of data:")



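    sorted_data = sorted(values)
    mid = len(sorted_data) // 2
    if len(sorted_data) % 2 == 0:
        # Even count: the median is the mean of the two middle values.
        # Use true division; // would floor the result.
        median_val = (sorted_data[mid - 1] + sorted_data[mid]) / 2
    else:
        # Odd count: the median is the single middle value.
        median_val = sorted_data[mid]

    print("   median score", median_val)   # median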

We now have statistics as part of the standard library:

import statistics

nums = list(range(10))
med = statistics.median(nums)

hi_avg = statistics.mean(i for i in nums if i > med)
lo_avg = statistics.mean(i for i in nums if i < med)
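
Since the question asks for a function, the same idea can be wrapped in one. This is only a sketch: the name `averages_around_median` is made up here, and returning `None` for an empty side is an assumption, chosen because `statistics.mean` raises `StatisticsError` on empty input.

```python
import statistics


def averages_around_median(values):
    """Return (mean below median, mean above median), excluding the median.

    Hypothetical helper: a side with no values yields None (an assumption).
    """
    med = statistics.median(values)
    below = [v for v in values if v < med]
    above = [v for v in values if v > med]
    lo_avg = statistics.mean(below) if below else None
    hi_avg = statistics.mean(above) if above else None
    return lo_avg, hi_avg


lo_avg, hi_avg = averages_around_median([1, 5, 66, 7, 5])
print("below:", lo_avg, "above:", hi_avg)
```

Values equal to the median are dropped from both sides, so repeated median scores count toward neither average.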

You can use the built-in functions filter and sum. For example:

# filter() returns an iterator in Python 3, so build a list before calling len()
above_med = list(filter(lambda x: x > median_val, values))
print(" average of scores above median ", sum(above_med) / len(above_med))
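
The below-median average can be computed the same way by flipping the comparison. A minimal sketch, assuming Python 3; the sample `values` and `median_val` are made up for illustration, and in the question's program they would come from the file and the median calculation.

```python
values = [9.0, 8.5, 7.5, 9.5, 8.0]   # made-up scores
median_val = 8.5                      # median of the scores above

# list() materializes the filter iterator so len() can be used on it
above_med = list(filter(lambda x: x > median_val, values))
below_med = list(filter(lambda x: x < median_val, values))

avg_above = sum(above_med) / len(above_med)
avg_below = sum(below_med) / len(below_med)

print(" average of scores above median ", avg_above)
print(" average of scores below median ", avg_below)
```

If every score equals the median, both lists are empty and the divisions raise `ZeroDivisionError`, so real data may need a guard.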

Edited:

As suggested by @ChrisP, you can also use the standard-library statistics module, available since Python 3.4.

Here is an example using NumPy:

import numpy as np

data_array = np.array(data)

med = np.median(data)

ave_above_med = data_array[data_array > med].mean()

So the function would be:

import numpy as np

def average_above_med(data):
    data_array = np.array(data)
    med = np.median(data)
    response = data_array[data_array > med].mean()

    return response

This is a test of that:

test_data = [1, 5, 66, 7, 5]
print(average_above_med(test_data))

which displays:

36.5
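
The question also asks for the average below the median. One possible companion, sketched under the assumption that the input is a flat sequence of numbers with at least one value on each side of the median; the name `averages_around_med` is invented here and is not part of the answer above.

```python
import numpy as np


def averages_around_med(data):
    # Hypothetical helper: returns (mean below median, mean above median).
    data_array = np.asarray(data, dtype=float)
    med = np.median(data_array)
    # Boolean masks select the values strictly below / above the median.
    below = data_array[data_array < med].mean()
    above = data_array[data_array > med].mean()
    return below, above


below, above = averages_around_med([1, 5, 66, 7, 5])
print(below, above)
```

With an empty side, `.mean()` on the empty selection gives `nan` along with a runtime warning, so inputs where all values equal the median would need separate handling.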

Hope this helps.
