
Python: returning the average between two values in a dictionary

I have these functions:

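import numpy as np
from collections import Counter

def find_nearest(array, value):
    # Return the element of `array` closest to `value`.
    idx = (np.abs(array - value)).argmin()
    return array[idx]

def df_to_count_dict(df):
    count_dict = Counter(df.values)

    # Collect the integers below the largest key that are missing from the counts.
    holder = []
    for i in range(1, max(count_dict.keys())):
        if i in count_dict:
            continue
        holder.append(i)

    # Give each missing key the count of the nearest key currently in the dictionary.
    for i in holder:
        j = find_nearest(np.array(list(count_dict.keys())), i)
        count_dict.update({i: count_dict[j]})

    return count_dict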

It takes a data series and uses the Counter class from collections to return a dictionary of counts. It also fills in keys that are missing from the dictionary, giving each one the value of the closest existing key.

Now I want to amend this function so that it returns the same object, count_dict, but gives each missing key the average of the values of the two existing keys it falls between.

This is best explained by an example:

Take

test = pd.Series([1,2,3,3,7,7,7,8])

Without the function above we get:

Counter(test.values)
Out[459]: Counter({1: 1, 2: 1, 3: 2, 7: 3, 8: 1})

Using the function we get:

df_to_count_dict(test)
Out[458]: Counter({1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2, 7: 3, 8: 1})

As you can see, it has added keys 4, 5 and 6 with value 2, because 2 is the value of the closest key (the closest key is 3).

What I want it to return instead is the AVERAGE of the values of the closest lower key and the closest upper key. Here the closest lower key is 3, which has value 2, and the closest upper key is 7, which has value 3, so I want the final result to look something like:

df_to_count_dict(test)
Out[458]: Counter({1: 1, 2: 1, 3: 2, 4: 2.5, 5: 2.5, 6: 2.5, 7: 3, 8: 1})

I hope someone can help.

This looks a lot like school work, so you should figure it out yourself. But here is a hint: the query you are being asked to develop finds the mean of the predecessor's count and the successor's count. The predecessor is the largest key smaller than or equal to the input, and the successor is the smallest key larger than the input.
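As a rough illustration of that hint, here is a minimal sketch and not necessarily what the answer had in mind. The name fill_gaps_with_mean is made up for this example, and it assumes the keys are integers. It walks over adjacent pairs of existing keys and gives every integer strictly between them the mean of the two neighbouring counts. Unlike the original function, it does not fill anything below the smallest key, since such a key has no predecessor.

```python
from collections import Counter

def fill_gaps_with_mean(values):
    """Count the values, then give each missing integer key the mean of the
    counts of the nearest existing keys below and above it.

    Hypothetical helper for illustration; assumes integer keys.
    """
    counts = Counter(int(v) for v in values)
    keys = sorted(counts)  # snapshot of the existing keys, in ascending order

    # Each adjacent pair (lo, hi) is a predecessor/successor pair for every
    # integer strictly between them.
    for lo, hi in zip(keys, keys[1:]):
        mean = (counts[lo] + counts[hi]) / 2
        for missing in range(lo + 1, hi):
            counts[missing] = mean
    return counts

result = fill_gaps_with_mean([1, 2, 3, 3, 7, 7, 7, 8])
print(dict(sorted(result.items())))
# {1: 1, 2: 1, 3: 2, 4: 2.5, 5: 2.5, 6: 2.5, 7: 3, 8: 1}
```

Because the keys are sorted once up front, the newly filled keys never influence later gaps, which is the difference from looking up the nearest key in a dictionary that is being modified along the way.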

If you need O(log(n)) complexity, you might look at binary search trees; bintrees is a good package: https://pypi.python.org/pypi/bintrees/2.0.4
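If the set of keys does not change between queries, a sorted list plus the standard library's bisect module gives the same O(log(n)) lookup without a tree. This swaps in bisect for bintrees, so treat it as an alternative sketch rather than the suggested package's API. The function name neighbour_mean and its arguments are assumptions for illustration.

```python
from bisect import bisect_right

def neighbour_mean(sorted_keys, counts, x):
    """Mean of the counts at the predecessor and successor of x.

    Hypothetical helper: `sorted_keys` must be sorted ascending and
    `counts` maps each of those keys to its count.
    """
    i = bisect_right(sorted_keys, x)  # index of the first key greater than x
    if i == 0 or i == len(sorted_keys):
        raise ValueError("x has no predecessor or no successor")
    pred = sorted_keys[i - 1]  # largest key smaller than or equal to x
    succ = sorted_keys[i]      # smallest key larger than x
    return (counts[pred] + counts[succ]) / 2

counts = {1: 1, 2: 1, 3: 2, 7: 3, 8: 1}
keys = sorted(counts)
print(neighbour_mean(keys, counts, 5))  # 2.5
```

Note that with these definitions a key that is already present counts as its own predecessor, so the lookup is only meaningful for the missing keys.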
