
NumPy array mapping and taking the average

I have three arrays:

import numpy as np
value = np.array ([1, 3, 3, 5, 5, 7, 3])
index = np.array ([1, 1, 3, 3, 6, 6, 6])
data  = np.array ([1, 2, 3, 4, 5, 6])

The arrays "index" and "value" have the same size, and I want to group the items in "value" by their key in "index" and take the average of each group. For example, the first two items in "value" ([1, 3, ...]) share the key 1 in "index", so the corresponding entry in the final array is the mean of the 1st and 2nd items in "value": (1 + 3) / 2 = 2.

The final array is:

[2, nan, 4, nan, nan, 5]

The first value is the average of the 1st and 2nd items of "value".
The second value is nan because the key 2 does not appear in "index".
The third value is the average of the 3rd and 4th items of "value", and so on.

Thanks for your help!!!

Regards, Roy

>>> [value[index==i].mean() for i in data]
[2.0, nan, 4.0, nan, nan, 5.0]

Maybe you would like to use numpy.bincount()?

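import numpy as np
value = np.array([1, 3, 3, 5, 5, 7, 3])
index = np.array([1, 1, 3, 3, 6, 6, 6])
# Weighted counts give per-key sums; plain counts give per-key sizes.
# Position 0 corresponds to key 0, so slice with [1:] to match keys 1..6.
np.bincount(index, value) / np.bincount(index)
# array([ NaN,   2.,  NaN,   4.,  NaN,  NaN,   5.])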
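If the result should line up with "data" rather than with 0..index.max(), something like the following might work. It is only a sketch and assumes the keys are non-negative integers; the names size, sums, counts, means and result are mine, not from the question.

```python
import numpy as np

value = np.array([1, 3, 3, 5, 5, 7, 3])
index = np.array([1, 1, 3, 3, 6, 6, 6])
data = np.array([1, 2, 3, 4, 5, 6])

# minlength guarantees a slot for every key in data,
# even keys larger than index.max()
size = data.max() + 1
sums = np.bincount(index, weights=value, minlength=size)
counts = np.bincount(index, minlength=size)

# Silence the 0/0 warning; keys with no items become nan
with np.errstate(divide="ignore", invalid="ignore"):
    means = sums / counts

# Pick out the keys listed in data, in that order
result = means[data]
print(result)
```

Indexing with means[data] also drops the unused slot for key 0.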

Is this the general idea you are looking for?

import numpy as np
value = np.array ([1, 3, 3, 5, 5, 7, 3])
index = np.array ([1, 1, 3, 3, 6, 6, 6])
data  = np.array ([1, 2, 3, 4, 5, 6])

answer = np.array(data, dtype=float)
for i, e in enumerate(data):
    idx = np.where(index==e)[0]
    val = value[idx]
    answer[i] = np.mean(val)

print(answer)  # [  2.  nan   4.  nan  nan   5.]

If your data array is very large, there may be better solutions.
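One possible loop-free variant, as a rough sketch: compress the keys with np.unique, average each group with np.bincount, then look up each entry of "data" with np.searchsorted. The helper names (keys, inverse, group_means, pos, found) are my own, and I am assuming "index" is one-dimensional.

```python
import numpy as np

value = np.array([1, 3, 3, 5, 5, 7, 3], dtype=float)
index = np.array([1, 1, 3, 3, 6, 6, 6])
data = np.array([1, 2, 3, 4, 5, 6])

# Compress the keys to 0..k-1 so bincount works for any key values
keys, inverse = np.unique(index, return_inverse=True)
group_means = np.bincount(inverse, weights=value) / np.bincount(inverse)

# Find where each entry of data would sit among the sorted keys
pos = np.searchsorted(keys, data)
pos = np.clip(pos, 0, len(keys) - 1)
found = keys[pos] == data  # True only where the key really occurs

# Start from all-nan and fill in the groups that exist
answer = np.full(data.shape, np.nan)
answer[found] = group_means[pos[found]]
print(answer)
```

Because every group produced by np.unique has at least one member, the division never hits 0/0, so no warning is raised.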

After searching, I found that numpy.histogram can be used to handle a huge array:

value = np.array([1, 3, 3, 5, 5, 7, 3], dtype='float')
index = np.array([1, 1, 3, 3, 6, 6, 6], dtype='float')
data = np.array([1, 2, 3, 4, 5, 6])

sums = np.histogram(index, bins=np.arange(index.min(), index.max()+2), weights=value)[0]
counter = np.histogram(index, bins=np.arange(index.min(), index.max()+2))[0]

sums / counter

array([ 2., NaN, 4., NaN, NaN, 5.])
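Note that the bins above come from index.min() and index.max(), so they only line up with "data" when the two ranges happen to coincide. A variant that ties the bin edges to "data" might look like this. It is a sketch that assumes "data" is sorted with unit spacing; edges, counts and hist_means are hypothetical names.

```python
import numpy as np

value = np.array([1, 3, 3, 5, 5, 7, 3], dtype=float)
index = np.array([1, 1, 3, 3, 6, 6, 6], dtype=float)
data = np.array([1, 2, 3, 4, 5, 6])

# One bin per key k in data, spanning k - 0.5 to k + 0.5
edges = np.append(data - 0.5, data[-1] + 0.5)
sums, _ = np.histogram(index, bins=edges, weights=value)
counts, _ = np.histogram(index, bins=edges)

# Empty bins give 0/0, which becomes nan; suppress the warning
with np.errstate(invalid="ignore"):
    hist_means = sums / counts
print(hist_means)
```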
