Using indices with multiple values, how to get the smallest one
I have an index to choose elements from one array. But sometimes the index might have repeated entries. In that case I would like to choose the corresponding smaller value. Is it possible?
import numpy as np

index = [0,3,5,5]
dist = [1,1,1,3]
arr = np.zeros(6)
arr[index] = dist  # for the repeated index 5, the last assigned value (3) wins
print(arr)
What I get:
[ 1. 0. 0. 1. 0. 3.]
What I would like to get:
[ 1. 0. 0. 1. 0. 1.]
Addendum
Actually I have a third array with the (vector) values to be inserted. So the problem is to insert values from values into arr at positions index, as in the following. However, when multiple values have the same index, I want to choose the values corresponding to the minimum dist.
import numpy as np

index = [0,3,5,5]
dist = [1,1,1,3]
values = np.arange(8).reshape(4,2)
arr = np.zeros((6,2))
arr[index] = values  # row 5 ends up with the last row of values, not the one with minimum dist
print(arr)
I get:
[[ 0. 1.]
[ 0. 0.]
[ 0. 0.]
[ 2. 3.]
[ 0. 0.]
[ 6. 7.]]
I would like to get:
[[ 0. 1.]
[ 0. 0.]
[ 0. 0.]
[ 2. 3.]
[ 0. 0.]
[ 4. 5.]]
Use groupby in pandas:
import numpy as np
import pandas as pd

index = [0,3,5,5]
dist = [1,1,1,3]
# group dist by index and keep the smallest value in each group
s = pd.Series(dist).groupby(index).min()
arr = np.zeros(6)
arr[s.index] = s.values
print(arr)
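For the one-dimensional case, a NumPy-only alternative might be the unbuffered ufunc method np.minimum.at. The following is only a sketch, not part of the original answer. It assumes that dist never contains infinity, because infinity is used here as a placeholder for slots that were never written.

```python
import numpy as np

index = np.array([0, 3, 5, 5])
dist = np.array([1, 1, 1, 3])

# Start from +inf so that any real distance wins the first comparison.
arr = np.full(6, np.inf)

# Unbuffered in-place minimum: every occurrence of a repeated index is applied.
np.minimum.at(arr, index, dist)

# Slots that were never written still hold +inf; reset them to 0
# (assumption: dist itself never contains inf).
arr[np.isinf(arr)] = 0.0

print(arr)
```

This only handles the scalar dist array. It does not select rows from a separate values array as asked in the addendum.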
If index is sorted, then itertools.groupby could be used to group that list.
import itertools
# group (index, dist) pairs by index, keeping the smallest dist per group
np.array([(k, min(d for _, d in g)) for k, g in
          itertools.groupby(zip(index, dist), lambda x: x[0])])
produces
array([[0, 1],
[3, 1],
[5, 1]])
The itertools.groupby version is about 8x slower than the version using np.unique. So for N=1000 it is similar to the Pandas version (I'm guessing, since something is screwy with my Pandas import). For larger N the Pandas version is better. It looks like the Pandas approach has a substantial startup cost, which limits its speed for small N.
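The np.unique version referred to in these timings is not shown in this thread. As a rough sketch of what such an approach might look like (an assumption, not the code that was timed), the entries can be sorted by index and then by dist, after which np.unique with return_index picks the first, and therefore smallest-dist, entry for each index. The names order, sorted_index, uniq, first and winners are illustrative only. This variant also covers the addendum, where rows of values are inserted.

```python
import numpy as np

index = np.array([0, 3, 5, 5])
dist = np.array([1, 1, 1, 3])
values = np.arange(8).reshape(4, 2)

# Sort by index, breaking ties by dist (in lexsort the last key is the primary key).
order = np.lexsort((dist, index))
sorted_index = index[order]

# return_index gives the first occurrence of each unique index; after the
# sort above, that is the entry with the smallest dist for that index.
uniq, first = np.unique(sorted_index, return_index=True)
winners = order[first]  # positions in the original arrays

arr = np.zeros((6, 2))
arr[uniq] = values[winners]

print(arr)
```

Because the sort is done explicitly, this sketch does not require index to be sorted beforehand.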