Fill an array using the values of another array as the indices. If an index is repeated, prioritize according to a parallel array
Description
I have an array a with N integer elements that range from 0 to M-1. I have another array b with N positive numbers.
Then, I want to create an array c with M elements. The i-th element of c should be the index of the element of a that has a value of i. If more than one element of a has that value, take the one with the higher value in b. If no element of a has a value of i, the i-th element of c should be -1.
Example
N = 5, M = 3
a = [2, 1, 1, 2, 2]
b = [1, 3, 5, 7, 3]
Then, c should be:
c = [-1, 2, 3]
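To make the specification concrete, here is a minimal brute-force sketch that transcribes the definition slot by slot. The function name fill_by_definition is hypothetical, and the tie rule (equal values in b resolved in favor of the larger index) is an assumption, since the description does not say what happens on ties. It runs in O(N*M), so it is only meant as a reference for checking faster versions on small inputs.

```python
def fill_by_definition(a, b, M):
    """Brute-force O(N*M) transcription of the problem statement."""
    c = []
    for i in range(M):
        # All positions j where a[j] == i.
        candidates = [j for j, value in enumerate(a) if value == i]
        if candidates:
            # Highest b wins; on equal b the larger index wins (assumed tie rule).
            c.append(max(candidates, key=lambda j: (b[j], j)))
        else:
            # No element of a points to slot i.
            c.append(-1)
    return c


print(fill_by_definition([2, 1, 1, 2, 2], [1, 3, 5, 7, 3], 3))  # [-1, 2, 3]
```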
My Solution 1
A possible approach would be to initialize an array d that stores the current maximum for each slot of c, and then loop through a and b, updating the maximums.
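import numpy as np
c = np.full(M, -1)  # -1 marks slots that no element of a points to
d = np.zeros(M)     # largest value of b seen so far for each slot (b is positive)
for i, (idx, val) in enumerate(zip(a, b)):
    if d[idx] <= val:  # on equal values, the later index wins
        c[idx] = i
        d[idx] = val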
This solution is O(N) in time but requires iterating over the arrays in Python, making it slow.
My Solution 2
Another solution would be to sort a using b as the key. Then, we can just assign the indices of a to c (the elements with the highest b come last, so they are the ones that remain in c).
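sort_idx = np.argsort(b)   # positions ordered by increasing b
a_idx = np.arange(len(a))
a = a[sort_idx]
a_idx = a_idx[sort_idx]
c = np.full(M, -1)         # -1 marks slots that no element of a points to
c[a] = a_idx               # relies on the last write winning for repeated slots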
This solution does not require Python loops but requires sorting b, making it O(N*log(N)).
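The sort-based idea can also be written so that each slot of c is written at most once. As far as I know, NumPy's indexing notes caution that the outcome of assigning to a repeated index is not guaranteed in general, so this may be the safer form. The sketch below is a variant under stated assumptions, not the code from the question: the function name fill_with_lexsort is hypothetical, a is assumed to hold integers in the range 0 to M-1, and ties in b go to the later index because np.lexsort is a stable sort. It is still O(N*log(N)).

```python
import numpy as np


def fill_with_lexsort(a, b, M):
    """O(N log N) variant that writes each slot of c at most once."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Sort positions by a first, then by b within equal a (the last key is primary).
    order = np.lexsort((b, a))
    a_sorted = a[order]
    # Mark the last position of every run of equal a values: that is the max-b entry.
    is_last = np.ones(len(a_sorted), dtype=bool)
    is_last[:-1] = a_sorted[1:] != a_sorted[:-1]
    c = np.full(M, -1, dtype=np.intp)
    c[a_sorted[is_last]] = order[is_last]
    return c


print(fill_with_lexsort([2, 1, 1, 2, 2], [1, 3, 5, 7, 3], 3))  # [-1  2  3]
```

Unlike the snippet in the question, this version leaves a untouched and returns an integer array, which is convenient if c is later used for indexing.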
Ideal Solution
Is there a solution to this problem in linear time without having to loop over the array in Python?
AFAIK, this cannot currently be implemented in O(n) with Numpy alone (mainly because the index table is both read and written depending on the values of another array). Note that np.argsort(b) can theoretically be implemented in O(n) using a radix sort, but such a sort is not implemented yet in Numpy (and it would not be much faster in practice due to the bad cache locality of the algorithm on big arrays).
One solution is to use Numba to speed up your algorithmically efficient solution. Numba uses a JIT compiler to speed up loops. Here is an example (working with np.int32 types):
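import numpy as np
import numba as nb

# M is passed explicitly so that the compiled function does not depend on a global.
@nb.njit('int32[:](int32[:], int32[:], int64)')
def compute(a, b, M):
    c = np.full(M, -1, dtype=np.int32)   # -1 marks slots that no element of a points to
    d = np.zeros(M, dtype=np.int32)      # largest value of b seen so far for each slot
    for i, (idx, val) in enumerate(zip(a, b)):
        if d[idx] <= val:
            c[idx] = i
            d[idx] = val
    return c

a = np.array([2, 1, 1, 2, 2], dtype=np.int32)
b = np.array([1, 3, 5, 7, 3], dtype=np.int32)
c = compute(a, b, 3)  # M = 3 in the example above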