[英]Numpy concatenate + merge 1D arrays
我需要连接数组,但如果它们重叠,还要将A的结尾与B的开头合并。
[1, 2, 4] + [2, 4, 5] -> [1, 2, 4, 5]
[1, 2, 4] + [2, 5, 4] -> [1, 2, 4, 2, 5, 4]
[1, 2, 4] + [1, 2, 4, 5] -> [1, 2, 4, 5]
注意:必须保留元素的顺序,[4,5]与[5,4]不同。
注2:这个问题也可以这样理解:我们需要A的尽可能短的扩展,使输出以B结尾。
当然我可以迭代第二个数组并比较逐个元素,但我正在寻找一个不错的Numpy解决方案。
最初误解了这个问题。 问题来自我的理解:
Two item suffix of A matches 2 item prefix of B:
[1, 2, 4] +
[2, 4, 5] =>
[1, 2, 4, 5]
No suffix of A matches a prefix of B:
[1, 2, 4] +
[2, 5, 4] ->
[1, 2, 4, 2, 5, 4]
然后我们可以使用这个非常低效的函数:
def merge(A,B):
i = 0
m = 0
# Find largest suffix of A that matches the prefix of B with the same length
while i <= len(A):
if A[-i:] == B[:i] and i > m:
m = i
i += 1
return A + B[m:]
以下是使用NumPy的解决方案。 它并不理想,因为它需要(可能不需要的)排序和迭代。 排序和迭代都应该在一个相对较小的数组(甚至是一个单独的元素)上。
import numpy as np
def merge(left, right):
"""Concatenating two arrays, merging the overlapping end and start of
the left and right array"""
# We can limit the search to the maximum possible overlap between
# the arrays, which is the minimum of the two lengths
l = min(len(left), len(right))
# Find all indices in `right` where the element matches the last element of `left`.
# Need to sort, since the `nonzero` documentation doesn't
# explicitly state whether the returned indices follow the order
# as in `right`
# As long as there are few matches, sorting will not be a showstopper
# Need to reverse the sorted array, to start from the back of the
# right array, work towards the front, until there is a proper match
for i in np.sort(np.nonzero(right[:l] == left[-1])[0])[::-1]:
# Check if the subarrays are equal
if np.all(left[-i-1:] == right[:i+1]):
return np.concatenate([left, right[i+1:]])
# No match
return np.concatenate([left, right])
a = np.array([1, 2, 4])
b = np.array([2, 4, 5])
c = np.array([2, 5, 4])
d = np.array([1, 2, 4, 5])
e = np.array([1, 2, 4, 2])
f = np.array([2, 4, 2, 5])
print(merge(a, b))
print(merge(a, c))
print(merge(a, d))
print(merge(e, b))
print(merge(e, f))
产量
[1 2 4 5]
[1 2 4 2 5 4]
[1 2 4 5]
[1 2 4 2 4 5]
[1 2 4 2 5]
你可以这样做。
def concatenate(a,b):
ret = a.copy()
for element in b:
if not element in ret:
ret.append(element)
return ret
这使订单保持在+ b形式。
我有一个O(n)
解决方案,虽然没有Numpy:
def merge(a, b):
n_a = len(a)
n = min(n_a, len(b))
m = 0
for i in range(1, n + 1):
if b[n - i] == a[n_a - 1 - m]:
m += 1
else:
m = 0
return a + b[m:]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.