繁体   English   中英

Numpy连接+合并1D数组

[英]Numpy concatenate + merge 1D arrays

我需要连接数组,但如果它们重叠,还要将A的结尾与B的开头合并。

[1, 2, 4] + [2, 4, 5] -> [1, 2, 4, 5]
[1, 2, 4] + [2, 5, 4] -> [1, 2, 4, 2, 5, 4]
[1, 2, 4] + [1, 2, 4, 5] -> [1, 2, 4, 5]

注意:必须保留元素的顺序,[4,5]与[5,4]不同。

注2:这个问题也可以这样理解:我们需要A的尽可能短的扩展,使输出以B结尾。

当然我可以迭代第二个数组并比较逐个元素,但我正在寻找一个不错的Numpy解决方案。

最初误解了这个问题。 问题来自我的理解:

Two item suffix of A matches 2 item prefix of B:
[1, 2, 4] +
   [2, 4, 5] =>
[1, 2, 4, 5]

No suffix of A matches a prefix of B:
[1, 2, 4] + 
         [2, 5, 4] -> 
[1, 2, 4, 2, 5, 4]

然后我们可以使用这个非常低效的函数:

def merge(A,B):
    i = 0
    m = 0
    # Find largest suffix of A that matches the prefix of B with the same length
    while i <= len(A):
        if A[-i:] == B[:i] and i > m:
            m = i
        i += 1
    return A + B[m:]

以下是使用NumPy的解决方案。 它并不理想,因为它需要(可能不需要的)排序和迭代。 排序和迭代都应该在一个相对较小的数组(甚至是一个单独的元素)上。

import numpy as np

def merge(left, right):
    """Concatenating two arrays, merging the overlapping end and start of
    the left and right array"""

    # We can limit the search to the maximum possible overlap between
    # the arrays, which is the minimum of the two lengths
    l = min(len(left), len(right))

    # Find all indices in `right` where the element matches the last element of `left`.
    # Need to sort, since the `nonzero` documentation doesn't
    # explicitly state whether the returned indices follow the order
    # as in `right`
    # As long as there are few matches, sorting will not be a showstopper
    # Need to reverse the sorted array, to start from the back of the
    # right array, work towards the front, until there is a proper match
    for i in np.sort(np.nonzero(right[:l] == left[-1])[0])[::-1]:
        # Check if the subarrays are equal
        if np.all(left[-i-1:] == right[:i+1]):
            return np.concatenate([left, right[i+1:]])
    # No match
    return np.concatenate([left, right])


a = np.array([1, 2, 4])
b = np.array([2, 4, 5])
c = np.array([2, 5, 4])
d = np.array([1, 2, 4, 5])
e = np.array([1, 2, 4, 2])
f = np.array([2, 4, 2, 5])

print(merge(a, b))
print(merge(a, c))
print(merge(a, d))
print(merge(e, b))
print(merge(e, f))

产量

[1 2 4 5]
[1 2 4 2 5 4]
[1 2 4 5]
[1 2 4 2 4 5]
[1 2 4 2 5]

你可以这样做。

def concatenate(a,b):
    ret = a.copy()
    for element in b:
        if not element in ret:
            ret.append(element)
    return ret

这使订单保持在+ b形式。

我有一个O(n)解决方案,虽然没有Numpy:

def merge(a, b):
    n_a = len(a)
    n = min(n_a, len(b))
    m = 0
    for i in range(1, n + 1):
        if b[n - i] == a[n_a - 1 - m]:
            m += 1
        else:
            m = 0
    return a + b[m:]

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM