Finding different pairs in a list with Python

Question

I have a list of pairs and I want to keep only the entries whose two elements differ. I implemented a function, different(), for this:

import numpy as np


def different(array):
    res = []
    for (x1, y1), (x2, y2) in array:
        if (x1, y1) != (x2, y2):
            res.append([(x1, y1), (x2, y2)])
    return res


a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

out = different(a)  # get [[(1, 2), (3, 4)],
                    #      [(7, 9), (6, 3)]]

Is there a better way to do this? I want to improve my function different. The list may contain more than 100,000 entries.

Answer 1

The NumPy way to do it is:

import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

b = np.logical_or(a[:,0,0] != a[:,1,0],  a[:,0,1] != a[:,1,1])

print(a[b])
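To make the mask `b` visible, and as one possible generalization, the two per-coordinate comparisons can be folded into a single `any` along the last axis. This is a minimal sketch assuming the same `(n, 2, 2)` layout as above; the name `mask_any` is illustrative and not part of the answer.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

# Mask from the answer: True where either coordinate of the two pairs differs.
b = np.logical_or(a[:, 0, 0] != a[:, 1, 0], a[:, 0, 1] != a[:, 1, 1])

# Same mask without spelling out each coordinate (hypothetical name).
mask_any = (a[:, 0] != a[:, 1]).any(axis=1)

print(b)         # [ True False  True False]
print(mask_any)  # [ True False  True False]
```

The `any` form should also work unchanged if the pairs had more than two coordinates, whereas the explicit form needs one comparison per coordinate.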

Answer 2

Vectorized Comparison

a[~(a[:, 0] == a[:, 1]).all(1)]

array([[[1, 2],
        [3, 4]],

       [[7, 9],
        [6, 3]]])

This works by comparing the first pair of each subarray with the second pair, element by element. Only the subarrays whose two pairs are not identical are selected. Consider:

a[:, 0] == a[:, 1]

array([[False, False],
       [ True,  True],
       [False, False],
       [ True,  True]])

From this, we want the rows that are not True in every column. So apply all along axis 1 and then negate the result.

~(a[:, 0] == a[:, 1]).all(1)
array([ True, False,  True, False])

This gives you a mask that you can then use to select subarrays from a.

`np.logical_or.reduce`

Similar to the first option above, but it approaches the problem from the other end (see De Morgan's laws): instead of negating "every entry is equal", it asks whether any entry differs.

a[np.logical_or.reduce(a[:, 0] != a[:, 1], axis=1)]
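The equivalence of the two masks can be checked directly. The sketch below is only an illustration under assumed inputs: the random array, its seed, and the names `mask_not_all` and `mask_any_not` are not from the answer.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed, chosen arbitrarily
a = rng.integers(0, 3, size=(1000, 2, 2))  # small value range so equal pairs occur

eq = a[:, 0] == a[:, 1]                    # shape (1000, 2), True where entries match

# not (x_equal and y_equal)
mask_not_all = ~eq.all(axis=1)

# (not x_equal) or (not y_equal), which De Morgan's laws say is the same thing
mask_any_not = np.logical_or.reduce(~eq, axis=1)

assert np.array_equal(mask_not_all, mask_any_not)
```

Either mask can then be used as `a[mask]` to select the differing pairs.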

Answer 3

Timing comparison of the solutions

When there are so many different approaches to a problem, time comparisons can really help sort out the better answers.

Setup

We use an array of shape (200000, 2, 2), since the OP, Vincentlai, pointed out that this is in the range of the expected array size.

a = np.array(np.random.randint(10, size=(200000, 2, 2)))

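Using Joe's answer: `numpy.logical_and`

Note that Joe's answer itself uses `numpy.logical_or`. The `logical_and` expression timed here keeps only the pairs that differ in both coordinates, so it does not select the same rows as the other solutions.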

%timeit b = a[np.logical_and(a[:,0,0] != a[:,1,0],  a[:,0,1] != a[:,1,1])]
>>> 5.12 ms ± 110 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using Coldspeed's first answer: vectorized comparison

%timeit b = a[~(a[:, 0] == a[:, 1]).all(1)]
>>> 13.7 ms ± 559 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using Coldspeed's second answer: `numpy.logical_or`

%timeit b = a[np.logical_or.reduce(a[:, 0] != a[:, 1], axis=1)]
>>> 13.2 ms ± 498 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using U9 Forward's answer: filter

%timeit b = list(filter(lambda x: x[0]!=x[1],a.tolist()))
>>> 102 ms ± 4.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using aydow's answer: list comprehension

%timeit b = [[(x1, y1), (x2, y2)] for (x1, y1), (x2, y2) in a if (x1, y1) != (x2, y2)]
>>> 752 ms ± 11.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Conclusions

Joe's elementwise approach, timed here with numpy.logical_and, is by far the fastest: roughly 2.5 times faster than the other two NumPy solutions. Predictably, the pure-Python approaches fall far short of anything NumPy-based.
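The `%timeit` lines above require IPython. For readers who want to rerun the comparison as a plain script, here is a rough sketch using the standard `timeit` module. It times the `logical_or` form from Answer 1 rather than the `logical_and` form measured above, so that all three NumPy variants select the same rows; the function names are made up for this sketch, and absolute timings will vary by machine.

```python
import timeit
import numpy as np

a = np.random.randint(10, size=(200000, 2, 2))

def elementwise_or(a):
    # Answer 1: compare each coordinate explicitly.
    return a[np.logical_or(a[:, 0, 0] != a[:, 1, 0], a[:, 0, 1] != a[:, 1, 1])]

def all_negated(a):
    # Answer 2, first option.
    return a[~(a[:, 0] == a[:, 1]).all(1)]

def or_reduce(a):
    # Answer 2, second option.
    return a[np.logical_or.reduce(a[:, 0] != a[:, 1], axis=1)]

# The three NumPy variants should select exactly the same pairs.
assert np.array_equal(elementwise_or(a), all_negated(a))
assert np.array_equal(elementwise_or(a), or_reduce(a))

for fn in (elementwise_or, all_negated, or_reduce):
    # Best of 3 repeats, 10 calls each, reported per call.
    seconds = min(timeit.repeat(lambda: fn(a), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```

Checking that the candidates agree before timing them guards against comparing expressions that compute different things.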

Answer 4

Try using filter:

import numpy as np

def different(array):
    return list(filter(lambda x: x[0] != x[1], array.tolist()))

a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

out = different(a)
print(out)

Answer 5

With a list comprehension, this can be done in one line:

items_list = [[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]
             ]

# Output
[itm for itm in items_list if itm[0] != itm[1]]

Answer 6

Use a list comprehension

def different(array):
    return [[(x1, y1), (x2, y2)] for (x1, y1), (x2, y2) in array if (x1, y1) != (x2, y2)]
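When the input is a NumPy array, iterating over it directly yields NumPy scalars, which is likely part of why this form is the slowest in the timings of Answer 3. One possible variant, sketched here under the assumption that plain Python lists are acceptable as an intermediate, converts with `tolist()` first. The name `different_as_lists` is hypothetical, and any speed-up should be measured on your own data.

```python
import numpy as np

def different_as_lists(array):
    # Convert once to nested Python lists, then compare each pair as a list.
    pairs = np.asarray(array).tolist()
    return [[tuple(p), tuple(q)] for p, q in pairs if p != q]

a = np.array([[[1, 2], [3, 4]],
              [[1, 2], [1, 2]],
              [[7, 9], [6, 3]],
              [[3, 3], [3, 3]]])

print(different_as_lists(a))
# [[(1, 2), (3, 4)], [(7, 9), (6, 3)]]
```

The output has the same shape as the original different(): a list of two-tuple pairs.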

Answer summary

6 answers

Answer 1: score 8, accepted, 2018-07-16 05:33:04
Answer 2: score 6, 2018-07-16 05:33:06
Answer 3: score 3
Answer 4: score 1, 2018-07-16 05:31:41
Answer 5: score 1, 2018-07-16 06:13:00
Answer 6: score 0, 2018-07-16 05:30:56
