Better way to find existence of arrays in list of arrays

I am a newbie to Python and am trying out different ways to optimize and simplify my code.

I have a list of arrays (necessarily in this format), initially empty, which I need to update with arrays, making sure that duplicate entries are not added.

Right now I am doing it the following way, which is the only thing I have tried that works:

if len(where(((array(self.pop_next)-(self.pop[self.top_indv_indx[i]]))==0).sum(1)==len((self.pop[self.top_indv_indx[i]])))[0])<=0:
     self.pop_next.append(self.pop[self.top_indv_indx[i]])

where self.pop_next is my list of arrays and self.pop[self.top_indv_indx[i]] is the array to be added.

I know this is unpythonic, and I guess there are much better and simpler ways to do the same. Please help.

Edit: I see from your comment that you're using numpy arrays. I've never used numpy, so I have no idea how they work with sets.

One option would be to use a set. Sets are like lists, but they are unordered and only allow each item to be added once:

>>> s = set()
>>> s.add(1)
>>> s.add(2)
>>> s.add(2)
>>> s.add(2)
>>> s
set([1, 2])

However, you'll run into problems if you try to add a list to a set:

>>> s.add(['my','list'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

An item must be hashable to be added to a set, and a list is not hashable: its hash value could not stay constant, because the list can be modified at any time by adding or removing values.

If you don't need the lists you are checking to be mutable, you can convert them to tuples, which are immutable and therefore hashable and set-friendly:

>>> mylist = ['my','list']
>>> s = set()
>>> s.add(tuple(mylist))
>>> s.add(tuple(mylist))
>>> s
set([('my', 'list')])
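
For numpy arrays, the same idea could be sketched by keeping a set of tuples alongside the list of arrays. This is only a minimal sketch that assumes 1-D arrays; the names add_unique, seen, and pop_next are illustrative and do not come from the original posts.

```python
import numpy as np

def add_unique(pop_next, seen, arr):
    """Append arr to pop_next unless an equal-valued array was added before."""
    key = tuple(arr.tolist())  # hashable snapshot of a 1-D array's values
    if key in seen:
        return False
    seen.add(key)
    pop_next.append(arr)
    return True

pop_next, seen = [], set()
results = [add_unique(pop_next, seen, np.array(v)) for v in ([1, 2], [1, 3], [1, 2])]
```

The list keeps the arrays in insertion order, while the set makes the duplicate check a hash lookup instead of a scan over every stored array.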

You may want to try numpy.all(array1 == array2) as the condition for comparing two individual arrays.
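
As a small illustration of that condition, assuming arrays of equal shape (the variable names are made up for this sketch), numpy.array_equal is shown next to it as an alternative that also checks the shapes:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
c = np.array([1, 2, 4])

same_ab = bool(np.all(a == b))      # every element matches
same_ac = bool(np.all(a == c))      # the last element differs
same_ab_alt = np.array_equal(a, b)  # also requires the shapes to match
```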

Extension in edit:

To loop over the list, you may use the following:

if all(numpy.any(array_to_add != a) for a in array_list):
    array_list.append(array_to_add)

This compares array_to_add to all elements of array_list by value. Note that the outer all here is the built-in all (__builtin__.all in Python 2), in contrast to numpy.all. If you did from numpy import * before, the name all refers to numpy.all and this will not work. Use import numpy instead and call functions by their full names, as in the example above.
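
The by-value check can also be wrapped in a small helper. The sketch below is one possible way to write it, using numpy.array_equal for the pairwise comparison; the function name append_if_new is hypothetical.

```python
import numpy as np

def append_if_new(array_list, array_to_add):
    """Append array_to_add unless an equal-valued array is already present."""
    # any() is the built-in here; array_equal returns False on shape mismatch.
    if not any(np.array_equal(array_to_add, a) for a in array_list):
        array_list.append(array_to_add)

array_list = []
for values in ([1, 2], [1, 3], [1, 2]):
    append_if_new(array_list, np.array(values))
```

The check scans the whole list on every call, so it is linear in the number of stored arrays; for a small population that is usually acceptable.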

If it is OK to compare by object (i.e. two arrays are only the same if they are the exact same object in memory), use the following simpler variant:

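# identity check; "not in" would also fall back to ==, which is ambiguous for numpy arrays
if not any(a is array_to_add for a in array_list):
    array_list.append(array_to_add)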
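
To make the difference between the two notions of "same" concrete, here is a short sketch (the variable names are illustrative) in which an equal-valued copy is not found by an identity comparison:

```python
import numpy as np

a = np.array([1, 2])
b = np.array([1, 2])  # same values as a, but a different object
array_list = [a]

a_present = any(x is a for x in array_list)  # a itself is in the list
b_present = any(x is b for x in array_list)  # b is only equal by value
```

With comparison by object, b would therefore be appended even though its values duplicate a.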
