
Creating a numpy array from a set

I noticed the following behaviour exhibited by numpy arrays:

>>> import numpy as np
>>> s = {1,2,3}
>>> l = [1,2,3]
>>> np.array(l)
array([1, 2, 3])
>>> np.array(s)
array({1, 2, 3}, dtype=object)
>>> np.array(l, dtype='int')
array([1, 2, 3])
>>> np.array(l, dtype='int').dtype
dtype('int64')
>>> np.array(s, dtype='int')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string, a bytes-like object or a number, not 'set'

There are 2 things to notice:

  1. Creating an array from a set results in the array dtype being object.
  2. Trying to specify dtype results in an error which suggests that the set is being treated as a single element rather than an iterable.

What am I missing? I don't fully understand which bit of Python I'm overlooking. A set is a mutable object, much like a list is.

EDIT: tuples work fine:

>>> t = (1,2,3)
>>> np.array(t)
array([1, 2, 3])
>>> np.array(t).dtype
dtype('int64')

The array factory works best with sequence objects, which a set is not. If you do not care about the order of the elements and know they are all ints or convertible to int, then you can use np.fromiter:

np.fromiter({1,2,3},int,3)
# array([1, 2, 3])

The second (dtype) argument is mandatory; the last (count) argument is optional, but providing it can improve performance.
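As a minimal sketch of the point about count, assuming a small example set (the names s, a and a_sorted are only illustrative): passing count=len(s) tells np.fromiter how many items to expect, and sorting afterwards gives a predictable order, since a set does not guarantee one.

```python
import numpy as np

s = {3, 1, 2}  # example data, not from the original post

# count=len(s) tells NumPy up front how many items the iterable will yield.
a = np.fromiter(s, dtype=np.int64, count=len(s))

# Set iteration order is not guaranteed, so sort when the order matters.
a_sorted = np.sort(a)
print(a_sorted)  # [1 2 3]
```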

As the curly-bracket syntax suggests, a set is more closely related to a dict than to a list. You can solve this very simply by turning the set into a list or tuple before converting it to an array:

>>> import numpy as np
>>> s = {1,2,3}
>>> np.array(s)
array({1, 2, 3}, dtype=object)
>>> np.array(list(s))
array([1, 2, 3])
>>> np.array(tuple(s))
array([1, 2, 3])

However, this might be too inefficient for large sets, because the list or tuple functions have to run through the whole set before the creation of the array even starts. A better method would be to let numpy iterate over the set directly:

>>> np.fromiter(s, int)
array([1, 2, 3])

The np.array documentation says that the object argument must be "an array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence" (emphasis added).

A set is not a sequence. Specifically, sets are unordered and do not support the __getitem__ method. Hence you cannot create an array from a set the way you are trying to with the list.
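The distinction can be checked directly with the abstract base classes in the standard library. This is only an illustrative sketch; the variable names are made up for the example.

```python
from collections.abc import Iterable, Sequence

s = {1, 2, 3}

# A set can be looped over, so it is an Iterable ...
is_iterable = isinstance(s, Iterable)

# ... but it has no positional indexing, so it is not a Sequence.
is_sequence = isinstance(s, Sequence)
has_getitem = hasattr(s, "__getitem__")

# A list, by contrast, is a Sequence.
list_is_sequence = isinstance([1, 2, 3], Sequence)

print(is_iterable, is_sequence, has_getitem, list_is_sequence)
# True False False True
```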

Numpy expects the argument to be a sequence such as a list. It doesn't understand the set type, so it creates an object array (the same would happen if you passed any other non-sequence object). You can create a numpy array from a set by first converting the set to a list: numpy.array(list(my_set)). Hope this helps.
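To see what "object array" means here, the following sketch inspects the result of np.array on a set. The set ends up as the single element of a zero-dimensional array, which matches the error seen when a dtype is forced. Using sorted() in place of list() is an optional choice made here to get a deterministic order; it is not part of the original answer.

```python
import numpy as np

s = {1, 2, 3}

# The whole set is stored as one Python object in a 0-d array.
a = np.array(s)
print(a.shape)  # ()
print(a.dtype)  # object

# Converting to a sequence first yields a regular 1-d numeric array.
b = np.array(sorted(s))
print(b.shape)  # (3,)
```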
