Appending to numpy arrays

I'm trying to construct a numpy array, and then append integers and another array to it. I tried doing this:

xyz_list = frag_str.split()
nums = numpy.array([])
coords = numpy.array([])
for i in range(int(len(xyz_list)/4)):
    numpy.append(nums, xyz_list[i*4])
    numpy.append(coords, xyz_list[i*4+1:(i+1)*4])
print(atoms)
print(coords)

Printing the output only gives me empty arrays. Why is that? In addition, how can I rewrite coords in a way that allows me to have 2D arrays like this: array([[0,0,0],[0,0,1],[0,0,-1]])?

numpy.append, unlike Python's list.append, does not operate in place. Therefore, you need to assign the result back to a variable, as below.

import numpy

xyz_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
nums = numpy.array([])
coords = numpy.array([])

for i in range(int(len(xyz_list)/4)):
    nums = numpy.append(nums, xyz_list[i*4])
    coords = numpy.append(coords, xyz_list[i*4+1:(i+1)*4])

print(nums)    # [ 1.  5.  9.]
print(coords)  # [  2.   3.   4.   6.   7.   8.  10.  11.  12.]

You can reshape coords as follows:

coords = coords.reshape(3, 3)

# array([[  2.,   3.,   4.],
#        [  6.,   7.,   8.],
#        [ 10.,  11.,  12.]])
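
If the number of rows is not known in advance, reshape also accepts -1 for one dimension and infers it from the array's length. A minimal sketch, assuming the flat array always holds a multiple of three values (the variable name flat is only for illustration):

```python
import numpy as np

# Flat coordinates as produced by the loop above
flat = np.array([2., 3., 4., 6., 7., 8., 10., 11., 12.])

# -1 lets NumPy work out the row count from the total length
coords = flat.reshape(-1, 3)
print(coords.shape)  # (3, 3)
```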

More details on numpy.append behaviour

Documentation:

Returns: A copy of arr with values appended to axis. Note that append does not occur in-place: a new array is allocated and filled.
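
The behaviour described in the documentation can be checked directly. The short sketch below (array names are illustrative) shows that the original array is left untouched, and that without an axis argument the inputs are flattened, which is why the coords array in the question comes out 1D:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)      # returns a new array; a is not modified
print(a)  # [1 2 3]
print(b)  # [1 2 3 4]

m = np.array([[0, 0, 0]])
flat = np.append(m, [[0, 0, 1]])          # no axis: both inputs are flattened
rows = np.append(m, [[0, 0, 1]], axis=0)  # axis=0: appended as a new row
print(flat.shape)  # (6,)
print(rows.shape)  # (2, 3)
```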

If you know the shape of your numpy array output beforehand, it is efficient to instantiate it via np.zeros(n) and fill it with results later.
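
As a rough sketch of that pre-allocation approach, using the same sample xyz_list as in the answer above and assuming every record has exactly four fields:

```python
import numpy as np

xyz_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
n = len(xyz_list) // 4            # number of records, known up front

nums = np.zeros(n)                # one number per record
coords = np.zeros((n, 3))         # one row of three coordinates per record

for i in range(n):
    nums[i] = xyz_list[i * 4]
    coords[i] = xyz_list[i * 4 + 1:(i + 1) * 4]

print(nums)    # [1. 5. 9.]
print(coords)
```

No array is reallocated inside the loop; each iteration only writes into memory that already exists.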

Another option: if your calculations make heavy use of inserting elements at the left end of an array, consider using collections.deque from the standard library.
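
A small illustration of the deque idea (the values are made up): collect items with appendleft, then convert to an array once at the end.

```python
from collections import deque

import numpy as np

d = deque()
for value in [3, 2, 1]:
    d.appendleft(value)    # constant-time insert at the left end

arr = np.array(list(d))    # convert once, after all inserts are done
print(arr)  # [1 2 3]
```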

np.append is not a list clone. It is a clumsy wrapper around np.concatenate. It is better to learn to use that correctly.

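import numpy as np

xyz_list = frag_str.split()
nums = []
coords = []
for i in range(len(xyz_list) // 4):
    nums.append(xyz_list[i*4])
    coords.append(xyz_list[i*4+1:(i+1)*4])
# The lists hold strings, not arrays, so convert with np.array;
# np.concatenate would raise on a list of scalars
nums = np.array(nums)
coords = np.array(coords)  # 2D, one row per group of three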

List append is faster, and easier to initialize. np.concatenate works fine with a list of arrays. np.append uses concatenate, but only accepts two inputs. np.array is needed if the list contains numbers or strings.
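
To make the distinction concrete, here is a hedged sketch with invented data: np.concatenate joins a list of arrays, while np.array is the right tool for a list of plain numbers or a list of equal-length lists.

```python
import numpy as np

# A list of arrays: np.concatenate joins them end to end
chunks = [np.array([1, 2]), np.array([3]), np.array([4, 5, 6])]
joined = np.concatenate(chunks)
print(joined)  # [1 2 3 4 5 6]

# A list of plain numbers: build the array directly
scalars = [1, 5, 9]
as_array = np.array(scalars)

# A list of equal-length lists becomes a 2D array
rows = [[2, 3, 4], [6, 7, 8]]
as_2d = np.array(rows)
print(as_2d.shape)  # (2, 3)
```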


You don't give an example of frag_str. But the name and the use of split suggest it is a string. I don't think anything else has a split method.

In [74]: alist = 'one two three four five six seven eight'.split()

That's a list of strings. Using your indexing I can construct 2 lists:

In [76]: [alist[i*4] for i in range(2)]
Out[76]: ['one', 'five']

In [77]: [alist[i*4+1:(i+1)*4] for i in range(2)]
Out[77]: [['two', 'three', 'four'], ['six', 'seven', 'eight']]

And I can make arrays from each of those lists:

In [78]: np.array(Out[76])
Out[78]: array(['one', 'five'], dtype='<U4')
In [79]: np.array(Out[77])
Out[79]: 
array([['two', 'three', 'four'],
       ['six', 'seven', 'eight']], dtype='<U5')

In the first case the array is 1d, in the second, 2d.

If the string contains digits, we can make an integer array by specifying dtype.

In [80]: alist = '1 2 3 4 5 6 7 8'.split()
In [81]: np.array([alist[i*4] for i in range(2)])
Out[81]: array(['1', '5'], dtype='<U1')
In [82]: np.array([alist[i*4] for i in range(2)], dtype=int)
Out[82]: array([1, 5])
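
Putting these pieces together, the loop can be avoided entirely when every record has four whitespace-separated numeric fields. This is only a sketch: the contents of frag_str below are a hypothetical example, since the question does not show the real input.

```python
import numpy as np

# Hypothetical input: a number followed by three coordinates, repeated
frag_str = "1 0 0 0 8 0 0 1 8 0 0 -1"

# Convert all tokens at once, then view them as rows of four fields
table = np.array(frag_str.split(), dtype=float).reshape(-1, 4)

nums = table[:, 0].astype(int)   # first column of each row
coords = table[:, 1:]            # remaining three columns, already 2D

print(nums)    # [1 8 8]
print(coords)
```

If the first field were a non-numeric label instead, the dtype=float conversion would fail, and the list comprehensions shown above would be the safer route.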

As stated above, numpy.append does not append items in place, and the reason why is important. You must assign the array returned by numpy.append back to the original variable, or else your code will not work. That being said, you should likely rethink your logic.

NumPy uses C-style arrays internally, which are arrays in contiguous memory without leading or trailing unused elements. In order to append an item to an array, NumPy must allocate a buffer of the array size + 1, copy all the data over, and add the appended element.

In pseudo-C code, this comes to the following:

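#include <stdlib.h>
#include <string.h>

int* numpy_append(const int* arr, size_t size, int element)
{
    /* Allocate a new buffer with room for one more element */
    int* new_arr = malloc(sizeof(int) * (size + 1));
    /* Copy every existing element into the new buffer */
    memcpy(new_arr, arr, sizeof(int) * size);
    new_arr[size] = element;
    return new_arr;
}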

This is extremely inefficient, since a new array must be allocated each time (memory allocation is slow), all the elements must be copied over, and the new element added to the end of the new array.

In comparison, Python lists reserve extra capacity beyond the current size of the container, so an append only reallocates once the size reaches the capacity, and the capacity then grows geometrically. This is much more efficient for insertions at the end of the container than reallocating the entire buffer each time.
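
The over-allocation can be observed indirectly. The sketch below relies on a CPython implementation detail (sys.getsizeof reflecting the list's allocated capacity), so treat it as an illustration rather than guaranteed behaviour: across 1000 appends, the reported size changes only a small number of times.

```python
import sys

items = []
sizes = set()
for i in range(1000):
    items.append(i)
    # In CPython the reported size only changes when the list reallocates
    sizes.add(sys.getsizeof(items))

resizes = len(sizes)
print(resizes)  # far fewer than 1000 on CPython
```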

You should use Python lists and list.append, and then convert the new list to a NumPy array. Or, if performance is truly critical, use a C++ extension built on std::vector. In all scenarios, prefer either of these to numpy.append. Re-write your code, or it will be glacial.

Edit

Also, as pointed out in the comments, if you know the size of a NumPy array beforehand, pre-allocating it with np.zeros(n) is efficient, as is using a custom wrapper around a NumPy array:

import numpy as np

class extendable_array:
    def __init__(self, size=0, dtype=int):
        # Backing buffer; self.size counts the elements actually in use
        self.arr = np.zeros(size, dtype=dtype)
        self.size = size

    def grow(self):
        '''Double the capacity of the backing array (at least 1)'''

        arr = self.arr
        self.arr = np.zeros(max(arr.size * 2, 1), dtype=arr.dtype)
        self.arr[:arr.size] = arr

    def append(self, value):
        '''Append a value to the array'''

        if self.arr.size == self.size:
            self.grow()

        self.arr[self.size] = value
        self.size += 1

    # add more methods here
