Appending arrays in numpy

I have a loop that reads through a file until the end is reached. On each pass through the loop, I extract a 1D numpy array. I want to append this array as a new row of a 2D numpy array. That is, I might read in something of the form

x = [1,2,3]

and I want to append it to something of the form

z = [[0,0,0],
     [1,1,1]]

I know I can simply do `z = numpy.append(z, [x], axis=0)` and achieve my desired result of

z = [[0,0,0],
     [1,1,1],
     [1,2,3]]
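
As a minimal sketch of the append described above, assuming `z` already exists as a 2D array (the values are the ones from the question):

```python
import numpy as np

z = np.array([[0, 0, 0],
              [1, 1, 1]])
x = np.array([1, 2, 3])

# Wrapping x in a list makes it a single 2D row of shape (1, 3).
# np.append returns a new array rather than modifying z in place,
# so the result has to be assigned back.
z = np.append(z, [x], axis=0)
print(z)
```

Note that every call copies the whole array, which is what makes this pattern slow inside a loop.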

My issue comes from the fact that in the first run through the loop, I don't have anything to append to yet, because the first array read in is the first row of the 2D array. I don't want to write an if statement to handle the first case because that is ugly. If I were working with lists, I could simply do `z = []` before the loop and then `z.append(x)` every time I read in an array to achieve my desired result. However, I can find no way of doing a similar procedure in numpy. I can create an empty numpy array, but then I can't append to it in the way I want. Can anyone help? Am I making any sense?

EDIT:

After some more research, I found another workaround that technically does what I want, although I think I will go with the solution given by @Roger Fan, given that numpy appending is very slow. I'm posting it here just so it's out there.

I can still define `z = []` before the loop, then append each array with `z = np.append(z, x)`. This will ultimately give me something like

z = [0,0,0,1,1,1,1,2,3]

Then, because all the arrays I read in are the same size, after the loop I can simply reshape with `np.resize(z, (n, m))`, where n is the number of arrays read and m is their common length, and get what I'm after.
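
The workaround above could look roughly like this. The list `rows` is a hypothetical stand-in for the arrays read from the file, and the sketch uses `reshape` with `-1` so that the number of rows is inferred from the data:

```python
import numpy as np

# Hypothetical stand-in for the 1D arrays extracted on each pass of the loop.
rows = [np.array([0, 0, 0]), np.array([1, 1, 1]), np.array([1, 2, 3])]

z = []
for x in rows:
    # Without an axis argument, np.append flattens both inputs,
    # so z stays 1D. Starting from an empty list gives a float array.
    z = np.append(z, x)

m = 3                 # common length of every array read in
z = z.reshape(-1, m)  # -1 lets numpy work out the number of rows
print(z)
```

This still copies the accumulated data on every pass, so it has the same speed problem as appending row by row.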

Don't do it. Read the whole file into one array, using for example `numpy.genfromtxt()`.

With this one array, you can then loop over the rows, loop over the columns, and perform other operations using slices.
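
A small sketch of that approach, assuming a whitespace-delimited file of numbers. The file contents here are made up, and `io.StringIO` stands in for a real file on disk so the example is self-contained:

```python
import io
import numpy as np

# Hypothetical file contents; replace the StringIO with a filename in practice.
text = io.StringIO("0 0 0\n1 1 1\n1 2 3\n")

# By default genfromtxt splits on whitespace and returns a float array.
data = np.genfromtxt(text)

row_sums = [row.sum() for row in data]    # iterating an array yields its rows
col_sums = [col.sum() for col in data.T]  # transpose to iterate over columns
second_column = data[:, 1]                # a slice selects a whole column

print(data.shape, row_sums, col_sums, second_column)
```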

Alternatively, you can create a regular list, append a lot of arrays to that list, and in the end generate your desired array from the list using either `numpy.array(list_of_arrays)` or, for more control, `numpy.vstack(list_of_arrays)`.

The idea in this second approach is "delayed array creation": find and organize your data first, and then create the desired array once, already in its final form.
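
A minimal sketch of delayed array creation. The hard-coded tuple of rows is a hypothetical stand-in for whatever the file-reading loop produces:

```python
import numpy as np

list_of_arrays = []
# Stand-in for the loop that extracts one 1D array per pass.
for values in ([0, 0, 0], [1, 1, 1], [1, 2, 3]):
    list_of_arrays.append(np.array(values))  # cheap list append, no array copy

# Build the 2D array once, after all rows have been collected.
z_array = np.array(list_of_arrays)
z_vstack = np.vstack(list_of_arrays)

print(z_vstack)
```

For equal-length 1D arrays the two calls give the same 2D result, so no special case is needed for the first row.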

As @heltonbiker mentioned in his answer, something like `np.genfromtxt` is going to be the best way to do this if it fits your needs. Otherwise, I suggest reading the answers to this question about appending to numpy arrays. Basically, numpy array appending is extremely slow, since every call copies the entire array, and should be avoided whenever possible. There are two much better (and about 20x faster) solutions:

If you know the length in advance, you can preallocate your array and assign to it.

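import numpy as np

length_of_file = 5000  # number of lines, known in advance
results = np.empty(length_of_file)  # allocated once; contents start uninitialized
with open('myfile.txt', 'r') as f:
    for i, line in enumerate(f):
        # processing_func is a placeholder for your own per-line parsing
        results[i] = processing_func(line)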
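
Since the question is about rows of a 2D array, the same preallocation idea extends to two dimensions. A sketch, assuming both the number of rows and the row length are known up front; `lines` and `parse_line` are hypothetical stand-ins for the file and for `processing_func`:

```python
import numpy as np

# Hypothetical stand-in for the lines of the file.
lines = ["0 0 0", "1 1 1", "1 2 3"]

def parse_line(line):
    """Hypothetical parser: turn one whitespace-separated line into a 1D array."""
    return np.array([float(token) for token in line.split()])

n_rows, n_cols = len(lines), 3
results = np.empty((n_rows, n_cols))  # allocate the full 2D array once

for i, line in enumerate(lines):
    results[i] = parse_line(line)     # fill row i in place, no copying

print(results)
```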

Otherwise, just keep a list of lists or list of arrays and convert it to a numpy array all at once.

import numpy as np

results = []  # plain Python list; appending to it is cheap
with open('myfile.txt', 'r') as f:
    for line in f:
        results.append(processing_func(line))
results = np.array(results)  # convert once, after the loop
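
The actual speedup depends on the array sizes and the machine, so it is worth measuring locally. A rough timing sketch, in which the row data and both function names are made up for illustration:

```python
import timeit
import numpy as np

# Hypothetical data: 2000 rows of length 3.
rows = [np.arange(3) + i for i in range(2000)]

def with_np_append():
    # Starting from a (0, 3) array avoids a special case for the first row,
    # but every np.append call still copies everything accumulated so far.
    z = np.empty((0, 3), dtype=int)
    for x in rows:
        z = np.append(z, [x], axis=0)
    return z

def with_list():
    # Collect rows in a list and build the array once at the end.
    out = []
    for x in rows:
        out.append(x)
    return np.array(out)

t_append = timeit.timeit(with_np_append, number=5)
t_list = timeit.timeit(with_list, number=5)
print(f"np.append: {t_append:.4f}s  list + np.array: {t_list:.4f}s")
```

Both functions build the same array, so only the running time differs.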
