
Combining two numpy arrays

I have a numpy array ar1 with dimensions (1196, 14, 64, 1).

I have a numpy array ar2 with dimensions (1196,).

I want to combine these two because I want to shuffle them later. I thought that this was the kind of thing that zip was made for, but when I zip them together:

ar3 = np.asarray(zip(ar1,ar2))

then print(ar3.shape) gives ()

I have also tried np.concatenate, but apparently "all the input array dimensions for the concatenation axis must match exactly".

I tried np.hstack, but I got "all the input arrays must have same number of dimensions".

How can I combine these two arrays that have the same size along axis 0? Perhaps I don't need to combine them and should just shuffle them separately using the same indices.

(I have actually already combined and saved these arrays using numpy.savez, but when I load this file into my code, I assume that I have to separate them first and then recombine them in an array as I am trying to do. If I could just pull out a combined array from the .npz file, that would be even better.)

Use np.concatenate along with np.broadcast_to, because concatenate needs all the arrays to have the same number of dimensions. This should do the trick:

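import numpy as np

x = np.ones((1196, 14, 64, 1))
y = np.arange(1196)
# Give y three trailing axes, stretch it to x's leading dimensions, then join on the last axis
output = np.concatenate((x, np.broadcast_to(y[:, None, None, None], x.shape[:-1] + (1,))), axis=-1)
# output.shape --> (1196, 14, 64, 2)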

In [111]: ar1 = np.ones((1196,14,62,1))
In [112]: ar2 = np.zeros((1196))

In py3, zip is generator-like. You have to expand it with list to get an array. Without that:

In [113]: np.array(zip(ar1,ar2))                                                                     
Out[113]: array(<zip object at 0x7f188a465a48>, dtype=object)

That's a single element array, with shape ().
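The same behaviour can be reproduced on a much smaller array. This is only a sketch: the shapes (5, 3, 4, 1) and (5,) are stand-ins I picked for the real (1196, 14, 64, 1) and (1196,) arrays, and the names wrapped and pairs are illustrative.

```python
import numpy as np

# Small stand-ins for the real arrays (shapes are assumptions for the demo)
ar1 = np.ones((5, 3, 4, 1))
ar2 = np.arange(5)

# The zip object is not iterated; it is stored whole as one Python object
wrapped = np.array(zip(ar1, ar2))
print(wrapped.shape)   # ()
print(wrapped.dtype)   # object

# Consuming the iterator with list() yields one (sub-array, scalar) tuple per row
pairs = list(zip(ar1, ar2))
print(len(pairs), pairs[0][0].shape)
```

So the empty shape does not mean the data was lost; the array simply holds the unconsumed zip object.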

If you expand the zip and make an array, the result is an object dtype array:

In [119]: A = np.array(list(zip(ar1,ar2)))                                                           
/usr/local/bin/ipython3:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  #!/usr/bin/python3
In [120]: A.shape                                                                                    
Out[120]: (1196, 2)
In [121]: A.dtype                                                                                    
Out[121]: dtype('O')
In [122]: A[0,0].shape                                                                               
Out[122]: (14, 62, 1)
In [123]: A[0,1].shape                                                                               
Out[123]: ()

One column holds the (14,62,1) arrays, the other the ar2 values.
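As far as I know, newer NumPy releases turn the ragged-sequence warning above into an error. If you really do want this two-column object array, one way that avoids shape inference altogether is to preallocate it and fill it row by row. This is a sketch with small stand-in shapes; the name paired is my own.

```python
import numpy as np

# Small stand-ins for the real arrays (shapes are assumptions for the demo)
ar1 = np.ones((5, 3, 4, 1))
ar2 = np.arange(5.0)

# Preallocate an (n, 2) object array so NumPy never has to guess the shape
paired = np.empty((len(ar1), 2), dtype=object)
for i in range(len(ar1)):
    paired[i, 0] = ar1[i]   # column 0: the (3, 4, 1) sub-array
    paired[i, 1] = ar2[i]   # column 1: the matching scalar

print(paired.shape, paired.dtype)
print(paired[0, 0].shape)
```

An object array like this is essentially a list of references, so it brings none of the speed benefits of a regular numeric array.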

The other answer suggests expanding ar2 to match ar1 in shape, and concatenating on the last axis. The result is twice the size of ar1. broadcast_to does a 'virtual' replication (no increase in memory), but that doesn't carry over in the concatenate. You get 14*62 copies of every ar2 value.
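To see the memory point concretely, here is a sketch on small stand-in shapes (the sizes and the names y_view and combined are assumptions for illustration). The broadcast result is a view onto y with zero strides, while the concatenated result is a real array twice as large as x.

```python
import numpy as np

x = np.ones((6, 3, 4, 1))
y = np.arange(6, dtype=x.dtype)   # same dtype as x so the byte counts are comparable

# broadcast_to returns a read-only view; the repeated axes have stride 0
y_view = np.broadcast_to(y[:, None, None, None], x.shape)
print(y_view.strides)
print(np.shares_memory(y_view, y))

# concatenate has to materialise every repeated value
combined = np.concatenate((x, y_view), axis=-1)
print(combined.shape)
print(combined.nbytes, x.nbytes)
```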

But ar2 could be broadcast to (1196,1,62,1) and concatenated on axis 1 (a 62x replication), or to (1196,14,1,1) and concatenated on axis 2 (14x). To concatenate, the arrays have to match on all but one axis.
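A sketch of the first variant (one extra "row" on axis 1), again on small stand-in shapes. The names extra_row, images and labels are mine. The last two lines show how the pieces could be split apart again after shuffling.

```python
import numpy as np

ar1 = np.ones((6, 3, 4, 1))
ar2 = np.arange(6, dtype=ar1.dtype)

n, h, w, c = ar1.shape
# Replicate each ar2 value only along the width axis: shape (n, 1, w, c)
extra_row = np.broadcast_to(ar2[:, None, None, None], (n, 1, w, c))
combined = np.concatenate((ar1, extra_row), axis=1)
print(combined.shape)   # (6, 4, 4, 1)

# Splitting back: the first h rows are the data, row h holds the ar2 values
images = combined[:, :h]
labels = combined[:, h, 0, 0]
```

Note that concatenating this way forces both arrays to a common dtype, so integer labels would come back as floats if ar1 is float.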

But for savez and load you don't need to concatenate the arrays. You can save and load them separately. savez puts them in separate npy files inside the archive, and the savez documentation shows how to load each array.
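A minimal round trip might look like this. It writes to an in-memory buffer only so the sketch leaves no file behind; with a real file you would pass a filename instead. The keyword names ar1 and ar2 are whatever you choose when saving.

```python
import io
import numpy as np

ar1 = np.ones((6, 3, 4, 1))
ar2 = np.arange(6)

# Keyword arguments become the names of the arrays inside the archive
buffer = io.BytesIO()   # stand-in for a path such as "data.npz"
np.savez(buffer, ar1=ar1, ar2=ar2)

buffer.seek(0)
with np.load(buffer) as data:
    print(data.files)        # names stored in the archive
    loaded1 = data["ar1"]    # each array is read back by its name
    loaded2 = data["ar2"]

print(loaded1.shape, loaded2.shape)
```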

And you can shuffle them separately.

Make a shuffling index:

In [124]: idx = np.arange(1196)                                                                      
In [125]: np.random.shuffle(idx)                                                                     
In [126]: idx[:10]                                                                                   
Out[126]: array([ 561,  980,   42,   98, 1055,  375,   13,  771,  832,  787])

and apply it to each array:

In [127]: ar1[idx,:,:,:];                                                                            
In [128]: ar2[idx]; 
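Put together as a standalone script, the shuffle could be sketched like this. The small shapes, the seed, and the trick of writing each label into its sample (only there to check that the pairing survives) are my additions; rng.permutation is an equivalent one-step alternative to arange followed by shuffle.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the demo repeatable

ar1 = rng.random((6, 3, 4, 1))
ar2 = np.arange(6)
ar1[:, 0, 0, 0] = ar2            # tag each sample with its label so pairing can be checked

# One permutation of the row indices, applied to both arrays
idx = rng.permutation(len(ar2))
ar1_shuffled = ar1[idx]          # same as ar1[idx, :, :, :]
ar2_shuffled = ar2[idx]

# Every shuffled sample still carries its own label
print(np.array_equal(ar1_shuffled[:, 0, 0, 0], ar2_shuffled))
```

Indexing with an integer array returns copies, so ar1 and ar2 themselves stay in their original order.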
