
How to concatenate 2 NumPy arrays efficiently?

I have 2 NumPy arrays (<type 'numpy.ndarray'>) with shapes (10,) and (10, 6), and I would like to concatenate the first one with the second. The arrays are provided below.

r1 
['467c8100-7f13-4244-81ee-5e2a0f8218a8',
 '71a4b5b2-80d6-4c12-912f-fc71be8d923e',
 '7a3e0168-e47d-4203-98f2-a54a46c62ae0',
 '7dfd43e7-ced1-435f-a0f9-80cfd00ae246',
 '85dbc70e-c773-43ee-b434-8f458d295d10',
 'a56b2bc3-4a81-469e-bc5f-b3aaa520db05',
 'a9e8996f-ff35-4bfb-bbd9-ede5ffecd4d8',
 'c3037410-0c2e-40f8-a844-ac0664a05783',
 'c5618563-10c0-425b-a11b-2fcf931f0ff7',
 'f65e6cea-892e-4335-8e86-bf7f083b5f53'] 

r2 
[[1.55000000e+02, 5.74151515e-01, 1.55000000e+02, 5.74151515e-01, 3.49000000e+02, 1.88383585e+00],
 [5.00000000e+00, 1.91871554e-01, 1.03000000e+02, 1.22893828e+00, 2.95000000e+02, 3.21148368e+00],
 [7.10000000e+01, 1.15231270e-01, 2.42000000e+02, 5.78527276e-01, 4.09000000e+02, 2.67915246e+00],
 [3.60000000e+01, 7.10066720e-01, 2.42000000e+02, 1.80213634e+00, 4.12000000e+02, 4.16314391e+00],
 [1.15000000e+02, 1.05120284e+00, 1.30000000e+02, 1.71697773e+00, 2.53000000e+02, 2.73640301e+00],
 [4.70000000e+01, 2.19434656e-01, 3.23000000e+02, 4.84093786e+00, 5.75000000e+02, 7.00530186e+00],
 [5.50000000e+01, 1.22614463e+00, 1.04000000e+02, 1.55392099e+00, 4.34000000e+02, 4.13661261e+00],
 [3.90000000e+01, 3.34816889e-02, 1.10000000e+02, 2.54431753e-01, 2.76000000e+02, 1.52322736e+00],
 [3.43000000e+02, 2.93550948e+00, 5.84000000e+02, 5.27968165e+00, 7.45000000e+02, 7.57657633e+00],
 [1.66000000e+02, 1.01436635e+00, 2.63000000e+02, 2.69197514e+00, 8.13000000e+02, 7.96477735e+00]]

I tried to concatenate them with np.concatenate((r1, r2)), but it fails with ValueError: all the input arrays must have same number of dimensions, which I don't understand. I expected r1 to be joined with r2 to form a new 10 x 7 array.

How can I solve this problem?

NumPy provides a convenient way to concatenate along the second axis.

np.c_[r2,r1]
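
As a minimal sketch of what np.c_ does here, the example below uses numeric stand-in data (an assumption; the question's r1 holds strings) and shows that the argument order decides whether r1 ends up as the first or the last column.

```python
import numpy as np

# Stand-in numeric data (hypothetical; the question's r1 holds strings).
r1 = np.arange(10)        # shape (10,)
r2 = np.zeros((10, 6))    # shape (10, 6)

# np.c_ treats a 1-D array as a column and joins along the second axis.
r1_last = np.c_[r2, r1]   # r1 becomes the last column
r1_first = np.c_[r1, r2]  # r1 becomes the first column

print(r1_last.shape, r1_first.shape)
```

If the identifiers should come first, as the question asks, np.c_[r1, r2] is the order to use.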

You can reshape r1 to make it two-dimensional and specify the axis along which the arrays should be joined:

import numpy as np

r1 = np.ones((10,))
r2 = np.zeros((10, 6))
np.concatenate((r1.reshape(10, 1), r2), axis=1)
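
The following sketch, using the same placeholder arrays, reproduces the error from the question and compares a few ways of turning r1 into a column. np.column_stack is not mentioned in the original answer; it is included only as one possible alternative.

```python
import numpy as np

r1 = np.ones((10,))
r2 = np.zeros((10, 6))

# Joining a 1-D and a 2-D array directly fails.
try:
    np.concatenate((r1, r2))
except ValueError as exc:
    print("ValueError:", exc)

# Equivalent ways to turn r1 into a (10, 1) column before joining.
a = np.concatenate((r1.reshape(-1, 1), r2), axis=1)  # -1 infers the length
b = np.concatenate((r1[:, None], r2), axis=1)        # None adds a new axis
c = np.column_stack((r1, r2))                        # stacks 1-D inputs as columns

print(a.shape, b.shape, c.shape)
```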

These 2 arrays have both a dtype and a shape mismatch:

In [174]: r1.shape
Out[174]: (10,)
In [175]: r1.dtype
Out[175]: dtype('<U36')

In [177]: r2.shape
Out[177]: (10, 6)
In [178]: r2.dtype
Out[178]: dtype('float64')

If you add a dimension to r1, so that its shape is (10, 1), you can concatenate on axis=1. But note the dtype: the floats have been turned into strings:

In [181]: r12 =np.concatenate((r1[:,None], r2), axis=1)
In [182]: r12.shape
Out[182]: (10, 7)
In [183]: r12.dtype
Out[183]: dtype('<U36')
In [184]: r12[0,:]
Out[184]: 
array(['467c8100-7f13-4244-81ee-5e2a0f8218a8', '155.0', '0.574151515',
       '155.0', '0.574151515', '349.0', '1.88383585'], 
      dtype='<U36')
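
To illustrate the cost of the string dtype, here is a small sketch with shortened stand-in data (the "id-N" labels are hypothetical). It converts the floats to strings explicitly before joining, then shows that numeric work needs the columns sliced out and converted back.

```python
import numpy as np

# Stand-in data (hypothetical labels, values shortened from the question).
r1 = np.array(["id-0", "id-1", "id-2"])
r2 = np.array([[155.0, 0.574151515],
               [5.0, 0.191871554],
               [71.0, 0.11523127]])

# Convert the floats to strings explicitly so the join is unambiguous.
r12 = np.concatenate((r1[:, None], r2.astype(str)), axis=1)
print(r12.dtype.kind)  # 'U': every element is now a string

# Numeric work requires slicing the columns out and converting back.
values = r12[:, 1:].astype(float)
print(values.sum(axis=0))
```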

One way to mix strings and floats is a structured array, for example:

In [185]: res=np.zeros((10,),dtype='U36,(6)f')
In [186]: res.dtype
Out[186]: dtype([('f0', '<U36'), ('f1', '<f4', (6,))])
In [187]: res['f0']=r1
In [188]: res['f1']=r2
In [192]: res.shape
Out[192]: (10,)
In [193]: res[0]
Out[193]: ('467c8100-7f13-4244-81ee-5e2a0f8218a8', [ 155.        ,    0.57415152,  155.        ,    0.57415152,  349.        ,    1.88383579])
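
Note that the 'f' code above means float32 ('<f4'), which is why the last value prints as 1.88383579 rather than 1.88383585. A possible variant, sketched below with stand-in data and hypothetical field names "id" and "values", uses 'f8' to keep float64 precision and names the fields for readability.

```python
import numpy as np

# Stand-in data (hypothetical labels and values).
r1 = np.array(["id-0", "id-1", "id-2"])
r2 = np.arange(18, dtype=np.float64).reshape(3, 6) / 7

# Hypothetical field names 'id' and 'values'; 'f8' keeps float64 precision.
dt = np.dtype([("id", "U36"), ("values", "f8", (6,))])
res = np.zeros(r1.shape[0], dtype=dt)
res["id"] = r1
res["values"] = r2

# Each field can be used as an ordinary array of its own dtype.
print(res["values"].mean(axis=1))
print(res[res["id"] == "id-1"]["values"])
```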

We could also make a (10, 7) array with dtype=object. But most array operations won't work with such a mix of strings and floats, and the ones that do work are slower.
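
A rough sketch of the object-dtype option, again with small stand-in data rather than the arrays from the question: the array is allocated empty and the string and float columns are filled separately.

```python
import numpy as np

# Stand-in data (hypothetical labels and values).
r1 = np.array(["id-0", "id-1", "id-2"])
r2 = np.arange(6, dtype=float).reshape(3, 2)

# Allocate an object array and fill the columns separately.
mixed = np.empty((3, 3), dtype=object)
mixed[:, 0] = r1
mixed[:, 1:] = r2

print(type(mixed[0, 0]), type(mixed[0, 1]))

# Numeric operations need an explicit conversion back to a float dtype.
col_sums = mixed[:, 1:].astype(float).sum(axis=0)
print(col_sums)
```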

Why do you want to concatenate these arrays? What do you intend to do with the result? The dtype mismatch is more serious than the shape mismatch.
