Split a 2D NumPy array of strings on the "," character

Question

I have a 2D NumPy array of strings like: a = array([['1,2,3'], ['3,4,5']], dtype=object) and I would like to convert it into a 2D NumPy array like this: a = array([['1','2','3'], ['3','4','5']]). I would then also like to convert the strings to floats, so the final array would look like this: a = array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]). Any help is greatly appreciated.

Answer 1

Since it's an object array, we might as well iterate and use plain Python split:

In [118]: a = np.array([['1,2,3'], ['3,4,5']], dtype=object)
In [119]: a.shape
Out[119]: (2, 1)
In [120]: np.array([x.split(',') for x in a.ravel()])
Out[120]: 
array([['1', '2', '3'],
       ['3', '4', '5']], dtype='<U1')
In [122]: np.array([x.split(',') for x in a.ravel()],dtype=float)
Out[122]: 
array([[1., 2., 3.],
       [3., 4., 5.]])

I raveled it to simplify iteration. Also, the result doesn't need that second, size-1 dimension.
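
For use outside an interactive session, the same list-comprehension approach can be wrapped in a small function. This is only a sketch: the name split_to_float and the row-length check are my own additions, not part of the answer above, and it assumes every element is a Python str.

```python
import numpy as np

def split_to_float(arr, sep=","):
    """Split each string in `arr` on `sep` and return a 2D float array.

    Hypothetical helper for illustration; assumes every element is a str.
    """
    # ravel() drops the size-1 second dimension so we iterate over the strings.
    rows = [s.split(sep) for s in arr.ravel()]
    # Rows with different field counts cannot form a rectangular array,
    # so report that explicitly instead of relying on np.array's error.
    if len({len(r) for r in rows}) > 1:
        raise ValueError("rows have different numbers of fields")
    return np.array(rows, dtype=float)

a = np.array([['1,2,3'], ['3,4,5']], dtype=object)
result = split_to_float(a)
print(result)
```

The explicit check matters only when the input might be ragged; for data known to be rectangular, the one-liner in the session above is enough.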

There is an np.char function that applies split to the elements of an array, but the result is messier:

In [129]: a.astype(str)
Out[129]: 
array([['1,2,3'],
       ['3,4,5']], dtype='<U5')
In [130]: np.char.split(_, sep=',')
Out[130]: 
array([[list(['1', '2', '3'])],
       [list(['3', '4', '5'])]], dtype=object)
In [138]: np.stack(Out[130].ravel()).astype(float)
Out[138]: 
array([[1., 2., 3.],
       [3., 4., 5.]])

Another way:

In [132]: f = np.frompyfunc(lambda astr: np.array(astr.split(','),float),1,1)
In [133]: f(a)
Out[133]: 
array([[array([1., 2., 3.])],
       [array([3., 4., 5.])]], dtype=object)
In [136]: np.stack(_.ravel())
Out[136]: 
array([[1., 2., 3.],
       [3., 4., 5.]])
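
The frompyfunc version reads a little more easily with a named function in place of the lambda. A sketch, where parse_row and the other names are hypothetical:

```python
import numpy as np

def parse_row(text):
    # Turn one comma-separated string into a 1D float array.
    return np.array(text.split(','), dtype=float)

# frompyfunc(func, nin, nout) wraps a Python callable so it is applied
# element by element; the result is always an object array.
parse = np.frompyfunc(parse_row, 1, 1)

a = np.array([['1,2,3'], ['3,4,5']], dtype=object)
nested = parse(a)                   # object array with the same (2, 1) shape as `a`
stacked = np.stack(nested.ravel())  # join the per-row arrays into one 2D array
print(stacked)
```

As with np.char.split, the wrapped function returns an object array of per-row results, so the final np.stack is still needed.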

Answer 2

Iterate through the rows, use split(',') to split each row at the commas, and put the result in a new NumPy array with a numeric data type:

import numpy as np

a = np.array([['1,2,3'], ['3,4,5']])
b = np.array([x[0].split(',') for x in a], dtype=np.float32)
print(b)

#[[ 1.  2.  3.]
# [ 3.  4.  5.]]

Answer 3

I would like to propose this, if you don't mind having the result as a flat vector:

In [31]: np.array([["asa,asd"], ["dasd,asdaf,asfasf"]], dtype=object)
Out[31]: 
array([['asa,asd'],
       ['dasd,asdaf,asfasf']], dtype=object)
In [32]: np.concatenate(np.char.split(Out[31].astype(str), ",").ravel())
Out[32]: array(['asa', 'asd', 'dasd', 'asdaf', 'asfasf'], dtype='<U6')
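
Flattening is most useful when the rows have different numbers of fields, as in this example, because such rows cannot be stacked into a rectangular 2D array. If the row boundaries are needed later, one option (my addition, not part of the answer) is to record the field counts alongside the flat vector:

```python
import numpy as np

b = np.array([["asa,asd"], ["dasd,asdaf,asfasf"]], dtype=object)

# 1D object array holding one list of fields per original row.
parts = np.char.split(b.astype(str), ",").ravel()

flat = np.concatenate(parts)       # all fields in a single 1D array
counts = [len(p) for p in parts]   # number of fields in each original row

# The counts allow the flat vector to be cut back into per-row pieces.
rows = np.split(flat, np.cumsum(counts)[:-1])
print(flat)
print(counts)
```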
