python numpy: Change the column type of a numpy matrix

I have a numpy matrix X, and I tried to change the datatype of column 1 using the code below:

X[:, 1].astype('str')
print(type(X[0, 1]))

but I got the following result:

<type 'numpy.float64'>

Does anyone know why the type was not changed to str? And what is the correct way to change the column type of X? Thanks!

A simple example explains it better.

>>> import numpy as np
>>> a = np.array([[1,2,3],[4,5,6]])
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> a[:,1]
array([2, 5])
>>> a[:,1].astype('str') # This generates copy and then cast.
array(['2', '5'], dtype='<U21')
>>> a                    # So the original array did not change.
array([[1, 2, 3],
       [4, 5, 6]])
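
As a small sketch of the same point: `astype` returns a new array, so the converted column only exists if the result is bound to a name. The variable name `col` is made up for this illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# astype returns a new array; keep the result explicitly.
col = a[:, 1].astype(str)

print(col.dtype.kind)  # 'U': the copy holds unicode strings
print(a.dtype.kind)    # 'i': the original array is still integer
```

The original `a` keeps its integer dtype; only `col` holds strings.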

A clearer and more direct answer: the type was not changed to str because a NumPy array has a single data type shared by all of its elements. The correct way to change the column type of X would be to use structured arrays or one of the solutions from this question.
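
A minimal sketch of the structured-array route, assuming a two-column float matrix like the one in the question. The field names `c0` and `c1` and the string width `U32` are choices made for this example, not something the question specifies.

```python
import numpy as np

X = np.array([[1.1, 2.2], [3.3, 4.4]])

# One field per column; "c0"/"c1" are hypothetical names for this example.
dt = np.dtype([("c0", np.float64), ("c1", "U32")])

S = np.empty(X.shape[0], dtype=dt)
S["c0"] = X[:, 0]              # stays float64
S["c1"] = X[:, 1].astype(str)  # stored as fixed-width unicode strings

print(S["c0"].dtype)
print(S["c1"].dtype)
```

Note that a structured array is one-dimensional with named fields, so columns are accessed as `S["c1"]` rather than `S[:, 1]`.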

I had the same problem, and I didn't want to use structured arrays. If it suits your task, one option is to use pandas: wanting to change just one column often means that your data is tabular, and with a DataFrame you can easily change the data type of a single column. Another compromise is to make a copy of the column and use it separately from the original array.

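>>> import numpy as np
>>> import pandas as pd
>>> x = np.ones((3, 3), dtype=np.float64)  # np.float was removed in NumPy 1.24
>>> x
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
>>> x[:, 1] = x[:, 1].astype(np.int64)     # cast back to float64 on assignment
>>> type(x[:, 1][0])
<class 'numpy.float64'>
>>> x_pd = pd.DataFrame(x)
>>> x_pd[1] = x_pd[1].astype(np.int16)
>>> type(x_pd[1][0])
<class 'numpy.int16'>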
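
For the pandas option, one way to see that each column carries its own dtype is to pass a column-to-dtype mapping to `DataFrame.astype` and inspect `dtypes`. This is a sketch assuming pandas is installed and that the default integer column labels are used.

```python
import numpy as np
import pandas as pd

x = np.ones((3, 3), dtype=np.float64)

# Convert only column 1; the other columns keep float64.
x_pd = pd.DataFrame(x).astype({1: "int16"})

print(x_pd.dtypes)
```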

Let me answer the second question, since I ran into the same problem.

As dinarkino mentioned, just assigning the converted column back won't work.

>>> import numpy as np
>>> X = np.array([[1.1,2.2],[3.3,4.4]])
>>> print(X[:,1].dtype)
float64

>>> X[:,1] = X[:,1].astype('str')  # strings are cast back to float64 on assignment
>>> print(X[:,1].dtype)
float64

So my approach is to cast the whole matrix to dtype 'object' first, then assign the str-converted column back.

>>> X = X.astype('object')
>>> print(type(X[0,1]))
<class 'float'>

>>> X[:,1] = X[:,1].astype('str')
>>> print(type(X[0,1]))
<class 'str'>
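
The two steps above can be wrapped in a small helper. This is only a sketch; the function name `column_to_str` is made up for this example. It returns a new object-dtype array and leaves the input untouched.

```python
import numpy as np

def column_to_str(arr, col):
    """Return an object-dtype copy of `arr` with column `col` converted to str."""
    out = arr.astype(object)               # copy; elements become Python objects
    out[:, col] = arr[:, col].astype(str)  # object arrays can hold str values
    return out

X = np.array([[1.1, 2.2], [3.3, 4.4]])
Y = column_to_str(X, 1)

print(type(Y[0, 0]), type(Y[0, 1]))
```

Keep in mind that an object array stores ordinary Python objects, so it gives up the compact storage and fast vectorized arithmetic of a numeric dtype.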
