Equivalent of adding a value in a new row/column to numpy that works like R's data.frame
In R I can do:
> y = c(2,3)
> x = c(4,5)
> z = data.frame(x,y)
> z[3,3]<-6
> z
   x  y V3
1  4  2 NA
2  5  3 NA
3 NA NA  6
R automatically fills the empty cells with NA. If I use numpy.insert, numpy raises an error by default:
import numpy
y = [2,3]
x = [4,5]
z = numpy.array([y, x])
z = numpy.insert(z, 3, 6, 3)
IndexError: axis 3 is out of bounds for an array of dimension 2
Is there a way to insert values in numpy that works similarly to R?
In numpy you need to initialize an array with the appropriate size:
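import numpy
x = [4, 5]
y = [2, 3]
z = numpy.empty((3, 3))  # the shape must be passed as a tuple
z.fill(numpy.nan)        # start with every cell set to NaN
z[:2, 0] = x
z[:2, 1] = y
z[2, 2] = 6              # numpy indices are 0-based, so R's z[3,3] is z[2, 2]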
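The pre-allocation approach above can be wrapped in a small helper that grows the array with NaN whenever the target cell lies outside the current shape, which is roughly what R does with NA. This is only a sketch: `set_with_padding` is a hypothetical name, not a numpy function, and it assumes a 2-D input and non-negative 0-based indices.

```python
import numpy as np

def set_with_padding(arr, row, col, value):
    """Return a float copy of `arr`, padded with NaN so that [row, col] exists.

    Hypothetical helper (not part of numpy); expects a 2-D array and
    non-negative, 0-based indices.
    """
    arr = np.asarray(arr, dtype=float)   # NaN requires a float dtype
    n_rows = max(arr.shape[0], row + 1)
    n_cols = max(arr.shape[1], col + 1)
    out = np.full((n_rows, n_cols), np.nan)
    out[:arr.shape[0], :arr.shape[1]] = arr   # copy the existing values
    out[row, col] = value
    return out

x = [4, 5]
y = [2, 3]
z = np.column_stack([x, y])        # x and y as columns, like data.frame(x, y)
z = set_with_padding(z, 2, 2, 6)   # R's z[3,3] <- 6, shifted to 0-based indices
print(z)
```

The result is a 3x3 float array in which every cell that was never assigned holds NaN. Note that the whole array is copied on each call, so this is not something to do in a tight loop.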
If you want a more R-like experience in Python, I strongly recommend pandas, a higher-level library built on numpy that can perform this kind of operation.
Looking at the raised error, it is possible to understand why it occurred: you are trying to insert values along an axis that does not exist in z.
You can fix it by doing:
import numpy as np
y = [2,3]
x = [4,5]
array = np.array([y, x])
z = np.insert(array, 1, [3,6], axis=1)
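To make the role of the `axis` argument more concrete, here is a small sketch that inserts along both axes of a 2-D array and then repeats the failing call from the question. The values `[7, 8]` and the variable names are made up for illustration.

```python
import numpy as np

array = np.array([[2, 3], [4, 5]])

# axis=1: insert a new column before column index 1
with_col = np.insert(array, 1, [3, 6], axis=1)

# axis=0: insert a new row before row index 2, i.e. after the last row
with_row = np.insert(array, 2, [7, 8], axis=0)

print(with_col)
print(with_row)

# A 2-D array only has axes 0 and 1, so axis=3 is rejected
try:
    np.insert(array, 3, 6, 3)
    axis_error = None
except IndexError as err:
    axis_error = err
print("error:", axis_error)
```

The last call fails before any value is inserted, because numpy checks the axis against `array.ndim` first.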
The interface is quite different from R's. If you are using IPython, you can easily access the documentation for any numpy function, in this case np.insert, by doing:
help(np.insert)
which gives you the function signature, explains each parameter used to call it, and provides some examples.
Alternatively, you could do:
import numpy as np
x = [4,5]
y = [2,3]
array = np.array([y,x])
z = np.array([3,6])  # an array, so that z[:, np.newaxis] below works
new_array = np.vstack([array.T, z]).T # or, as below
# new_array = np.hstack([array, z[:, np.newaxis]])
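As a quick check, the sketch below builds the same result with both stacking variants and with `np.column_stack`, which treats 1-D inputs as columns. The variable names `a`, `b` and `c` are only for illustration.

```python
import numpy as np

array = np.array([[2, 3], [4, 5]])
z = np.array([3, 6])

a = np.vstack([array.T, z]).T              # stack as a row of the transpose, then transpose back
b = np.hstack([array, z[:, np.newaxis]])   # turn z into a (2, 1) column and append it
c = np.column_stack([array, z])            # 1-D inputs are treated as columns

print(c)
print(np.array_equal(a, b) and np.array_equal(b, c))
```

All three append `z` as the last column, whereas `np.insert` lets you choose the position.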
Also, take a look at the pandas module. It provides an interface similar to what you asked for, implemented on top of numpy.
With pandas you could do something like:
import pandas as pd
data = {'y':[2,3], 'x':[4,5]}
dataframe = pd.DataFrame(data)
dataframe['z'] = [3,6]
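which gives the following output (older pandas versions sort the dict keys, so x is printed first; recent versions keep insertion order and print y first):
   x  y  z
0  4  2  3
1  5  3  6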
numpy is more of a replacement for R's matrices, and not so much for its data frames. You should consider using Python's pandas library for this. For example:
In [1]: import pandas
In [2]: y = pandas.Series([2,3])
In [3]: x = pandas.Series([4,5])
In [4]: z = pandas.DataFrame([x,y])
In [5]: z
Out[5]:
0 1
0 4 5
1 2 3
In [19]: z.loc[3,3] = 6
In [20]: z
Out[20]:
0 1 3
0 4 5 NaN
1 2 3 NaN
3 NaN NaN 6
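For a result closer to the R output in the question, the same `.loc` enlargement can be used with named columns. This is a sketch assuming a reasonably recent pandas; the column name `V3` is copied from R's output, and the row label `2` stands in for R's third row.

```python
import pandas as pd

# Columns x and y, like data.frame(x, y) in R
z = pd.DataFrame({"x": [4, 5], "y": [2, 3]})

# Assigning to a row label and a column label that do not exist yet
# enlarges the frame; the cells that were never set become NaN
z.loc[2, "V3"] = 6

print(z)
```

Because NaN is a floating-point value, the integer columns are converted to float once a missing value appears in them.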