
Equivalent of adding a value in a new row/column to numpy that works like R's data.frame

In R I can do:

> y = c(2,3)
> x = c(4,5)
> z = data.frame(x,y)
> z[3,3]<-6
> z
   x  y V3
1  4  2 NA
2  5  3 NA
3 NA NA  6

R automatically fills the empty cells with NA. If I use numpy.insert from numpy, numpy raises an error by default:

import numpy

y = [2,3]
x = [4,5]
z = numpy.array([y, x])

z = numpy.insert(z, 3, 6, 3)

IndexError: axis 3 is out of bounds for an array of dimension 2

Is there a way to insert values in numpy in a way that works similarly to R?

In numpy you need to initialize an array with the appropriate size:

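z = numpy.empty((3, 3))   # the shape must be passed as a tuple
z.fill(numpy.nan)         # NaN plays the role of R's NA
z[:2, 0] = x
z[:2, 1] = y
z[2, 2] = 6               # numpy is 0-indexed, so R's z[3,3] is z[2, 2]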
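The preallocation idea above can be wrapped in a small helper so that an out-of-range assignment grows the array, roughly like R does. This is only a sketch: `set_with_growth` is a hypothetical name, not a numpy function, and it assumes a 2-D numeric array that may be converted to float (NaN has no integer representation).

```python
import numpy as np

def set_with_growth(arr, row, col, value):
    """Return a float copy of 2-D `arr`, enlarged if needed, with [row, col] = value.

    Hypothetical helper (not part of numpy); new cells are filled with NaN.
    """
    arr = np.asarray(arr, dtype=float)   # NaN requires a float dtype
    n_rows = max(arr.shape[0], row + 1)
    n_cols = max(arr.shape[1], col + 1)
    out = np.full((n_rows, n_cols), np.nan)
    out[:arr.shape[0], :arr.shape[1]] = arr   # copy the existing block
    out[row, col] = value
    return out

x = [4, 5]
y = [2, 3]
z = np.column_stack([x, y])       # columns x and y, like data.frame(x, y)
z = set_with_growth(z, 2, 2, 6)   # 0-based equivalent of R's z[3,3] <- 6
print(z)
```

Each call copies the whole array, so this is fine for occasional assignments but a poor fit for growing an array inside a loop.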

If you want a more R-like experience in Python, I strongly recommend pandas, a higher-level library built on numpy that can perform this kind of operation.

Looking at the raised error, it is possible to understand why it occurred: you are trying to insert values along an axis that does not exist in z.

You can fix it by doing:

import numpy as np
y = [2,3]
x = [4,5]
array = np.array([y, x])
z = np.insert(array, 1, [3,6], axis=1)  # inserts [3,6] as a new column before column index 1

The interface is quite different from R's. If you are using IPython, you can easily access the documentation for any numpy function, in this case np.insert, by doing:

help(np.insert)

which gives you the function signature, explains each parameter used to call it, and provides some examples.
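To make the axis argument concrete, here is a minimal sketch of np.insert on a 2-D array. The inserted values ([7, 8] and [3, 6]) and the variable names are illustrative only.

```python
import numpy as np

array = np.array([[2, 3], [4, 5]])
print(array.ndim)  # 2, so the only valid axes are 0 and 1

# Insert a row before row index 2, i.e. append it at the bottom
with_row = np.insert(array, 2, [7, 8], axis=0)

# Insert a column before column index 2, i.e. append it on the right
with_col = np.insert(array, 2, [3, 6], axis=1)

# An axis beyond ndim - 1 raises an error, as in the question
try:
    np.insert(array, 3, 6, axis=3)
except IndexError as exc:   # numpy's AxisError is a subclass of IndexError
    print("failed:", exc)
```

Note that np.insert returns a new array and never pads with missing values; it only accepts positions from 0 up to the current length along the chosen axis.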

Alternatively, you could do:

import numpy as np
x = [4,5]
y = [2,3]
array = np.array([y,x])
z = [3,6]
new_array = np.vstack([array.T, z]).T  # or, as below
# new_array = np.hstack([array, np.array(z)[:, np.newaxis]])
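As a quick sanity check, the sketch below runs both stacking variants and confirms they agree. It also compares them with np.column_stack, which the answer does not mention but which accepts a 1-D array as a column directly. Here z is created as an array rather than a list so that it can be indexed with np.newaxis.

```python
import numpy as np

array = np.array([[2, 3], [4, 5]])
z = np.array([3, 6])   # an array, so it can be reshaped into a column

via_vstack = np.vstack([array.T, z]).T
via_hstack = np.hstack([array, z[:, np.newaxis]])   # z as a (2, 1) column
via_column_stack = np.column_stack([array, z])      # treats 1-D z as a column

print(via_vstack)
print(np.array_equal(via_vstack, via_hstack))
print(np.array_equal(via_hstack, via_column_stack))
```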

Also, take a look at the pandas module. It provides an interface similar to what you asked for, implemented on top of numpy.

With pandas you could do something like:

import pandas as pd

data = {'y':[2,3], 'x':[4,5]}
dataframe = pd.DataFrame(data)
dataframe['z'] = [3,6]

which gives the following output (recent pandas versions keep the dict's insertion order, so the columns may print as y, x, z):

    x   y   z
0   4   2   3
1   5   3   6

numpy is more of a replacement for R's matrices, and not so much for its data frames. You should consider using Python's pandas library for this. For example:

In [1]: import pandas

In [2]: y = pandas.Series([2,3])

In [3]: x = pandas.Series([4,5])

In [4]: z = pandas.DataFrame([x,y])

In [5]: z
Out[5]: 
   0  1
0  4  5
1  2  3

In [19]: z.loc[3,3] = 6

In [20]: z
Out[20]: 
    0   1   3
0   4   5 NaN
1   2   3 NaN
3 NaN NaN   6
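To stay closer to the R session in the question, with x and y as named columns, one option is to enlarge the frame explicitly with DataFrame.reindex before assigning. This is a sketch rather than the only way to do it. The column name "V3" simply mirrors the name R generated, and the row labels are 0-based, so R's row 3 becomes label 2.

```python
import pandas as pd

z = pd.DataFrame({"x": [4, 5], "y": [2, 3]})

# Enlarge to 3 rows and 3 columns; cells that did not exist become NaN.
# "V3" is just the column name R produced in the question.
z = z.reindex(index=[0, 1, 2], columns=["x", "y", "V3"])
z.loc[2, "V3"] = 6   # row label 2 corresponds to R's row 3
print(z)
```

Because NaN is a floating-point value, the x and y columns are converted to floats by the reindex step.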
