Add Columns to an empty data frame in R
I have searched extensively but not found an answer to this question on Stack Overflow.
Let's say I have a data frame a.
I define:
a <- NULL
a <- as.data.frame(a)
If I then try to add a column to this data frame like so:
a$col1 <- c(1,2,3)
I get the following error:
Error in `$<-.data.frame`(`*tmp*`, "a", value = c(1, 2, 3)) :
replacement has 3 rows, data has 0
Why is the row dimension fixed but the column is not?
How do I change the number of rows in a data frame?
If I do this (putting the data into a list first and then converting to a data frame), it works fine:
a <- NULL
a$col1 <- c(1,2,3)
a <- as.data.frame(a)
The row dimension is not fixed, but data.frames are stored as a list of vectors that are constrained to have the same length. You cannot add col1 to a because col1 has three values (rows) and a has zero, thereby breaking the constraint. R does not by default auto-vivify values when you attempt to extend the dimension of a data.frame by adding a column that is longer than the data.frame. The reason that the second example works is that col1 is the only vector in the data.frame, so the data.frame is initialized with three rows.
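The equal-length constraint described above can be seen in a small sketch. The names a, b, res and col2 are only illustrative, and tryCatch is used here just to capture the error message instead of stopping the script:

```r
# An empty data frame has 0 rows and 0 columns
a <- data.frame()

# Adding a length-3 column fails because the existing row count is 0
res <- tryCatch({
  a$col1 <- c(1, 2, 3)
  "ok"
}, error = function(e) conditionMessage(e))
print(res)

# Building the data frame from the column itself fixes the row count at 3
b <- data.frame(col1 = c(1, 2, 3))

# Later columns must then supply 3 values as well
b$col2 <- c("x", "y", "z")
print(dim(b))
```

Once the first column sets the row count, every later column has to match it (or be a length that recycles evenly into it).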
If you want the data.frame to expand automatically, you can use the following function:
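cbind.all <- function(...) {
  # Coerce every argument to a matrix so that nrow() and ncol() are defined
  nm <- lapply(list(...), as.matrix)
  # Row count of the longest input
  n <- max(sapply(nm, nrow))
  # Pad each input with rows of NA up to n rows, then bind the columns
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}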
This will fill missing values with NA. You would use it like this:
cbind.all(df, a)
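As a rough illustration of what cbind.all returns, here is a self-contained sketch. The function is repeated so the snippet runs on its own, and the two input vectors are made up for the example. Because every input is coerced with as.matrix, the result is a matrix rather than a data.frame, so it is converted back at the end:

```r
cbind.all <- function(...) {
  nm <- lapply(list(...), as.matrix)
  n <- max(sapply(nm, nrow))
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}

# Two illustrative inputs of different lengths
long_col  <- c(1, 2, 3)
short_col <- c(10, 20)

m <- cbind.all(long_col, short_col)
print(m)                   # the shorter input is padded with NA in row 3

# Convert back if a data.frame is needed
padded <- as.data.frame(m)
print(class(padded))
```

Note that going through as.matrix puts all values into a single type, so mixing numeric and character inputs would turn everything into character.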
You could also do something like this. Here I read in data from multiple files, grab the column I want, and store it in the data frame. I check whether the data frame has anything in it, and if it doesn't, I create a new one rather than getting the error about a mismatched number of rows:
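readCounts <- data.frame()
for (f in names(files)) {
  # files is assumed to be a named character vector of file paths
  d <- read.table(files[f], header = TRUE, as.is = TRUE)
  # Keep only the rounded NumReads column, named after the file
  d2 <- round(data.frame(d$NumReads))
  colnames(d2) <- f
  if (ncol(readCounts) == 0) {
    # First file: start the data frame from this column
    readCounts <- d2
    rownames(readCounts) <- d$Name
  } else {
    readCounts <- cbind(readCounts, d2)
  }
}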
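A variation on the loop above, offered only as a sketch, is to collect one column per input in a list and build the data frame once at the end, which avoids the empty-data-frame check altogether. The tables list below is a hypothetical in-memory stand-in for what read.table would return for each file, and it assumes every input has the same rows in the same order:

```r
# Hypothetical stand-ins for the per-file tables
tables <- list(
  sampleA = data.frame(Name = c("g1", "g2"), NumReads = c(10.4, 3.6)),
  sampleB = data.frame(Name = c("g1", "g2"), NumReads = c(7.2, 0.2))
)

# One rounded column per input, kept in a named list
cols <- lapply(tables, function(d) round(d$NumReads))

# A named list of equal-length vectors converts directly to a data frame
readCounts <- as.data.frame(cols)
rownames(readCounts) <- tables[[1]]$Name
print(readCounts)
```

The list names become the column names, so no separate colnames step is needed.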
If you have an empty data frame, called for example df, in my opinion another quite simple solution is the following:
df[1, ] <- NA                # add a temporary new row of NA values
df[, 'new_column'] <- NA     # add the new column, called for example 'new_column'
df <- df[0, , drop = FALSE]  # delete the NA row; drop = FALSE keeps a one-column result as a data frame
I hope this helps.
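A possible shortcut, sketched here under the assumption that the data frame has zero rows, is to assign a zero-length vector directly, since its length already matches the row count. The column name id is only illustrative:

```r
# A data frame with one column and zero rows
df <- data.frame(id = integer(0))

# rep(NA_real_, nrow(df)) has length 0 here, so it matches the row count
df$new_column <- rep(NA_real_, nrow(df))

str(df)
```

Using a typed NA such as NA_real_ or NA_character_ fixes the column's type up front, instead of leaving it as logical.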