How to add a new row with one value and all other NA values in R
I have a vector of sample IDs (called missing) that must be present in my data frame, because the function I apply to it fails without them, but they are currently absent from it.
For each element of missing, I want to append a row to the end of my data frame that contains the ID, with NA in every other column.
What I am currently trying, based on other Stack Overflow posts that only cover adding empty rows, is the following:
for (element in missing) {
df[nrow(df)+1,] <- NA
df[nrow(df),1] <- element
}
Is there a simpler and faster way to do this? The loop is already slow for 1,000 missing elements, and I may later have to handle many more.
Sample data:
samp <- data.frame(id = 1:10, val1 = 11:20, val2 = 21:30)
missing <- c(11, 13, 15)
Merge:
merge(samp, data.frame(id = missing), by = "id", all = TRUE)
#    id val1 val2
# 1   1   11   21
# 2   2   12   22
# 3   3   13   23
# 4   4   14   24
# 5   5   15   25
# 6   6   16   26
# 7   7   17   27
# 8   8   18   28
# 9   9   19   29
# 10 10   20   30
# 11 11   NA   NA
# 12 13   NA   NA
# 13 15   NA   NA
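One property of the merge approach that may matter in practice: because it is a full outer join on id, an ID that already exists in the data frame is matched rather than duplicated. Note also that merge's documentation says rows are sorted on the key column by default, so the new rows are not necessarily appended at the end. A small sketch, where ids_wanted is a made-up vector for illustration and not part of the question:

```r
samp <- data.frame(id = 1:10, val1 = 11:20, val2 = 21:30)

# Hypothetical input: 10 is already in samp, 11 and 13 are not.
ids_wanted <- c(10, 11, 13)

merged <- merge(samp, data.frame(id = ids_wanted), by = "id", all = TRUE)

nrow(merged)               # 12: the ten original rows plus ids 11 and 13
merged[merged$id == 10, ]  # the existing row for id 10, values intact
```

If the IDs are already known to be absent, as in the question, this makes no difference; it only matters when the ID vector might overlap with existing rows.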
Row-bind with an external package:
data.table::rbindlist(list(samp, data.frame(id = missing)), use.names = TRUE, fill = TRUE)
dplyr::bind_rows(samp, data.frame(id = missing))
Row-bind with base R, a little more work:
samp0 <- samp[rep(1, length(missing)),,drop = FALSE][NA,]
samp0$id <- missing
rownames(samp0) <- NULL
rbind(samp, samp0)
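If this is needed in more than one place, the base R row-bind idea could be wrapped in a small function. This is only a sketch: add_na_rows and its id_col argument are hypothetical names introduced here, and it builds the filler rows by indexing with NA_integer_ rather than with the two-step subsetting shown above.

```r
# Hypothetical helper: append one row per new ID, with NA in all other columns.
add_na_rows <- function(df, ids, id_col = "id") {
  # Indexing rows with NA_integer_ yields all-NA rows that keep each column's type.
  filler <- df[rep(NA_integer_, length(ids)), , drop = FALSE]
  filler[[id_col]] <- ids
  out <- rbind(df, filler)
  rownames(out) <- NULL  # reset row names to 1..n
  out
}

samp <- data.frame(id = 1:10, val1 = 11:20, val2 = 21:30)
missing <- c(11, 13, 15)
result <- add_na_rows(samp, missing)
tail(result, 3)
```

Because all filler rows are created in one step and bound once, the data frame is not regrown on every iteration the way it is in the loop from the question.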
1) Using the built-in anscombe data frame, this inserts two rows, putting -1 and -3 in the x1 column.
library(tibble)  # add_row() is exported by tibble (and re-exported by dplyr)
new <- c(-1, -3)
add_row(anscombe, x1 = new)
giving:
   x1 x2 x3 x4    y1   y2    y3    y4
1  10 10 10  8  8.04 9.14  7.46  6.58
2   8  8  8  8  6.95 8.14  6.77  5.76
3  13 13 13  8  7.58 8.74 12.74  7.71
4   9  9  9  8  8.81 8.77  7.11  8.84
5  11 11 11  8  8.33 9.26  7.81  8.47
6  14 14 14  8  9.96 8.10  8.84  7.04
7   6  6  6  8  7.24 6.13  6.08  5.25
8   4  4  4 19  4.26 3.10  5.39 12.50
9  12 12 12  8 10.84 9.13  8.15  5.56
10  7  7  7  8  4.82 7.26  6.42  7.91
11  5  5  5  8  5.68 4.74  5.73  6.89
12 -1 NA NA NA    NA   NA    NA    NA
13 -3 NA NA NA    NA   NA    NA    NA
2) Here is a base R solution. new is from (1). (If overwriting anscombe is acceptable, omit the first line and replace anscombe2 with anscombe; typically, though, overwriting makes debugging harder.)
anscombe2 <- anscombe
anscombe2[nrow(anscombe2) + seq_along(new), "x1"] <- new
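Applied to the sample data from the question, approach (2) might look like the sketch below. The name samp2 is introduced here for the working copy; samp and missing are the objects defined in the sample data.

```r
samp <- data.frame(id = 1:10, val1 = 11:20, val2 = 21:30)
missing <- c(11, 13, 15)

# Work on a copy so the original data frame stays untouched.
samp2 <- samp

# Assigning to row indices past the last row extends the data frame;
# columns that are not assigned are filled with NA.
samp2[nrow(samp2) + seq_along(missing), "id"] <- missing

tail(samp2, 3)
```

All new rows are added in a single vectorized assignment, so no explicit loop over missing is needed.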
3) Using the dplyr package, we can use rows_insert. new is from (1).
library(dplyr)
rows_insert(anscombe, tibble(x1 = new))