R - New column based on present values from other columns
I would like to create a data frame column based on whether any of the other columns in the same row contain a present (non-NA) value.
Example: column c below was created depending on whether there are any present values in the rest of the row.
age bmi hyp chl c
1 1 NA NA NA NA
2 2 22.7 1 187 1
3 1 NA 1 187 1
4 3 NA NA NA NA
5 1 20.4 1 113 1
6 3 NA NA 184 1
7 1 22.5 1 118 1
8 1 30.1 1 187 1
9 2 22.0 1 238 1
10 2 NA NA NA NA
11 1 NA NA NA NA
12 2 NA NA NA NA
13 3 21.7 1 206 1
14 2 28.7 2 204 1
15 1 29.6 1 NA 1
16 1 NA NA NA NA
17 3 27.2 2 284 1
18 2 26.3 2 199 1
19 1 35.3 1 218 1
20 3 25.5 2 NA 1
21 1 NA NA NA NA
22 1 33.2 1 229 1
23 1 27.5 1 131 1
24 3 24.9 1 NA 1
25 2 27.4 1 186 1
Column c was created using the following bit of code:
df <- transform(df, c=ifelse(!(is.na(bmi)) | !(is.na(hyp)) | !(is.na(chl)),1,NA))
My question is: how can I create a function that does the above without specifying the columns? I.e., if I have a dataset with 45 columns, I don't want to name all of them in the ifelse statement.
Many thanks in advance.
We can use rowSums on a logical matrix and then convert the result to a vector of NA and 1. Here df[-1] drops the first column (age), so only the remaining columns are checked.
df$c <- NA^!rowSums(!is.na(df[-1]))
df$c
#[1] NA 1 1 NA 1 1 1 1 1 NA NA NA 1 1 1 NA 1 1 1 1 NA 1 1 1 1
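Since the question asks for a reusable function, the one-liner above can be unpacked step by step. This is only a minimal sketch: the function name flag_any_present and its exclude argument are hypothetical and not part of the original answer, and the small data frame repeats the first four rows of the example data.

```r
# Hypothetical helper: returns 1 for rows with at least one non-missing value
# in the checked columns, and NA otherwise.
flag_any_present <- function(data, exclude = character(0)) {
  # Keep only the columns that should be checked
  checked <- data[setdiff(names(data), exclude)]
  # Count the non-missing values in each row
  n_present <- rowSums(!is.na(checked))
  # (n_present == 0) is TRUE for all-missing rows; NA^TRUE is NA, NA^FALSE is 1
  NA^(n_present == 0)
}

# First four rows of the example data
df <- data.frame(age = c(1, 2, 1, 3),
                 bmi = c(NA, 22.7, NA, NA),
                 hyp = c(NA, 1, 1, NA),
                 chl = c(NA, 187, 187, NA))

df$c <- flag_any_present(df, exclude = "age")
df
```

Passing the columns to skip by name, rather than by position as in df[-1], keeps the call readable when the data set has many columns.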
We can also use the coalesce function from the dplyr package.
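library(dplyr)

dt2 <- dt %>%
  # put all columns on a common numeric type (across() replaces the deprecated funs())
  mutate(across(everything(), as.numeric)) %>%
  # take the first non-missing value among bmi, hyp and chl in each row
  mutate(c = coalesce(bmi, hyp, chl)) %>%
  # collapse any present value to 1; all-missing rows stay NA
  mutate(c = ifelse(!is.na(c), 1, c))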
dt2
age bmi hyp chl c
1 1 NA NA NA NA
2 2 22.7 1 187 1
3 1 NA 1 187 1
4 3 NA NA NA NA
5 1 20.4 1 113 1
6 3 NA NA 184 1
7 1 22.5 1 118 1
8 1 30.1 1 187 1
9 2 22.0 1 238 1
10 2 NA NA NA NA
11 1 NA NA NA NA
12 2 NA NA NA NA
13 3 21.7 1 206 1
14 2 28.7 2 204 1
15 1 29.6 1 NA 1
16 1 NA NA NA NA
17 3 27.2 2 284 1
18 2 26.3 2 199 1
19 1 35.3 1 218 1
20 3 25.5 2 NA 1
21 1 NA NA NA NA
22 1 33.2 1 229 1
23 1 27.5 1 131 1
24 3 24.9 1 NA 1
25 2 27.4 1 186 1
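The coalesce approach above still names bmi, hyp and chl explicitly. As one possible way to avoid listing columns, the sketch below swaps in dplyr's if_any(), which is not used in the original answer, and assumes dplyr 1.0.4 or later. The data frame dt_small is a hypothetical excerpt holding the first four rows of the example data.

```r
library(dplyr)

# First four rows of the example data (hypothetical excerpt)
dt_small <- data.frame(age = c(1, 2, 1, 3),
                       bmi = c(NA, 22.7, NA, NA),
                       hyp = c(NA, 1, 1, NA),
                       chl = c(NA, 187, 187, NA))

dt_small <- dt_small %>%
  # if_any() is TRUE for a row when any column except age is non-missing
  mutate(c = ifelse(if_any(-age, ~ !is.na(.x)), 1, NA))

dt_small
```

The selection -age means every column other than age is checked, so the same call should work unchanged on a data set with 45 columns.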
DATA
dt <- read.table(text = " age bmi hyp chl
1 1 NA NA NA
2 2 22.7 1 187
3 1 NA 1 187
4 3 NA NA NA
5 1 20.4 1 113
6 3 NA NA 184
7 1 22.5 1 118
8 1 30.1 1 187
9 2 22.0 1 238
10 2 NA NA NA
11 1 NA NA NA
12 2 NA NA NA
13 3 21.7 1 206
14 2 28.7 2 204
15 1 29.6 1 NA
16 1 NA NA NA
17 3 27.2 2 284
18 2 26.3 2 199
19 1 35.3 1 218
20 3 25.5 2 NA
21 1 NA NA NA
22 1 33.2 1 229
23 1 27.5 1 131
24 3 24.9 1 NA
25 2 27.4 1 186",
header = TRUE, stringsAsFactors = FALSE)