Create a new column using dplyr based on string values in all other columns in a data frame in R
I have a data frame, my_df:
my_df <- structure(list(C1 = c("A", "X", "X", "A", "A"), F2 = c("A", "A",
"A", "A", "A"), T3 = c("A", "A", "X", "X", "A"), S4 = c("A",
"A", "A", "A", "X"), B5 = c("A", "A", "A", "A", "A")), class = "data.frame", row.names = c("ID1",
"ID2", "ID3", "ID4", "ID5"))
> my_df
    C1 F2 T3 S4 B5
ID1  A  A  A  A  A
ID2  X  A  A  A  A
ID3  X  A  X  A  A
ID4  A  A  X  A  A
ID5  A  A  A  X  A
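I want to create a new column, new_col, that says "same" if the values in all the other columns of a row are identical, and "diff" otherwise. That is, the resulting data frame would look like this: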
> my_df
    C1 F2 T3 S4 B5 new_col
ID1  A  A  A  A  A    same
ID2  X  A  A  A  A    diff
ID3  X  A  X  A  A    diff
ID4  A  A  X  A  A    diff
ID5  A  A  A  X  A    diff
What is the best way to achieve this using dplyr?
library(tidyverse)
my_df <- structure(list(C1 = c("A", "X", "X", "A", "A"),
                        F2 = c("A", "A", "A", "A", "A"),
                        T3 = c("A", "A", "X", "X", "A"),
                        S4 = c("A", "A", "A", "A", "X"),
                        B5 = c("A", "A", "A", "A", "A")),
                   class = "data.frame",
                   row.names = c("ID1", "ID2", "ID3", "ID4", "ID5"))
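my_df %>%
  rowwise() %>%
  # c_across(everything()) collects the current row's values into one vector;
  # the columns are named explicitly because newer dplyr versions warn
  # when c_across() is called without a column selection
  mutate(new_col = if_else(
    length(unique(c_across(everything()))) == 1,
    "same",
    "diff"
  ))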
#> # A tibble: 5 × 6
#> # Rowwise:
#>   C1    F2    T3    S4    B5    new_col
#>   <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 A     A     A     A     A     same
#> 2 X     A     A     A     A     diff
#> 3 X     A     X     A     A     diff
#> 4 A     A     X     A     A     diff
#> 5 A     A     A     X     A     diff
There are several ways to do this. One is to check whether every value in a row equals the first value in that row:
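# base R (two alternatives; run either one on the original my_df)
my_df$new_col <- ifelse(rowSums(my_df == my_df[, 1]) == ncol(my_df), "same", "diff")
# row-wise equivalent: compare each value in a row with that row's first value
# (sapply(my_df, identical, my_df[, 1]) would compare whole columns, not rows)
my_df$new_col <- ifelse(apply(my_df, 1, function(x) all(x == x[1])), "same", "diff")
# dplyr
my_df %>%
  dplyr::mutate(new_col = ifelse(rowSums(. == .[, 1]) == ncol(.), "same", "diff"))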
    C1 F2 T3 S4 B5 new_col
ID1  A  A  A  A  A    same
ID2  X  A  A  A  A    diff
ID3  X  A  X  A  A    diff
ID4  A  A  X  A  A    diff
ID5  A  A  A  X  A    diff
You can also check whether the number of unique values in each row is 1:
apply(my_df, 1, function(x) length(unique(x)) == 1)
#apply(my_df, 1, function(x) dplyr::n_distinct(x) == 1)
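The apply() call above only returns a logical vector, so one more step is needed to get the "same"/"diff" column. A minimal base R sketch of that step, assuming the five original columns are the only ones to compare; `value_cols` and `all_same` are hypothetical names introduced here for illustration:

```r
# Rebuild the example data so the snippet runs on its own
my_df <- data.frame(
  C1 = c("A", "X", "X", "A", "A"),
  F2 = c("A", "A", "A", "A", "A"),
  T3 = c("A", "A", "X", "X", "A"),
  S4 = c("A", "A", "A", "A", "X"),
  B5 = c("A", "A", "A", "A", "A"),
  row.names = c("ID1", "ID2", "ID3", "ID4", "ID5")
)

# Hypothetical helper: fix the set of columns to compare up front,
# so that adding new_col later does not change the comparison
value_cols <- c("C1", "F2", "T3", "S4", "B5")

# TRUE for rows whose values collapse to a single unique value
all_same <- apply(my_df[value_cols], 1, function(x) length(unique(x)) == 1)

# Map the logical vector to labels; unname() drops the row names apply() attaches
my_df$new_col <- ifelse(unname(all_same), "same", "diff")
my_df
```

Fixing the column set first means the snippet can be re-run after new_col already exists without new_col itself being counted as one of the values.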
A data.table option using uniqueN:
library(data.table)
setDT(my_df)[, new_col := c("diff", "same")[(uniqueN(unlist(.SD)) == 1) + 1], 1:nrow(my_df)]
my_df
Output:
   C1 F2 T3 S4 B5 new_col
1:  A  A  A  A  A    same
2:  X  A  A  A  A    diff
3:  X  A  X  A  A    diff
4:  A  A  X  A  A    diff
5:  A  A  A  X  A    diff