Join rows with the same name by removing NA in R
Hello, I need help joining rows that share the same name and removing the NA values. When a name has more than one value in the same column, either a new column with a numeric suffix should be created, or the values should be combined with a comma.
I have this example data frame:
name<-c("John","John","John","Luis","Luis")
may<-c("a",NA,NA,"a",NA)
june<-c(NA,"b",NA,NA,"a")
july<-c("d",NA,"c",NA,NA)
df<-data.frame(name,may,june,july)
which gives the following data frame:
name may june july
1 John a <NA> d
2 John <NA> b <NA>
3 John <NA> <NA> c
4 Luis a <NA> <NA>
5 Luis <NA> a <NA>
I expect a result like the following:
name may june july july.2
1 John a b c d
2 Luis a a <NA> <NA>
or like the following:
name may june july
1 John a b c,d
2 Luis a a <NA>
We can use summarize() to concatenate the strings that share the same "name".
Inside summarize(), if every value of a column within a group is NA, the result for that group is NA. Otherwise, the non-NA values are sorted and concatenated with a comma.
library(dplyr)

df %>%
  group_by(name) %>%
  # NA if the whole group is NA, else the sorted non-NA values joined by commas
  summarize(across(everything(), ~ ifelse(sum(is.na(.x)) == n(), NA, paste0(na.omit(sort(.x)), collapse = ","))))
# A tibble: 2 × 4
name may june july
<chr> <chr> <chr> <chr>
1 John a b c,d
2 Luis a a NA
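The question also asks for a variant with a suffixed column (july.2) instead of a comma-joined string. The answer above does not cover that case. What follows is only a rough sketch of one possible approach, assuming the tidyr package is available alongside dplyr. The helper column names month, value, idx and col are made up for illustration and are not part of the original post.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(
  name = c("John", "John", "John", "Luis", "Luis"),
  may  = c("a", NA, NA, "a", NA),
  june = c(NA, "b", NA, NA, "a"),
  july = c("d", NA, "c", NA, NA)
)

wide <- df %>%
  # Reshape to one row per (name, month, value) and drop the NA cells
  pivot_longer(-name, names_to = "month", values_to = "value",
               values_drop_na = TRUE) %>%
  # Keep the original column order, then sort values within each month
  arrange(name, match(month, names(df)), value) %>%
  group_by(name, month) %>%
  # Number the values that land in the same name/month cell
  mutate(idx = row_number()) %>%
  ungroup() %>%
  # The first value keeps the plain column name, later ones get ".2", ".3", ...
  mutate(col = ifelse(idx == 1, month, paste0(month, ".", idx))) %>%
  select(name, col, value) %>%
  # Spread back out; missing combinations are filled with NA
  pivot_wider(names_from = col, values_from = value)

print(wide)
```

With the example data this should give one row per name, with John's two july values split across july and july.2 and Luis left as NA in both. The suffixed columns are created only as far as the data requires, so the set of columns in the result depends on the input.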