Combine two rows into one in R
I have run into another challenge: merging two rows into one based on an identifier column.
My dataset looks like this:
var<-c("round","round","round","hhid","hhid","chid","chid","sex")
dfile<-c("df1","df2","df3","df1","df2","df1","df2","df1")
uniquevar<-c("df1::round","df2::round","df3::round", "df1::hhid","df2::hhid","df1::chid","df2::chid","df1::sex")
flag<-c("dup","dup","dup","dup","dup","dup","dup","NA")
df<-data.frame(var, dfile,flag)
Here is what I am trying to do. Ideally, the result would look like this:
var    dfile              uniquevar                             flag
round  df1 | df2 | df3    df1::round | df2::round | df3::round  dup
hhid   df1 | df2          df1::hhid | df2::hhid                 dup
chid   df1 | df2          df1::chid | df2::chid                 dup
sex    df1                                                      NA
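At the moment I can only do this by hand in Excel, which is far too time-consuming. I would be grateful if someone could show me how to do this in R; given that the dataset contains more than 600,000 observations, that would be much faster.
Many thanks!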
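You can paste the cells together after grouping with group_by(var). Use sep = "::" to set the separator between the different columns, and collapse = " | " to set the separator between rows. You can do this inside summarize from the dplyr package.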
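To see the difference between sep and collapse before applying them per group, here is a minimal sketch on two toy vectors. The names per_row and one_string are only illustrative and are not part of the original question.

```r
# Two rows that share the same identifier
dfile <- c("df1", "df2")
var   <- c("hhid", "hhid")

# sep joins the vectors element-wise: one string per row
per_row <- paste(dfile, var, sep = "::")

# collapse additionally joins those per-row strings into a single string
one_string <- paste(dfile, var, sep = "::", collapse = " | ")

print(per_row)
print(one_string)
```

Without collapse the result keeps one element per row; with collapse it becomes a single string, which is what a one-row-per-group summary needs.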
library(dplyr)

df %>%
  group_by(var) %>%
  summarize(uniquevar = ifelse(all(flag == "dup"),
                               paste(dfile, var, sep = "::", collapse = " | "),
                               ""),
            dfile = paste(dfile, collapse = " | "),
            dup = flag[1]) %>%
  select(var, dfile, uniquevar, dup)
#> # A tibble: 4 x 4
#>   var   dfile           uniquevar                              dup
#>   <chr> <chr>           <chr>                                  <chr>
#> 1 chid  df1 | df2       "df1::chid | df2::chid"                dup
#> 2 hhid  df1 | df2       "df1::hhid | df2::hhid"                dup
#> 3 round df1 | df2 | df3 "df1::round | df2::round | df3::round" dup
#> 4 sex   df1             ""                                     NA
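If you would rather not depend on dplyr, the same grouping idea can be sketched in base R with tapply. This is only a rough alternative, not the method above: the helper names grp, joined, all_dup and res are made up for illustration, the data frame is rebuilt here so the snippet runs on its own, and I have not timed it on 600,000 rows.

```r
# Rebuild the example data so the snippet is self-contained
var   <- c("round", "round", "round", "hhid", "hhid", "chid", "chid", "sex")
dfile <- c("df1", "df2", "df3", "df1", "df2", "df1", "df2", "df1")
flag  <- c("dup", "dup", "dup", "dup", "dup", "dup", "dup", "NA")
df <- data.frame(var, dfile, flag, stringsAsFactors = FALSE)

# Group by var, keeping the order in which each value first appears
grp <- factor(df$var, levels = unique(df$var))

# Build "dfile::var" per row, then collapse within each group
joined  <- as.vector(tapply(paste(df$dfile, df$var, sep = "::"), grp,
                            paste, collapse = " | "))
# TRUE only for groups in which every row is flagged "dup"
all_dup <- as.vector(tapply(df$flag %in% "dup", grp, all))

res <- data.frame(
  var       = levels(grp),
  dfile     = as.vector(tapply(df$dfile, grp, paste, collapse = " | ")),
  uniquevar = ifelse(all_dup, joined, ""),  # blank when the group is not all "dup"
  flag      = as.vector(tapply(df$flag, grp, function(x) x[1])),
  stringsAsFactors = FALSE
)
print(res)
```

Unlike the dplyr output, the rows here follow the order in which each var first appears (round, hhid, chid, sex), and the last column keeps the name flag instead of dup.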