Summarise multiple variables to strings in dplyr
I want to summarise two variables into a single string. Let's say this is my first dataset:
#visit
id source1 source2
1 a t
2 c l
3 c z
1 b x
Second dataset:
#transaction
id transactions
1 1
3 2
1 2
I'd like to join these datasets and, at the same time, collapse the values into strings.
I can do this for one variable (let's say source1):
library(dplyr)
visit %>%
  left_join(transaction, by = "id") %>%
  group_by(id) %>%
  summarise(source = toString(unique(source1)), transactions = toString(unique(transactions)))
This gives me the following output:
id source transactions
1 a,b 1,2
2 c NA
3 c 2
But I want to summarise two variables, so my desired output would be something like this:
id source transactions
1 a,t > b,x 1,2
2 c,l NA
3 c,z 2
You can paste the two variables together, using both sep and collapse to combine them:
visit %>% left_join(transaction) %>%
group_by(id) %>%
summarise(source = paste(unique(source1), unique(source2), sep = ', ', collapse = ' > '),
transaction = na_if(toString(unique(na.omit(transactions))), ''))
## # A tibble: 3 × 3
## id source transaction
## <int> <chr> <chr>
## 1 1 a, t > b, x 1, 2
## 2 2 c, l <NA>
## 3 3 c, z 2
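For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two example tables and applies the same idea. The data frame construction and the name `result` are my own, not from the question. It also uses a slight variation, which is an assumption about what is wanted: it pastes each row's source1/source2 pair first and only then de-duplicates, because calling unique() on each column separately can misalign pairs when one column has repeated values and the other does not.

```r
library(dplyr)

# Rebuild the two example tables from the question (construction is assumed)
visit <- data.frame(
  id      = c(1L, 2L, 3L, 1L),
  source1 = c("a", "c", "c", "b"),
  source2 = c("t", "l", "z", "x")
)
transaction <- data.frame(
  id           = c(1L, 3L, 1L),
  transactions = c(1L, 2L, 2L)
)

# id 1 appears twice in both tables, so the join duplicates rows;
# recent dplyr versions may warn about a many-to-many relationship here
result <- visit %>%
  left_join(transaction, by = "id") %>%
  group_by(id) %>%
  summarise(
    # paste each row's pair first, then de-duplicate the pairs
    source = paste(unique(paste(source1, source2, sep = ", ")), collapse = " > "),
    # drop NAs before collapsing; an all-NA group gives "" and then a real NA
    transaction = na_if(toString(unique(na.omit(transactions))), "")
  )

print(result)
```

On this data the result matches the output shown above; the two approaches only differ when the values in source1 and source2 do not de-duplicate in step with each other.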
Beware, though: paste and toString stupidly coerce NAs to the string "NA". You may want to wrap the input in na.omit or use na_if on the result.
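A small sketch of that coercion and the two workarounds, using toy vectors of my own (`x` and `all_missing` are illustrative names, not part of the question's data):

```r
library(dplyr)  # only needed for na_if()

x <- c(1, NA)

with_na <- toString(x)                    # "1, NA": the NA becomes text
pasted  <- paste(NA, "a", sep = ", ")     # "NA, a": same coercion in paste()

without_na <- toString(na.omit(x))        # "1": NAs dropped before collapsing

# An all-NA input collapses to the empty string "" once the NAs are dropped,
# so na_if() is used to turn that empty string back into a real NA
all_missing <- toString(na.omit(c(NA, NA)))
restored    <- na_if(all_missing, "")
```

The order matters: na.omit goes on the input to toString, while na_if goes on its output.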