Double spread without hard coding
I am stuck here. I tried using spread twice from tidyr, and I tried joining. But none of these methods gives the right solution without some hard coding.
Is there any way to transform this data:
cat1 cat2 title
1 A G AB
2 B G BC
3 C B CD
4 D G DE
5 E H EF
6 F A FG
into this:
A B C D E F G H
AB 1 0 0 0 0 0 1 0
BC 0 1 0 0 0 0 1 0
CD 0 1 1 0 0 0 0 0
DE 0 0 0 1 0 0 1 0
EF 0 0 0 0 1 0 0 1
FG 1 0 0 0 0 1 0 0
Sample data:
df<-data.frame(cat1=LETTERS[1:6],
cat2=c('G','G','B','G','H','A'),
title=paste0(LETTERS[1:6],LETTERS[2:7]))
Since I usually get dplyr answers faster: Base R or tidyr-only solutions are also very welcome.
I don't know if this qualifies as not hard coding for the OP:
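library(magrittr)  # provides the %>% pipe

# Stack cat1 and cat2 into one column, flag each (title, value) pair with 1,
# then spread the values into columns, filling absent pairs with 0.
df %>%
  tidyr::gather(key = vars, value = values, cat1, cat2) %>%
  dplyr::mutate(vars = 1) %>%
  tidyr::spread(key = values, value = vars, fill = 0)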
# title A B C D E F G H
# 1 AB 1 0 0 0 0 0 1 0
# 2 BC 0 1 0 0 0 0 1 0
# 3 CD 0 1 1 0 0 0 0 0
# 4 DE 0 0 0 1 0 0 1 0
# 5 EF 0 0 0 0 1 0 0 1
# 6 FG 1 0 0 0 0 1 0 0
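The same gather-then-spread idea can also be sketched with tidyr's newer pivot_longer() and pivot_wider(). This is only a sketch and it assumes tidyr 1.1.0 or later (for names_sort and a scalar values_fill); the helper column names var, value and present are made up for illustration.

```r
library(tidyr)  # assumption: tidyr >= 1.1.0

df <- data.frame(cat1 = LETTERS[1:6],
                 cat2 = c("G", "G", "B", "G", "H", "A"),
                 title = paste0(LETTERS[1:6], LETTERS[2:7]),
                 stringsAsFactors = FALSE)

# Stack cat1 and cat2 into a single "value" column (one row per title/value pair)
long <- pivot_longer(df, cols = c(cat1, cat2),
                     names_to = "var", values_to = "value")

# Flag every existing (title, value) pair
long$present <- 1

# One column per distinct value; pairs that never occur are filled with 0
wide <- pivot_wider(long, id_cols = title,
                    names_from = value, values_from = present,
                    values_fill = 0, names_sort = TRUE)
print(wide)
```

No category name appears in the code, so new letters in cat1 or cat2 would simply become new columns.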
Just melt first, then cast:
library(reshape2)
library(magrittr)  # provides the %>% pipe
# Count how often each value occurs per title
melt(df, id = "title") %>% dcast(title ~ value, length)
title A B C D E F G H
1 AB 1 0 0 0 0 0 1 0
2 BC 0 1 0 0 0 0 1 0
3 CD 0 1 1 0 0 0 0 0
4 DE 0 0 0 1 0 0 1 0
5 EF 0 0 0 0 1 0 0 1
6 FG 1 0 0 0 0 1 0 0
melt puts all the values in a single column, which dcast then casts into one column per value.
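Since the question also welcomes base R solutions, the "stack everything into one column, then count" idea might be sketched without any packages. This is an illustration under the assumption that a contingency table is an acceptable result; the names long and res are made up here.

```r
df <- data.frame(cat1 = LETTERS[1:6],
                 cat2 = c("G", "G", "B", "G", "H", "A"),
                 title = paste0(LETTERS[1:6], LETTERS[2:7]),
                 stringsAsFactors = FALSE)

# Stack cat1 and cat2 into one column, repeating the titles alongside
long <- data.frame(title = rep(df$title, 2),
                   value = c(df$cat1, df$cat2))

# Cross-tabulate: one row per title, one column per distinct value
res <- table(long$title, long$value)
print(res)
```

If a plain data frame is preferred over a table object, as.data.frame.matrix(res) converts it while keeping the titles as row names.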