R: Reshape data by combining common values of two variables
I want to reshape a data frame by combining two variables. For example, here is some sample data:
dat = data.frame(
var1 = c("a", "a", "a", "Emily", "b", "Bob", "c"),
var2 = c("Jhon", "Emily", "Julie", "Angela", "Bob", "Paul", "Paul"),
stringsAsFactors = FALSE
)
Expected output:
# var1 var2 var3 var4 var5
# 1 a Jhon Emily Julie Angela
# 2 b Bob Paul c <NA>
Using base R you can do:
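relation = function(dat){
  # Expand a set of names x with every name linked to it, until nothing new is added
  .relation = function(x){
    k = unique(sort(c(dat[dat[, 1] %in% x, 2], x, dat[dat[, 2] %in% x, 1])))
    if(setequal(x, k)) toString(k) else .relation(k)
  }
  # One comma-separated group string per distinct var1 value
  grp = sapply(unique(dat[, 1]), .relation)
  # Identical strings describe the same group; parse the unique ones into columns
  read.table(text = unique(grp), fill = TRUE, sep = ",")
}
relation(dat)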
V1 V2 V3 V4 V5
1 a Angela Emily Jhon Julie
2 b Bob c Paul
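The recursive helper above keeps expanding a set of names until it stops growing, which amounts to finding the connected groups of names. As a rough illustration of the same idea written as a loop, here is a minimal base R sketch. The function name `group_labels` and the label-propagation approach are assumptions made for this example, not part of the answer above.

```r
dat <- data.frame(
  var1 = c("a", "a", "a", "Emily", "b", "Bob", "c"),
  var2 = c("Jhon", "Emily", "Julie", "Angela", "Bob", "Paul", "Paul"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: give every name an integer label such that
# names linked directly or indirectly end up with the same label.
group_labels <- function(dat) {
  nodes <- unique(c(dat$var1, dat$var2))
  label <- setNames(seq_along(nodes), nodes)  # start with one label per name
  repeat {
    changed <- FALSE
    for (i in seq_len(nrow(dat))) {
      a <- dat$var1[i]
      b <- dat$var2[i]
      m <- min(label[[a]], label[[b]])
      # Both ends of a row take the smaller of their two labels
      if (label[[a]] != m || label[[b]] != m) {
        label[[a]] <- m
        label[[b]] <- m
        changed <- TRUE
      }
    }
    if (!changed) break  # a full pass without changes means we are done
  }
  label
}

label <- group_labels(dat)
# One character vector of names per group
groups <- split(names(label), label)
print(groups)
```

The order of names inside each group follows the order in which they first appear in the data, so it will differ from the sorted order that `relation()` returns.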
An igraph-based approach (note that this version of dat omits the c/Paul row):
dat = data.frame(var1 = c("a", "a", "a", "Emily", "b", "Bob"),
                 var2 = c("Jhon", "Emily", "Julie", "Angela", "Bob", "Paul"))
library(igraph)
g <- graph_from_data_frame(dat)
plot(g)
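# Roots have no incoming edges; leaves have no outgoing edges
starts <- V(g)[degree(g, mode = "in") == 0]
finals <- V(g)[degree(g, mode = "out") == 0]
# For each root, collect the names found on every simple path to a leaf
res <- lapply(starts, function(x) {
  unique(names(unlist(all_simple_paths(g, from = x, to = finals, mode = "out"))))
})
res
# Pad each vector with NA to a common length, then bind the rows into a data frame
max_len <- max(sapply(res, length))
data.frame(do.call(rbind, lapply(res, function(x) c(x, rep(NA, max_len - length(x))))))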
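The last two lines above pad vectors of unequal length with NA before binding them into rows. A possible shorter way to write that step in base R uses the replacement function `length<-`. The `res` list below is a hand-written stand-in (an assumption for illustration), not the actual igraph output.

```r
# Stand-in for `res`: a hypothetical list of name vectors of unequal length
res <- list(
  a = c("a", "Jhon", "Emily", "Angela", "Julie"),
  b = c("b", "Bob", "Paul")
)

max_len <- max(lengths(res))
# `length<-` extends a vector to the requested length, filling with NA
padded <- lapply(res, `length<-`, max_len)
# Bind the padded vectors as rows; the list names become row names
out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
print(out)
```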
I have produced a solution which first "cleans" the data structure and then reshapes it using dcast.
library(data.table)
dt.dat <- data.table(dat)
# Clean the dataset: attach people who are linked to a group only through another name
linked <- merge(dt.dat, dt.dat, by.x = "var2", by.y = "var1")
dt.dat.complete <- rbindlist(list(
  dt.dat[!(var1 %in% linked[, var2]), ],
  linked[, .(var1, var2.y)]
))
# Add a sequence number within each group for the column names
dt.dat.complete[, seq := seq_len(.N), by = var1]
dcast.data.table(dt.dat.complete, var1 ~ paste0("col", seq) + seq, fun.aggregate = NULL, value.var = "var2")
var1 col1_1 col2_2 col3_3 col4_4
1: a Jhon Emily Julie Angela
2: b Bob Paul <NA> <NA>
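For readers without data.table, the numbering and long-to-wide step above could be approximated in base R with `ave()` and `reshape()`. This is only a sketch: the `long` data frame is a hand-written stand-in for the cleaned table (an assumption for illustration), and the resulting column names differ from the dcast output.

```r
# Stand-in for the cleaned long table: one row per (group, member) pair
long <- data.frame(
  var1 = c("a", "a", "a", "b", "a", "b"),
  var2 = c("Jhon", "Emily", "Julie", "Bob", "Angela", "Paul"),
  stringsAsFactors = FALSE
)

# Position of each row within its var1 group (1, 2, 3, ...)
long$seq <- ave(seq_along(long$var1), long$var1, FUN = seq_along)

# Spread var2 into one column per position; missing positions become NA
wide <- reshape(long, idvar = "var1", timevar = "seq", direction = "wide")
print(wide)
```

With the default separator, the value columns come out as `var2.1`, `var2.2`, and so on.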