
Reshape data frame in R

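I have a data frame that I need to reshape, turning the repeated values in a single column into a single row with several data columns. I know this should be simple, but I can't work out how to do it, or which of the many reshape/cast functions I should use.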

Part of my data looks like this:

  Source       ID info
1     In   842701    1
2    Out   842701    1
3     In 21846591    2
4    Out 21846591    2
5     In 22181760    3
6     In 39338740    4
7    Out     9428    5

I would like it to look like this:

        ID In Out info
1   842701  1   1    1
2 21846591  1   1    2
3 22181760  1   0    3
4 39338740  1   0    4
5     9428  0   1    5

And so on, while keeping all of the remaining columns (which are identical for a given entry).

I would really appreciate any help. TIA.

Here is one way, using reshape2:

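library(reshape2)
# indx = 1 marks each row as present; the factor levels keep the IDs
# in order of first appearance
res <- dcast(transform(df, indx=1, ID=factor(ID, levels=unique(ID))),
             ID~Source, value.var="indx", fill=0)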


res
#        ID In Out
#1   842701  1   1
#2 21846591  1   1
#3 22181760  1   0
#4 39338740  1   0
#5     9428  0   1
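The factor(ID, levels=unique(ID)) step is presumably there so that the rows come out in the order the IDs first appear rather than in sorted order. A small base R sketch of the difference in factor levels (the names default_levels and ordered_levels are only for illustration):

```r
ID <- c(842701L, 842701L, 21846591L, 21846591L, 22181760L,
        39338740L, 9428L)

# Default: levels are the sorted unique values, so 9428 comes first
default_levels <- levels(factor(ID))

# Explicit levels: order of first appearance, so 9428 stays last
ordered_levels <- levels(factor(ID, levels = unique(ID)))

print(default_levels)
print(ordered_levels)
```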

Or, in base R with table (the IDs become the row names of the result):

res1 <- as.data.frame.matrix(table(transform(df,
                 ID=factor(ID, levels=unique(ID)))[,2:1]))
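If you want ID as an ordinary column and the info column carried along, something like the following might work. This is only a sketch in base R; it assumes info is the same for every row of a given ID, and the names counts and res1b are made up for illustration.

```r
df1 <- data.frame(
  Source = c("In", "Out", "In", "Out", "In", "In", "Out"),
  ID     = c(842701L, 842701L, 21846591L, 21846591L, 22181760L,
             39338740L, 9428L),
  info   = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  stringsAsFactors = FALSE
)

# Count how often each Source occurs per ID; the factor levels keep
# the IDs in order of first appearance
ids    <- factor(df1$ID, levels = unique(df1$ID))
counts <- as.data.frame.matrix(table(ids, df1$Source))

# Move the IDs from the row names into a regular column
res1b <- data.frame(ID = as.integer(rownames(counts)), counts)
rownames(res1b) <- NULL

# Look up info from the first row of each ID
# (assumption: info is constant within an ID)
res1b$info <- df1$info[match(res1b$ID, df1$ID)]

print(res1b)
```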

Update, keeping the remaining columns:

dcast(transform(df1, indx=1, ID=factor(ID, levels=unique(ID))),
                                 ...~Source, value.var="indx", fill=0)

 #        ID info In Out
 #1   842701    1  1   1
 #2 21846591    2  1   1
 #3 22181760    3  1   0
 #4 39338740    4  1   0
 #5     9428    5  0   1
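Note that ... in the formula stands for every column not otherwise mentioned, so this gives one row per ID only if the remaining columns really are constant within each ID, as the question says they are. A quick way to check that in base R, as a sketch (n_info and constant_within_id are illustrative names):

```r
df1 <- data.frame(
  Source = c("In", "Out", "In", "Out", "In", "In", "Out"),
  ID     = c(842701L, 842701L, 21846591L, 21846591L, 22181760L,
             39338740L, 9428L),
  info   = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  stringsAsFactors = FALSE
)

# Number of distinct info values seen for each ID
n_info <- tapply(df1$info, df1$ID, function(x) length(unique(x)))

# TRUE when every ID maps to exactly one info value
constant_within_id <- all(n_info == 1)
print(constant_within_id)
```

If this were FALSE, an ID with two different info values would end up on two rows of the cast result.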

You can also use reshape from base R:

 res2 <- reshape(transform(df1, indx=1), idvar=c("ID", "info"),
                              timevar="Source", direction="wide")

 res2[,3:4][is.na(res2)[,3:4]] <- 0
 res2
 #        ID info indx.In indx.Out
 #1   842701    1       1        1
 #3 21846591    2       1        1
 #5 22181760    3       1        0
 #6 39338740    4       1        0
 #7     9428    5       0        1
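To get closer to the layout asked for in the question, the indx. prefix and the original row names can be cleaned up afterwards. A minimal sketch, assuming the same df1 as in the Data section (the name wide is just for illustration):

```r
df1 <- data.frame(
  Source = c("In", "Out", "In", "Out", "In", "In", "Out"),
  ID     = c(842701L, 842701L, 21846591L, 21846591L, 22181760L,
             39338740L, 9428L),
  info   = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  stringsAsFactors = FALSE
)

wide <- reshape(transform(df1, indx = 1), idvar = c("ID", "info"),
                timevar = "Source", direction = "wide")

# Drop the "indx." prefix that reshape() puts on the new columns
names(wide) <- sub("^indx\\.", "", names(wide))

# Missing combinations come back as NA; turn them into 0
flags <- c("In", "Out")
wide[flags][is.na(wide[flags])] <- 0

# Reset the row names and put the columns in the requested order
rownames(wide) <- NULL
wide <- wide[, c("ID", "In", "Out", "info")]

print(wide)
```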

Data

df <- structure(list(Source = c("In", "Out", "In", "Out", "In", "In", 
"Out"), ID = c(842701L, 842701L, 21846591L, 21846591L, 22181760L, 
39338740L, 9428L)), .Names = c("Source", "ID"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7"))


df1 <- structure(list(Source = c("In", "Out", "In", "Out", "In", "In", 
 "Out"), ID = c(842701L, 842701L, 21846591L, 21846591L, 22181760L, 
 39338740L, 9428L), info = c(1L, 1L, 2L, 2L, 3L, 4L, 5L)), .Names = c("Source", 
 "ID", "info"), class = "data.frame", row.names = c("1", "2", 
 "3", "4", "5", "6", "7"))

