Separating data by unique column values
Edit to clarify the problem: I start with a messy CSV file, so I need to identify which values are the ID and which are the time variable, or assign ID and time to the data in the variable columns.
This question has now been answered below.
Here is my data:
col1<-c("ID", "Date","var1","var2","ID","Date","var1","var2","ID","Date","var1","var2")
col2<-c("1","21-11-2015 14:20", "4.8","3.8", "1","21-11-2015 15:30", "3.5","5.9","2","21-11-2015 14:20","3.0","6.7")
df<-cbind(col1,col2)
I tried dcast() with no luck:
dcast(ID+Date~var1+var2, data = df, value.var = col1 )
I would like the output to have one row per record, with one column per variable, like this:
ID<-c(1,1,2)
Date<-c("21-11-2015 14:20","21-11-2015 15:30","21-11-2015 14:20")
var1<-c("4.8","3.5","3.0")
var2<-c("3.8","5.9","6.7")
df.clean<-cbind(ID,Date, var1,var2)
I appreciate your help.
I don't think this is a reshape question. You have the names in one column and the values in the other, so the values can be laid out row by row in a matrix and the columns named using setNames:
with(df, setNames(data.frame(matrix(col2,
ncol = length(unique(col1)), byrow = TRUE)), unique(col1)))
# ID Date var1 var2
#1 1 21-11-2015 14:20 4.8 3.8
#2 1 21-11-2015 15:30 3.5 5.9
#3 2 21-11-2015 14:20 3.0 6.7
data
col1<-c("ID", "Date","var1","var2","ID","Date","var1","var2","ID",
"Date","var1","var2")
col2<-c("1","21-11-2015 14:20", "4.8","3.8", "1","21-11-2015 15:30",
"3.5","5.9","2","21-11-2015 14:20","3.0","6.7")
df<- data.frame(col1,col2)
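The matrix approach above relies on every record having the same fields in the same order. As a hedged alternative, here is a minimal base R sketch that first derives an explicit record index and then widens with reshape(). It assumes each record begins with an "ID" row; the helper column name record is made up for illustration.

```r
col1 <- c("ID", "Date", "var1", "var2", "ID", "Date", "var1", "var2",
          "ID", "Date", "var1", "var2")
col2 <- c("1", "21-11-2015 14:20", "4.8", "3.8", "1", "21-11-2015 15:30",
          "3.5", "5.9", "2", "21-11-2015 14:20", "3.0", "6.7")
df <- data.frame(col1, col2, stringsAsFactors = FALSE)

# Assumption: every record starts with an "ID" row, so a running count
# of "ID" rows numbers the records 1, 2, 3, ...
df$record <- cumsum(df$col1 == "ID")

# Widen: one row per record, one column per distinct name in col1
wide <- reshape(df, idvar = "record", timevar = "col1", direction = "wide")

# reshape() prefixes the new columns with "col2."; strip that prefix
names(wide) <- sub("^col2\\.", "", names(wide))

# Keep only the variable columns, in order of first appearance
wide <- wide[, unique(df$col1)]
rownames(wide) <- NULL
wide
```

With an explicit record index, the fields within a record no longer have to appear in a fixed order, although each record still has to start with "ID" for the index to be right.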
This is not a reshape question. Here is some simple code showing how to do it manually:
Data
col1<-c("ID",
"Date","var1","var2","ID","Date","var1","var2","ID","Date","var1","var2")
col2<-c("1","21-11-2015 14:20", "4.8","3.8", "1","21-11-2015 15:30",
"3.5","5.9","2","21-11-2015 14:20","3.0","6.7")
df<-data.frame(col1,col2, stringsAsFactors = F)
Code
uniquevars<-unique(col1)
Res<-list()
for(i in 1:length(uniquevars)){
Res[[uniquevars[i]]]<-df[,"col2"][which(df[,"col1"] ==uniquevars[i])]
}
dfRes <- data.frame(matrix(unlist(Res), ncol=length(Res)),stringsAsFactors=FALSE)
colnames(dfRes)<-uniquevars
dfRes
ID Date var1 var2
1 1 21-11-2015 14:20 4.8 3.8
2 1 21-11-2015 15:30 3.5 5.9
3 2 21-11-2015 14:20 3.0 6.7
I hope this code helps you understand the steps involved in what you want to do.
Cheers!
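The loop above collects the values for each distinct name into a list. A shorter sketch of the same idea, assuming the same col1/col2 layout, uses split() with a factor whose levels follow the order of first appearance, so the columns come out as ID, Date, var1, var2:

```r
col1 <- c("ID", "Date", "var1", "var2", "ID", "Date", "var1", "var2",
          "ID", "Date", "var1", "var2")
col2 <- c("1", "21-11-2015 14:20", "4.8", "3.8", "1", "21-11-2015 15:30",
          "3.5", "5.9", "2", "21-11-2015 14:20", "3.0", "6.7")
df <- data.frame(col1, col2, stringsAsFactors = FALSE)

# Group the values by name; fixing the factor levels keeps the original
# column order instead of sorting the names alphabetically
groups <- split(df$col2, factor(df$col1, levels = unique(df$col1)))

# Each list element becomes one column; this only works when every
# name occurs the same number of times
dfRes <- as.data.frame(groups, stringsAsFactors = FALSE)
dfRes
```

Like the loop, this pairs values by position within each name, so it assumes no record is missing a field.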
Here's a tidyverse approach:
library(tidyverse)
df %>% # your original (cbind) object
data.frame() %>% # set as dataframe
group_by(col1) %>% # for each col1 value
mutate(index = row_number()) %>% # set a row index (useful for reshaping)
spread(col1, col2) %>% # reshape
select(-index) # remove index
# # A tibble: 3 x 4
# Date ID var1 var2
# <fct> <fct> <fct> <fct>
# 1 21-11-2015 14:20 1 4.8 3.8
# 2 21-11-2015 15:30 1 3.5 5.9
# 3 21-11-2015 14:20 2 3.0 6.7
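All of the results shown in these answers hold character or factor columns, because every value came from a single text column. If numeric values and a real time variable are needed, a minimal base R sketch of the conversion could look like this. The data frame wide is a hand-built stand-in for any of the results, and the UTC time zone is an assumption, since the data does not state one.

```r
# Stand-in for the widened result, with every column still character
wide <- data.frame(
  ID   = c("1", "1", "2"),
  Date = c("21-11-2015 14:20", "21-11-2015 15:30", "21-11-2015 14:20"),
  var1 = c("4.8", "3.5", "3.0"),
  var2 = c("3.8", "5.9", "6.7"),
  stringsAsFactors = FALSE
)

# as.character() first, so the same code also works on factor columns
# (as.numeric() applied directly to a factor returns the level codes)
wide$ID   <- as.integer(as.character(wide$ID))
wide$var1 <- as.numeric(as.character(wide$var1))
wide$var2 <- as.numeric(as.character(wide$var2))

# Parse day-month-year hour:minute; the time zone is assumed to be UTC
wide$Date <- as.POSIXct(as.character(wide$Date),
                        format = "%d-%m-%Y %H:%M", tz = "UTC")
str(wide)
```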