Separate data by unique column values

Question

Edit to make the problem clearer: I start with a messy CSV file, so I need to identify which values are the ID and which are the time variable, or, put differently, assign ID and time to the data in the variable columns. This question has now been answered below. Here is my data:

col1<-c("ID", "Date","var1","var2","ID","Date","var1","var2","ID","Date","var1","var2")
col2<-c("1","21-11-2015 14:20", "4.8","3.8", "1","21-11-2015 15:30", "3.5","5.9","2","21-11-2015 14:20","3.0","6.7")
df<-cbind(col1,col2)

I tried dcast() with no luck:

dcast(ID+Date~var1+var2, data = df, value.var = col1 )

I would like the output to be a true long format like this:

ID<-c(1,1,2)
Date<-c("21-11-2015 14:20","21-11-2015 15:30","21-11-2015 14:20")
var1<-c("4.8","3.5","3.0")
var2<-c("3.8","5.9","6.7")
df.clean<-cbind(ID,Date, var1,var2)

I appreciate your help.

Answer 1

I don't think this is a reshape question. You have the values in one column and the names in the other, so the values can be gathered into a matrix and given names using setNames:

with(df, setNames(data.frame(matrix(col2,
          ncol = length(unique(col1)), byrow = TRUE)), unique(col1)))

#  ID             Date var1 var2
#1  1 21-11-2015 14:20  4.8  3.8
#2  1 21-11-2015 15:30  3.5  5.9
#3  2 21-11-2015 14:20  3.0  6.7

Data

col1<-c("ID", "Date","var1","var2","ID","Date","var1","var2","ID",
        "Date","var1","var2")
col2<-c("1","21-11-2015 14:20", "4.8","3.8", "1","21-11-2015 15:30", 
         "3.5","5.9","2","21-11-2015 14:20","3.0","6.7")
df<- data.frame(col1,col2)
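
The one-liner above relies on the names in col1 repeating in the same order for every record, and it returns every column as text. Below is a minimal sketch that spells out those steps, checks the ordering assumption first, and then converts the column types. The `keys` and `wide` names, the date format string, and the UTC time zone are assumptions made for illustration, not part of the original answer.

```r
col1 <- c("ID", "Date", "var1", "var2", "ID", "Date", "var1", "var2",
          "ID", "Date", "var1", "var2")
col2 <- c("1", "21-11-2015 14:20", "4.8", "3.8", "1", "21-11-2015 15:30",
          "3.5", "5.9", "2", "21-11-2015 14:20", "3.0", "6.7")
df <- data.frame(col1, col2, stringsAsFactors = FALSE)

# Field names in order of first appearance: ID, Date, var1, var2
keys <- unique(df$col1)

# Guard: every record must list the same fields in the same order,
# otherwise filling a matrix row by row would misalign the values
stopifnot(nrow(df) %% length(keys) == 0,
          identical(df$col1, rep(keys, length.out = nrow(df))))

# One matrix row per record, one column per field
wide <- setNames(
  data.frame(matrix(df$col2, ncol = length(keys), byrow = TRUE),
             stringsAsFactors = FALSE),
  keys
)

# Everything is still character at this point; convert column by column
wide$ID   <- as.integer(wide$ID)
wide$Date <- as.POSIXct(wide$Date, format = "%d-%m-%Y %H:%M", tz = "UTC")  # time zone assumed
wide$var1 <- as.numeric(wide$var1)
wide$var2 <- as.numeric(wide$var2)

str(wide)
```

If the guard fails on a real file, the records are not uniformly laid out and a record index is needed before reshaping.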

Answer 2

This is not a reshape question. Here is some simple code showing how to do it manually:

Data

col1<-c("ID", 
        "Date","var1","var2","ID","Date","var1","var2","ID","Date","var1","var2")
col2<-c("1","21-11-2015 14:20", "4.8","3.8", "1","21-11-2015 15:30", 
        "3.5","5.9","2","21-11-2015 14:20","3.0","6.7")
df<-data.frame(col1,col2, stringsAsFactors = F)

Code

uniquevars<-unique(col1)
Res<-list()
for(i in 1:length(uniquevars)){
  Res[[uniquevars[i]]]<-df[,"col2"][which(df[,"col1"] ==uniquevars[i])]
}

dfRes <- data.frame(matrix(unlist(Res), ncol=length(Res)),stringsAsFactors=FALSE)
colnames(dfRes)<-uniquevars
dfRes
      ID             Date var1 var2
    1  1 21-11-2015 14:20  4.8  3.8
    2  1 21-11-2015 15:30  3.5  5.9
    3  2 21-11-2015 14:20  3.0  6.7

I hope this code helps you understand the steps to follow for what you want to do.

Cheers!
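
For comparison, the loop in this answer can be sketched with base R's split(), which collects the values for each name in one call. This is an alternative phrasing, not the answerer's code. It assumes, as the loop does, that every name occurs the same number of times; the `groups` name is introduced here for illustration.

```r
col1 <- c("ID", "Date", "var1", "var2", "ID", "Date", "var1", "var2",
          "ID", "Date", "var1", "var2")
col2 <- c("1", "21-11-2015 14:20", "4.8", "3.8", "1", "21-11-2015 15:30",
          "3.5", "5.9", "2", "21-11-2015 14:20", "3.0", "6.7")
df <- data.frame(col1, col2, stringsAsFactors = FALSE)

# split() orders groups by factor level, which is alphabetical by default,
# so fix the levels to the order in which the names first appear
groups <- split(df$col2, factor(df$col1, levels = unique(df$col1)))

# A named list of equal-length vectors converts directly to a data frame
dfRes <- data.frame(groups, stringsAsFactors = FALSE)
dfRes
```

Setting the factor levels explicitly is what keeps the columns in ID, Date, var1, var2 order rather than Date, ID, var1, var2.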

Answer 3

Here's a tidyverse approach:

library(tidyverse)

df %>%                                # your original (cbind) object
  data.frame() %>%                    # set as dataframe
  group_by(col1) %>%                  # for each col1 value
  mutate(index = row_number()) %>%    # set a row index (useful for reshaping)
  spread(col1, col2) %>%              # reshape
  select(-index)                      # remove index

# # A tibble: 3 x 4
#   Date             ID    var1  var2 
#   <fct>            <fct> <fct> <fct>
# 1 21-11-2015 14:20 1     4.8   3.8  
# 2 21-11-2015 15:30 1     3.5   5.9  
# 3 21-11-2015 14:20 2     3.0   6.7
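
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The sketch below is a variant of the same idea rather than the answerer's code: it builds the record index with cumsum(), under the assumption that every record starts with an "ID" row. The `record` and `wide` names are introduced here for illustration.

```r
library(dplyr)
library(tidyr)

col1 <- c("ID", "Date", "var1", "var2", "ID", "Date", "var1", "var2",
          "ID", "Date", "var1", "var2")
col2 <- c("1", "21-11-2015 14:20", "4.8", "3.8", "1", "21-11-2015 15:30",
          "3.5", "5.9", "2", "21-11-2015 14:20", "3.0", "6.7")
df <- data.frame(col1, col2, stringsAsFactors = FALSE)

wide <- df %>%
  # start a new record number at each "ID" row (assumed record boundary)
  mutate(record = cumsum(col1 == "ID")) %>%
  # one row per record, one column per distinct name in col1
  pivot_wider(id_cols = record, names_from = col1, values_from = col2) %>%
  # the index was only needed for reshaping
  select(-record)

wide
```

Unlike spread(), pivot_wider() keeps the new columns in the order the names first appear, so ID comes before Date here. The values are still character and would need converting afterwards.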

3 answers

Answer 1: 5, accepted, 2019-06-14 11:16:22

Answer 2: 1, 2019-06-14 11:35:28

Answer 3: 1, 2019-06-14 11:55:45
