
Data grouping and sub-grouping by column variable in R

I am processing data in R on Windows 7.

The given data is:

  var1    var2   value

I need to group the data by var1 and then, within each var1, group by var2.

The output should be column vectors of the values that share the same var1 and var2. Here var1 and var2 act like keys.

For example:

   var1    var2   value
   1          56       649578   
   2          17       357835
   1          88       572397
   2          90       357289
   1          56       427352   
   2          17       498455
   1          88       354623
   2          90       678658

The result should be

   var1    var2   value
   1          56       649578   
   1          56       427352   
   1          88       354623
   1          88       572397
   2          17       357835
   2          17       498455
   2          90       357289
   2          90       678658
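The sorted table above can be produced in base R with order(). This is a minimal sketch that rebuilds the example data under the assumed name dat. Note that order() leaves ties in their original order, so within var1 = 1 and var2 = 88 the value 572397 comes before 354623, unlike the hand-written table.

```r
# Example data from the question (the name `dat` is an assumption).
dat <- data.frame(
  var1  = c(1, 2, 1, 2, 1, 2, 1, 2),
  var2  = c(56, 17, 88, 90, 56, 17, 88, 90),
  value = c(649578, 357835, 572397, 357289, 427352, 498455, 354623, 678658)
)

# Sort by var1 first, then by var2 within each var1.
sorted <- dat[order(dat$var1, dat$var2), ]
rownames(sorted) <- NULL  # drop the old row labels

print(sorted)
```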

I need to write the values to a CSV file like this:

For var1 = 1:

   649578   354623
   427352   572397

For var1 = 2:

  357835   357289
  498455   678658

I also need to write the full rows to a CSV file like this:

For var1 = 1:

   1          56       649578   
   1          56       427352   
   1          88       354623
   1          88       572397

For var1 = 2:

   2          17       357835
   2          17       498455
   2          90       357289
   2          90       678658
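The per-var1 row listing can be sketched with split() and write.csv() in base R. The names dat, groups and out_dir and the file names are assumptions for illustration. The files are written to a temporary directory.

```r
# Example data from the question (the name `dat` is an assumption).
dat <- data.frame(
  var1  = c(1, 2, 1, 2, 1, 2, 1, 2),
  var2  = c(56, 17, 88, 90, 56, 17, 88, 90),
  value = c(649578, 357835, 572397, 357289, 427352, 498455, 354623, 678658)
)

# Sort by var1, then var2, and split into one data frame per var1.
sorted <- dat[order(dat$var1, dat$var2), ]
groups <- split(sorted, sorted$var1)

# Write each group to its own CSV file.
out_dir <- tempdir()
for (key in names(groups)) {
  write.csv(groups[[key]],
            file.path(out_dir, paste0("var1_", key, ".csv")),
            row.names = FALSE)
}
```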

How can I do this?

I found some posts, but they are not directly applicable.


Update: How can I select and print the values associated with each unique var2?

Is there a dictionary data structure in R?
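R has no type called "dictionary", but a named list (or an environment, which is hashed) is commonly used as one. The sketch below uses split() to build a named list keyed by var2, and another keyed by the (var1, var2) pair. The names dat, by_var2, by_pair and dict are assumptions.

```r
# Example data from the question (the name `dat` is an assumption).
dat <- data.frame(
  var1  = c(1, 2, 1, 2, 1, 2, 1, 2),
  var2  = c(56, 17, 88, 90, 56, 17, 88, 90),
  value = c(649578, 357835, 572397, 357289, 427352, 498455, 354623, 678658)
)

# Named list keyed by var2: each element holds the values for one var2.
by_var2 <- split(dat$value, dat$var2)
for (key in names(by_var2)) {
  cat("var2 =", key, ":", by_var2[[key]], "\n")
}

# Keyed by both var1 and var2; the keys look like "1.56".
by_pair <- split(dat$value, list(dat$var1, dat$var2), drop = TRUE)
print(by_pair[["1.56"]])

# An environment behaves like a mutable hash map with string keys.
dict <- new.env(hash = TRUE)
assign("56", by_var2[["56"]], envir = dict)
print(get("56", envir = dict))
```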

I believe this is relatively close to what you are looking for, though not quite the same. It should be of some help.

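library(reshape2)  # dcast()
library(plyr)      # ddply()

# Example data from the question
dat <- data.frame(var1 = c(1, 2, 1, 2, 1, 2, 1, 2),
                  var2 = c(56, 17, 88, 90, 56, 17, 88, 90),
                  value = c(649578, 357835, 572397, 357289, 427352, 498455, 354623, 678658))

# Sort by var1, then by var2
dat <- dat[order(dat$var1, dat$var2), ]

# Number the values within each (var1, var2) group: 1, 2, ...
dat <- ddply(dat, .(var1, var2), summarize, seq1 = seq_along(value), value = value)

# Reshape to wide form: one row per (var1, var2), one column per sequence number
dat.new.new <- dcast(dat, var1 + var2 ~ seq1, value.var = "value")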

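The second assignment to dat, which uses order(), sorts the rows as you requested. The dat.new.new data frame has one row per (var1, var2) pair, with that pair's values spread across columns named 1 and 2, which is close to what you were looking for.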

bonus points for catching the KidCudi reference

