
Formatting data in R categorical

Suppose we have the following data frame:

reason <- c("v","v","v","v","v","s","s","s","v","v","v","s","s")
location <- c("c","c","c","c","c","c","c","c","h","h","h","h","h")
zero_one <- c(1,1,0,1,1,1,1,0,1,0,0,1,0)  
df <- data.frame(reason, location, zero_one)

Is there an easy way to convert "df" to "DF", where "DF" has the following shape:

reason  location  #zeros  #ones  
     v         c       1      4  
     s         c       1      2  
     v         h       2      1  
     s         h       1      1

You could do this using dcast from the reshape2 package:

library(reshape2)
dcast(transform(df, zero_one= factor(zero_one, levels=0:1,
  labels=c('zeros', 'ones'))), ...~zero_one, value.var='zero_one', length)
#   reason location zeros ones
#1      s        c     1    2
#2      s        h     1    1
#3      v        c     1    4
#4      v        h     2    1

Or using data.table (a similar approach to @jalapic's):

library(data.table)
# setDT() converts df to a data.table by reference (df itself is modified)
setDT(df)[,list(zeros=sum(!zero_one), ones=sum(!!zero_one)),
            .(reason, location)][]
#   reason location zeros ones
#1:      v        c     1    4
#2:      s        c     1    2
#3:      v        h     2    1
#4:      s        h     1    1

Or in base R

 aggregate(cbind(zeros=!zero_one, ones=!!zero_one)~., df, FUN= sum)
 #  reason location zeros ones
 #1      s        c     1    2
 #2      v        c     1    4
 #3      s        h     1    1
 #4      v        h     2    1
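
The data.table and aggregate versions above both rely on logical negation of a 0/1 vector. As a minimal sketch of why that works (the variable `x` is just an illustrative name, holding the same values as `zero_one`): `!x` is TRUE wherever `x` is 0, `!!x` is TRUE wherever `x` is non-zero, and `sum()` counts the TRUE values.

```r
# Same values as zero_one in the question; `x` is an illustrative name
x <- c(1,1,0,1,1,1,1,0,1,0,0,1,0)

# Negating a numeric vector coerces it to logical: 0 -> TRUE, non-zero -> FALSE
is_zero <- !x

# Negating twice gives TRUE for every non-zero entry
is_one <- !!x

# sum() treats TRUE as 1 and FALSE as 0, so these are counts
n_zeros <- sum(is_zero)   # 5
n_ones  <- sum(is_one)    # 8
```

Note that this only counts "ones" correctly because the column contains nothing but 0 and 1; with other values, an explicit comparison such as `sum(x == 1)` would be safer.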

You can do it with dplyr very simply:

library(dplyr)
df %>% 
 group_by(reason,location) %>% 
 summarize(zeros = sum(zero_one==0), ones = sum(zero_one==1))

#  reason location zeros ones
#1      s        c     1    2
#2      s        h     1    1
#3      v        c     1    4
#4      v        h     2    1
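
To sanity-check any of the results above without extra packages, one option is a plain cross-tabulation in base R. This is only a sketch, not part of the original answers; `tab` is a name introduced here for illustration.

```r
reason <- c("v","v","v","v","v","s","s","s","v","v","v","s","s")
location <- c("c","c","c","c","c","c","c","c","h","h","h","h","h")
zero_one <- c(1,1,0,1,1,1,1,0,1,0,0,1,0)
df <- data.frame(reason, location, zero_one)

# Three-way contingency table: reason x location x zero_one
tab <- xtabs(~ reason + location + zero_one, data = df)

# Flat display with reason/location as rows and zero_one as columns
print(ftable(tab))

# Individual cells can be read by name: ones for reason "v" at location "c"
v_c_ones <- tab["v", "c", "1"]   # 4
```

The result is a table object rather than a data frame, so it is better suited to checking counts than to further data manipulation.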
