
How to aggregate data in R

I am looking to create a new data frame keyed by District, with counts for each district broken down by year, property type, and whether the property is old or new.

I have tried the aggregate function, but I am losing the values for the other variables. Below is the data set:

  Property.Type Old.New Town.City District             County         Date
1 D             N       BARKING   BARKING AND DAGENHAM GREATER LONDON 2012
2 D             Y       BARKING   BARKING AND DAGENHAM GREATER LONDON 2012
3 D             N       BARKING   BARKING AND DAGENHAM GREATER LONDON 2012
4 D             N       DAGENHAM  BARKING AND DAGENHAM GREATER LONDON 2012
5 D             N       DAGENHAM  BARKING AND DAGENHAM GREATER LONDON 2012

I would like to rearrange the data so that District is my ID, with a separate data frame for each category, e.g.:

by year
District 2012 2013 2014 2015
Barking  100  500  700 800

by Old.New and year 

District New  Old
Barking  50    70

by property type and year
District New2012  Old2012
Barking  50    70

Without the full data frame it's a bit hard to help, but here are some chunks of code that show how to use the tidyverse (specifically dplyr) to aggregate data.

First load dplyr and recreate a data frame with the provided data:

# dplyr provides %>%, group_by(), summarise() and n() used below
library(dplyr)

Property.Type <- c("D", "D", "D", "D", "D")
Old.New <- c("N", "Y", "N", "N", "N")
Town.City <- c("BARKING", "BARKING", "BARKING", "DAGENHAM", "DAGENHAM")
District <- rep("BARKING AND DAGENHAM", 5)
County <- rep("GREATER LONDON", 5)
Date <- rep(2012, 5)
df <- data.frame(Property.Type, Old.New, Town.City, District, County, Date)

Then aggregate by some columns:

> df %>% group_by(Town.City) %>% summarise(n = n())
# A tibble: 2 x 2
  Town.City     n
  <fct>     <int>
1 BARKING       3
2 DAGENHAM      2
> 
> df %>% group_by(Date, Town.City) %>% summarise(n = n())
# A tibble: 2 x 3
# Groups:   Date [?]
   Date Town.City     n
  <dbl> <fct>     <int>
1  2012 BARKING       3
2  2012 DAGENHAM      2
> 
> df %>% group_by(Property.Type, Date) %>% summarise(n = n())
# A tibble: 1 x 3
# Groups:   Property.Type [?]
  Property.Type  Date     n
  <fct>         <dbl> <int>
1 D              2012     5
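
The summaries above are in long format (one row per group), whereas the question asks for one row per District with the categories spread across columns. A minimal sketch of that reshaping, assuming tidyr (1.0.0 or later) is installed alongside dplyr; the object names by_year, by_old_new_year and by_type_year are made up for illustration and are not part of the original answer:

```r
library(dplyr)
library(tidyr)

# Same five example rows as in the question
df <- data.frame(
  Property.Type = c("D", "D", "D", "D", "D"),
  Old.New       = c("N", "Y", "N", "N", "N"),
  Town.City     = c("BARKING", "BARKING", "BARKING", "DAGENHAM", "DAGENHAM"),
  District      = "BARKING AND DAGENHAM",
  County        = "GREATER LONDON",
  Date          = 2012
)

# One row per District, one column per year
by_year <- df %>%
  count(District, Date) %>%
  pivot_wider(names_from = Date, values_from = n,
              values_fill = list(n = 0L))

# One row per District, one column per Old.New/year combination
by_old_new_year <- df %>%
  count(District, Old.New, Date) %>%
  pivot_wider(names_from = c(Old.New, Date), values_from = n,
              values_fill = list(n = 0L))

# One row per District, one column per property type/year combination
by_type_year <- df %>%
  count(District, Property.Type, Date) %>%
  pivot_wider(names_from = c(Property.Type, Date), values_from = n,
              values_fill = list(n = 0L))
```

Here count(District, Date) is shorthand for group_by(District, Date) %>% summarise(n = n()), and pivot_wider() turns each distinct value of the names_from columns into its own column. With the sample rows, by_year has a single `2012` column, and by_old_new_year has the columns N_2012 and Y_2012; with the full data set there would be one column per year or per combination that actually occurs.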

For further reference, follow this link.
