How to aggregate data in R
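I want to create a new data frame keyed by district, grouping the dataset into counts per district by year, by property type, and by whether the property is old or new.
I have tried the aggregate function, but I lose the values of the other variables. Here is the dataset: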
  Property.Type Old.New Town.City             District         County Date
1             D       N   BARKING BARKING AND DAGENHAM GREATER LONDON 2012
2             D       Y   BARKING BARKING AND DAGENHAM GREATER LONDON 2012
3             D       N   BARKING BARKING AND DAGENHAM GREATER LONDON 2012
4             D       N  DAGENHAM BARKING AND DAGENHAM GREATER LONDON 2012
5             D       N  DAGENHAM BARKING AND DAGENHAM GREATER LONDON 2012
I want to rearrange the data so that District is my ID, with a separate data frame for each category, for example:
by year
District 2012 2013 2014 2015
Barking 100 500 700 800
by Old.New and year
District New Old
Barking 50 70
by property type and year
District New2012 Old2012
Barking 50 70
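Without the full data frame it is a bit hard to help, but here are some code snippets showing how to aggregate data with the tidyverse library.
First, recreate the data frame from the data provided: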
library(tidyverse)  # attaches dplyr, which provides %>%, group_by(), summarise() and n()
Property.Type <- c("D","D","D","D","D")
Old.New <- c("N","Y","N","N","N")
Town.City <- c("BARKING","BARKING","BARKING","DAGENHAM","DAGENHAM")
District <- c("BARKING AND DAGENHAM","BARKING AND DAGENHAM","BARKING AND DAGENHAM","BARKING AND DAGENHAM","BARKING AND DAGENHAM")
County <- c("GREATER LONDON","GREATER LONDON","GREATER LONDON","GREATER LONDON","GREATER LONDON")
Date <- c(2012,2012,2012,2012,2012)
df <- data.frame(Property.Type,Old.New,Town.City,District,County,Date)
Then aggregate by some of the columns:
> df %>% group_by(Town.City) %>% summarise(n = n())
# A tibble: 2 x 2
Town.City n
<fct> <int>
1 BARKING 3
2 DAGENHAM 2
>
> df %>% group_by(Date, Town.City) %>% summarise(n = n())
# A tibble: 2 x 3
# Groups: Date [?]
Date Town.City n
<dbl> <fct> <int>
1 2012 BARKING 3
2 2012 DAGENHAM 2
>
> df %>% group_by(Property.Type, Date) %>% summarise(n = n())
# A tibble: 1 x 3
# Groups: Property.Type [?]
Property.Type Date n
<fct> <dbl> <int>
1 D 2012 5
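The grouped counts above are in long form, while the question asks for one row per district with the counts spread across columns. A minimal sketch of that reshaping follows, assuming dplyr and tidyr (version 1.0.0 or later, for `pivot_wider()`) are installed. The names `by_year` and `by_old_new_year` are made up for illustration, and the five sample rows are only those given in the question, so every count falls in 2012.

```r
library(dplyr)
library(tidyr)

# Sample rows copied from the question (all from 2012)
df <- data.frame(
  Property.Type = c("D", "D", "D", "D", "D"),
  Old.New       = c("N", "Y", "N", "N", "N"),
  Town.City     = c("BARKING", "BARKING", "BARKING", "DAGENHAM", "DAGENHAM"),
  District      = "BARKING AND DAGENHAM",
  County        = "GREATER LONDON",
  Date          = 2012
)

# One row per District, one count column per year
by_year <- df %>%
  count(District, Date) %>%   # shorthand for group_by() + summarise(n = n())
  pivot_wider(names_from = Date, values_from = n,
              values_fill = list(n = 0L))

# One row per District, one count column per Old.New/year combination
# (column names are joined with "_", e.g. N_2012 and Y_2012)
by_old_new_year <- df %>%
  count(District, Old.New, Date) %>%
  pivot_wider(names_from = c(Old.New, Date), values_from = n,
              values_fill = list(n = 0L))

print(by_year)
print(by_old_new_year)
```

Swapping `Old.New` for `Property.Type` in the second pipeline should give the per-property-type table in the same way. The `values_fill` argument only matters on fuller data, where some district may have no rows for a given year or category.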
For further reference, see this link.