[英]Summarize data table individually for multiple columns
如果可能的話,我試圖自動匯總多列中的數據,而不是為每列獨立編寫代碼。 我想總結一下:
Patch Size Achmil Aciarv Aegpod Agrcap
A 10 0 1 1 0
B 2 1 0 0 0
C 2 1 0 0 0
D 2 1 0 0 0
進入這個
Species Presence MaxSize MeanSize Count
Achmil 0 10 10 1
Achmil 1 2 2 3
Aciarv 0 2 2 3
Aciarv 1 10 10 1
我知道我可以單獨運行 group_by 並為每列匯總
achmil<-group_by(LimitArea, Achmil) %>%
summarise(SumA=mean(Size))
但是有沒有辦法使用某種循環為每個存在和不存在的每一列自動運行它? 任何幫助表示贊賞。
也許我們需要gather
到長格式然后進行summarise
library(tidyverse)
gather(df1, Species, Presence, Achmil:Agrcap) %>%
group_by(Species, Presence) %>%
summarise( MaxSize = max(Size), MeanSize = mean(Size), Count = n())
# A tibble: 7 x 5
# Groups: Species [?]
# Species Presence MaxSize MeanSize Count
# <chr> <int> <dbl> <dbl> <int>
#1 Achmil 0 10.0 10.0 1
#2 Achmil 1 2.00 2.00 3
#3 Aciarv 0 2.00 2.00 3
#4 Aciarv 1 10.0 10.0 1
#5 Aegpod 0 2.00 2.00 3
#6 Aegpod 1 10.0 10.0 1
#7 Agrcap 0 10.0 4.00 4
在較新版本的dplyr/tidyr
,我們可以使用pivot_longer
df1 %>%
pivot_longer(cols = Achmil:Agrcap, names_to = "Species",
values_to = "Presence") %>%
group_by(Species, Presence) %>%
summarise(MaxSize = max(Size), MeanSize = mean(Size), Count = n())
這里使用聚合的另一個解決方案(和reshape2::melt()
)
library(reshape2)
df = melt(df[,2:ncol(df)], "Size")
aggregate(. ~ `variable`+`value`, data = df,
FUN = function(x) c(max = max(x), mean = mean(x), count = length(x)))
variable value Size.max Size.mean Size.count
1 Achmil 0 10 10 1
2 Aciarv 0 2 2 3
3 Aegpod 0 2 2 3
4 Agrcap 0 10 4 4
5 Achmil 1 2 2 3
6 Aciarv 1 10 10 1
7 Aegpod 1 10 10 1
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.