Apply function on dataframe by specific group in R
I have a dataframe that looks like this:
dist id daytime season
3 1.11 Name1 day summer
4 2.22 Name2 night spring
5 3.33 Name1 day winter
6 4.44 Name3 night fall
I want summary statistics of dist, grouped by certain specific columns of my dataframe.
So far, I have used a custom function:
path.summary <- function(x){df %>%
    group_by(x) %>%
    summarize(min = min(dist),
              q1 = quantile(dist, 0.25),
              median = median(dist),
              mean = mean(dist),
              q3 = quantile(dist, 0.75),
              max = max(dist))}
and applied it to whichever specific column I currently want:
summary_ID <- path.summary(id)
When I tried this a few weeks ago, I got something like this:
id min q1 median mean q3 max
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Name1 0 17.8 310. 788. 1023. 5832.
2 Name2 0 31.7 284. 570. 744. 9578.
3 Name3 0 17.0 325. 721. 1185. 5293.
4 Name4 0 11.9 197. 530. 865. 3476.
5 Name5 0 24.5 94.9 617. 966. 9567.
When I try it now, I get an error:
Error in `group_by()`:
! Must group by variables found in `.data`.
✖ Column `x` is not found.
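What changed, and how can I fix this?
Inside the function, group_by(x) looks for a column literally named x. If the column is passed unquoted, we can use the curly-curly operator {{ }} to forward it to group_by():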
library(dplyr)

# dat: the data frame; x: an unquoted grouping column
path_summary <- function(dat, x){
  dat %>%
    group_by({{x}}) %>%
    summarize(min = min(dist),
              q1 = quantile(dist, 0.25),
              median = median(dist),
              mean = mean(dist),
              q3 = quantile(dist, 0.75),
              max = max(dist))
}
Testing:
> path_summary(df, id)
# A tibble: 3 × 7
id min q1 median mean q3 max
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Name1 1.11 1.66 2.22 2.22 2.78 3.33
2 Name2 2.22 2.22 2.22 2.22 2.22 2.22
3 Name3 4.44 4.44 4.44 4.44 4.44 4.44
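If more than one grouping column is needed at once, one possible extension is to forward the columns through `...` instead of a single embraced argument. The sketch below is only an illustration: the name `path_summary_multi` is hypothetical, the summary is shortened to three statistics, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Same example data as in this answer, rebuilt so the block is self-contained
df <- data.frame(
  dist = c(1.11, 2.22, 3.33, 4.44),
  id = c("Name1", "Name2", "Name1", "Name3"),
  daytime = c("day", "night", "day", "night"),
  season = c("summer", "spring", "winter", "fall")
)

# Hypothetical helper: forwards any number of unquoted columns to group_by()
path_summary_multi <- function(dat, ...) {
  dat %>%
    group_by(...) %>%
    summarize(min = min(dist),
              median = median(dist),
              max = max(dist),
              .groups = "drop")  # return an ungrouped tibble
}

# Group by id and daytime together
res_multi <- path_summary_multi(df, id, daytime)
```

Calling `path_summary_multi(df, id)` with a single column should behave like the single-argument version for the statistics it keeps.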
Data:
df <- structure(list(dist = c(1.11, 2.22, 3.33, 4.44),
                     id = c("Name1", "Name2", "Name1", "Name3"),
                     daytime = c("day", "night", "day", "night"),
                     season = c("summer", "spring", "winter", "fall")),
                class = "data.frame",
                row.names = c("3", "4", "5", "6"))
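The approach above covers unquoted input. If the column name arrives as a string instead (for example when looping over several column names), a possible counterpart is `across(all_of(...))` inside `group_by()`. This is a minimal sketch rather than part of the original answer: `path_summary_chr` and its `cols` argument are hypothetical names, and it assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Same example data as in this answer, rebuilt so the block is self-contained
df <- data.frame(
  dist = c(1.11, 2.22, 3.33, 4.44),
  id = c("Name1", "Name2", "Name1", "Name3"),
  daytime = c("day", "night", "day", "night"),
  season = c("summer", "spring", "winter", "fall")
)

# Hypothetical variant: `cols` is a character vector of column names
path_summary_chr <- function(dat, cols) {
  dat %>%
    group_by(across(all_of(cols))) %>%
    summarize(min = min(dist),
              q1 = quantile(dist, 0.25),
              median = median(dist),
              mean = mean(dist),
              q3 = quantile(dist, 0.75),
              max = max(dist),
              .groups = "drop")  # return an ungrouped tibble
}

# Summary for a single column given as a string
res_id <- path_summary_chr(df, "id")

# One summary table per grouping column
res_all <- lapply(c("id", "daytime", "season"),
                  function(col) path_summary_chr(df, col))
```

`all_of()` raises an error if a requested column is missing, which tends to surface typos in column names early.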