How to summarise time-series data with unequal number of observations with R
I have a long data frame like this:
year value town
2001 0.15 ny
2002 0.19 ny
2002 0.14 ca
2001 NA ny
2002 0.15 ny
2002 0.12 ca
2001 NA ny
2002 0.13 ny
2002 0.1 ca
I want to compute the mean of value for each year and each town, like this:
library(dplyr)
df %>% group_by(year, town) %>% summarise(mean_year = mean(value, na.rm = TRUE))
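However, I only want to summarise the year and town combinations that have more than 2 non-NA values. In the example above, I don't want a mean for ny in 2001, because it has only 1 non-NA value.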
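To see which groups would pass such a threshold, it can help to count the non-NA values per year and town first. The following is a minimal base R sketch, not part of the original post; the name n_obs is made up for illustration, and tapply() is used here instead of dplyr only to keep the example dependency-free.

```r
# Rebuild the example data from the question
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

# Count the non-NA values in every year/town cell.
# Rows are years, columns are towns; cells with no rows at all come back as NA.
n_obs <- tapply(!is.na(df$value), list(df$year, df$town), sum)
print(n_obs)
```

In this table ny has a single non-NA value in 2001, while ny and ca each have three in 2002, so only the 2001 ny group falls below the threshold.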
So the output would look like this:
town year mean_year
ny 2001 NA
ny 2002 0.157
ca 2002 0.12
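Try this. The ifelse() keeps the mean only for groups with at least 2 non-NA values and returns NA otherwise: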
library(dplyr)
df %>% group_by(year, town) %>%
  # sum(!is.na(value)) counts the non-NA values in the current group
  summarise(mean_year = ifelse(sum(!is.na(value)) >= 2, mean(value, na.rm = TRUE), NA))
# A tibble: 3 x 3
# Groups: year [2]
year town mean_year
<int> <chr> <dbl>
1 2001 ny NA
2 2002 ca 0.12
3 2002 ny 0.157
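For readers without dplyr, roughly the same result can be sketched in base R. This is an alternative technique, not the answer's method: mean_if_enough and its min_obs argument are hypothetical names introduced here, and the sketch assumes the same df as in the question. na.action = na.pass is needed because the formula interface of aggregate() would otherwise drop the NA rows before the function sees them.

```r
# Rebuild the example data from the question
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

# Hypothetical helper: mean of the non-NA values, or NA when fewer than
# min_obs non-NA values are available
mean_if_enough <- function(x, min_obs = 2) {
  x <- x[!is.na(x)]
  if (length(x) >= min_obs) mean(x) else NA_real_
}

# Apply the helper to each year/town group, keeping NA rows in the input
res <- aggregate(value ~ year + town, data = df, FUN = mean_if_enough,
                 na.action = na.pass)
names(res)[names(res) == "value"] <- "mean_year"
print(res)
```

The default min_obs = 2 mirrors the >= 2 test in the answer. If "more than 2" is meant strictly, wrapping the helper as function(x) mean_if_enough(x, min_obs = 3) would apply the stricter rule; on this data both thresholds give the same three rows.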
Input
> dput(df)
structure(list(year = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L,
2001L, 2002L, 2002L), value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12,
NA, 0.13, 0.1), town = c("ny", "ny", "ca", "ny", "ny", "ca",
"ny", "ny", "ca")), class = "data.frame", row.names = c(NA, -9L
))