How to summarise time-series data with an unequal number of observations in R
Summarise based on number of observations per year in a time-series
I have a long data frame like this:
year value town
2001 0.15 ny
2002 0.19 ny
2002 0.14 ca
2001 NA ny
2002 0.15 ny
2002 0.12 ca
2001 NA ny
2002 0.13 ny
2002 0.1 ca
I want to calculate the mean of value for each year and each town, like this:
df %>% group_by(year, town) %>% summarise(mean_year = mean(value, na.rm = TRUE))
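However, I only want to summarise the year/town combinations that have enough non-NA values (I wrote "more than 2"). In the example above, I don't want a mean for ny in 2001, because it has only 1 non-NA value.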
So the output would look like this:
town year mean_year
ny 2001 NA
ny 2002 0.157
ca 2002 0.12
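Try this. The condition >= 2 keeps groups with at least two non-NA values; change it to > 2 if you need strictly more than two: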
df %>%
  group_by(year, town) %>%
  # keep the mean only for groups with at least 2 non-NA values, otherwise NA
  summarise(mean_year = ifelse(sum(!is.na(value)) >= 2, mean(value, na.rm = TRUE), NA))
# A tibble: 3 x 3
# Groups: year [2]
year town mean_year
<int> <chr> <dbl>
1 2001 ny NA
2 2002 ca 0.12
3 2002 ny 0.157
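To check which groups actually pass the threshold, it can help to keep the non-NA count next to the mean. The following is only a sketch of that idea, not part of the original answer: the names `min_obs`, `n_valid` and `res` are made up for illustration, and it assumes dplyr 1.0.0 or later for the `.groups` argument.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

min_obs <- 2  # hypothetical name: minimum number of non-NA values per group

res <- df %>%
  group_by(year, town) %>%
  summarise(
    n_valid   = sum(!is.na(value)),  # non-NA observations in the group
    # n_valid is a single number per group, so a plain if/else is enough
    mean_year = if (n_valid >= min_obs) mean(value, na.rm = TRUE) else NA_real_,
    .groups   = "drop"               # return an ungrouped tibble
  )

print(res)
```

Using `if ... else NA_real_` instead of `ifelse()` makes it explicit that the test is a single value per group and keeps the column type numeric even when a group returns NA.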
Input
> dput(df)
structure(list(year = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L,
2001L, 2002L, 2002L), value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12,
NA, 0.13, 0.1), town = c("ny", "ny", "ca", "ny", "ny", "ca",
"ny", "ny", "ca")), class = "data.frame", row.names = c(NA, -9L
))
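If dplyr is not available, the same rule can probably be expressed in base R with `aggregate()`. This is a sketch under assumptions rather than the answerer's method: the helper `mean_if_enough` and its `min_obs` argument are hypothetical names, and `na.action = na.pass` is needed so that rows with NA reach the helper instead of being dropped beforehand.

```r
# Same data as the dput() output above
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

# Hypothetical helper: mean of x if it has at least min_obs non-NA values, else NA
mean_if_enough <- function(x, min_obs = 2) {
  if (sum(!is.na(x)) >= min_obs) mean(x, na.rm = TRUE) else NA_real_
}

# na.pass keeps the NA rows so the helper can count them itself
res_base <- aggregate(value ~ year + town, data = df,
                      FUN = mean_if_enough, na.action = na.pass)

print(res_base)
```

The result column is still called `value`; rename it with `names(res_base)[3] <- "mean_year"` if you want it to match the dplyr output.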