Summarise based on number of observations per year in a time-series
I've got a long data frame like this:
year value town
2001 0.15 ny
2002 0.19 ny
2002 0.14 ca
2001 NA ny
2002 0.15 ny
2002 0.12 ca
2001 NA ny
2002 0.13 ny
2002 0.1 ca
I want to calculate a mean value per year and per town, like this:
df %>% group_by(year, town) %>% summarise(mean_year = mean(value, na.rm = TRUE))
However, I only want to summarise those town values which have more than 2 non-NA values. In the example above, I don't want to summarise year 2001 for ny because it only has 1 non-NA value.
So the output would be like this:
town year mean_year
ny 2001 NA
ny 2002 0.157
ca 2002 0.12
Try this:
df %>% group_by(year, town) %>%
  # NA unless the group has at least 2 non-NA values
  summarise(mean_year = ifelse(sum(!is.na(value)) >= 2, mean(value, na.rm = TRUE), NA))
# A tibble: 3 x 3
# Groups: year [2]
year town mean_year
<int> <chr> <dbl>
1 2001 ny NA
2 2002 ca 0.12
3 2002 ny 0.157
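A variant of the same idea, as a minimal sketch: it uses a plain `if`/`else` with `NA_real_` so the column type stays double, and `.groups = "drop"` to return an ungrouped result. The name `min_obs` is a hypothetical helper, not something from the question, and the value 2 mirrors the `>= 2` test above; if "more than 2" is meant strictly, it would need to be 3.

```r
library(dplyr)

# Example data, matching the dput output in this post
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

# Hypothetical threshold: minimum number of non-NA values a group needs
min_obs <- 2

result <- df %>%
  group_by(year, town) %>%
  summarise(
    # the expression is evaluated once per group, so a scalar if/else is enough
    mean_year = if (sum(!is.na(value)) >= min_obs) {
      mean(value, na.rm = TRUE)
    } else {
      NA_real_
    },
    .groups = "drop"
  )

print(result)
```

With the example data, both `min_obs <- 2` and `min_obs <- 3` should give the same rows, because every group has either 1 or 3 non-NA values.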
dput input:
> dput(df)
structure(list(year = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L,
2001L, 2002L, 2002L), value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12,
NA, 0.13, 0.1), town = c("ny", "ny", "ca", "ny", "ny", "ca",
"ny", "ny", "ca")), class = "data.frame", row.names = c(NA, -9L
))
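To see why ny in 2001 comes back as NA, it can help to count the non-NA values per group first. The sketch below rebuilds the data from the dput output above; the column name `n_non_na` is only an illustrative choice.

```r
library(dplyr)

# Rebuild the data frame from the dput output
df <- structure(list(
  year = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
), class = "data.frame", row.names = c(NA, -9L))

# Count non-NA observations in each year/town group
counts <- df %>%
  group_by(year, town) %>%
  summarise(n_non_na = sum(!is.na(value)), .groups = "drop")

print(counts)
```

Here ny in 2001 has a single non-NA value, while both 2002 groups have three, which is why only the 2002 means are kept.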