I've got a long dataframe like this:
year value town
2001 0.15 ny
2002 0.19 ny
2002 0.14 ca
2001 NA ny
2002 0.15 ny
2002 0.12 ca
2001 NA ny
2002 0.13 ny
2002 0.1 ca
I want to calculate a mean value per year and per town, like this:
df %>% group_by(year, town) %>% summarise(mean_year = mean(value, na.rm = TRUE))
However, I only want to summarise those year and town combinations which have more than 2 non-NA values. In the example above, I don't want to summarise year 2001 for ny because it only has 1 non-NA value.
So the output would be like this:
town year mean_year
ny 2001 NA
ny 2002 0.157
ca 2002 0.12
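Try this. It returns the group mean only when the group has at least 2 non-NA values, and NA otherwise (use > 2 instead of >= 2 if you need strictly more than 2):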
df %>%
  group_by(year, town) %>%
  summarise(mean_year = ifelse(sum(!is.na(value)) >= 2, mean(value, na.rm = TRUE), NA))
# A tibble: 3 x 3
# Groups: year [2]
year town mean_year
<int> <chr> <dbl>
1 2001 ny NA
2 2002 ca 0.12
3 2002 ny 0.157
Data used (dput output):
> dput(df)
structure(list(year = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L,
2001L, 2002L, 2002L), value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12,
NA, 0.13, 0.1), town = c("ny", "ny", "ca", "ny", "ny", "ca",
"ny", "ny", "ca")), class = "data.frame", row.names = c(NA, -9L
))
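For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the data from the dput() output above and keeps the threshold in a variable so it is easy to switch between "at least 2" and "more than 2". The names min_n, n_valid and result are my own and purely illustrative, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Rebuild the example data from the dput() output above.
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

# min_n is a hypothetical name: the minimum number of non-NA values
# a year/town group needs before its mean is reported.
min_n <- 2

result <- df %>%
  group_by(year, town) %>%
  summarise(
    # Count the non-NA values in this group.
    n_valid   = sum(!is.na(value)),
    # Scalar if/else: mean when there are enough values, NA otherwise.
    mean_year = if (n_valid >= min_n) mean(value, na.rm = TRUE) else NA_real_,
    .groups   = "drop"
  )

print(result)
```

Using a plain if/else with NA_real_ instead of ifelse() keeps the result type a double in every group, and the extra n_valid column makes it easy to see why a group came back as NA. With this data, 2001/ny has only 1 non-NA value, so its mean_year is NA.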