
Summarise based on number of observations per year in a time-series

I've got a long dataframe like this:

 year   value  town
 2001   0.15   ny
 2002   0.19   ny
 2002   0.14   ca
 2001   NA     ny 
 2002   0.15   ny
 2002   0.12   ca 
 2001   NA     ny 
 2002   0.13   ny 
 2002   0.1    ca

I want to calculate the mean of value per year and per town, like this:

 df %>% group_by(year, town) %>% summarise(mean_year = mean(value, na.rm=T))

However, I only want to summarise those town values which have more than 2 non-NA values. In the example above, I don't want to summarise year 2001 for ny because it only has 1 non-NA value.

So the output would be like this:

town year mean_year
ny   2001 NA
ny   2002 0.157
ca   2002 0.12

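Try this. The condition >= 2 keeps groups with at least 2 non-NA values; change it to > 2 if you need strictly more than 2. Both give the same result on this data: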

library(dplyr)

df %>% group_by(year, town) %>%
  # return the group mean only when the group has at least 2 non-NA values
  summarise(mean_year = ifelse(sum(!is.na(value)) >= 2, mean(value, na.rm = TRUE), NA))

# A tibble: 3 x 3
# Groups:   year [2]
   year town  mean_year
  <int> <chr>     <dbl>
1  2001 ny       NA    
2  2002 ca        0.12 
3  2002 ny        0.157
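
To see why each group does or does not get a mean, it can help to keep the non-NA count next to the result. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument); the column name n_obs and the object name counts are my own, not from the question.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
)

counts <- df %>%
  group_by(year, town) %>%
  summarise(
    # number of non-NA observations in the group (n_obs is an illustrative name)
    n_obs     = sum(!is.na(value)),
    # plain if/else is enough here because n_obs is a single value per group
    mean_year = if (n_obs >= 2) mean(value, na.rm = TRUE) else NA_real_,
    .groups   = "drop"
  )

print(counts)
```

Using if/else with NA_real_ instead of ifelse() keeps the column type explicitly numeric, which matters when every group falls below the threshold.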

Input data (dput):

> dput(df)
structure(list(year = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 
2001L, 2002L, 2002L), value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, 
NA, 0.13, 0.1), town = c("ny", "ny", "ca", "ny", "ny", "ca", 
"ny", "ny", "ca")), class = "data.frame", row.names = c(NA, -9L
))
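
If you would rather not depend on dplyr, the same idea can be sketched in base R with aggregate(). This is an alternative technique, not the method used above. The helper mean_if_enough and its min_obs argument are hypothetical names introduced for illustration; na.action = na.pass is needed because the formula interface otherwise drops rows with NA before grouping.

```r
# Data from the dput output above
df <- structure(list(
  year  = c(2001L, 2002L, 2002L, 2001L, 2002L, 2002L, 2001L, 2002L, 2002L),
  value = c(0.15, 0.19, 0.14, NA, 0.15, 0.12, NA, 0.13, 0.1),
  town  = c("ny", "ny", "ca", "ny", "ny", "ca", "ny", "ny", "ca")
), class = "data.frame", row.names = c(NA, -9L))

# Hypothetical helper: mean of x if it has at least min_obs non-NA values, else NA
mean_if_enough <- function(x, min_obs = 2) {
  if (sum(!is.na(x)) >= min_obs) mean(x, na.rm = TRUE) else NA_real_
}

# min_obs = 3 corresponds to "more than 2 non-NA values";
# na.pass keeps the NA rows so the helper can count them itself
res <- aggregate(value ~ year + town, data = df,
                 FUN = mean_if_enough, min_obs = 3,
                 na.action = na.pass)

print(res)
```

Keeping the threshold as an argument makes it easy to switch between "at least 2" (min_obs = 2) and "more than 2" (min_obs = 3) without touching the grouping code.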
