In R, how do I select sites where a value exceeds a threshold, then keep all rows for those sites, including rows below the threshold?

Question

I have the following data (please let me know if the link doesn't work; it's my first time uploading to GitHub):

https://github.com/scottr2012/test_r_data/blob/master/2017_Annual_Averages_ALL.csv

I have data with values for ANC. I need to select every SITE that has ANC > 150 in at least one year, and keep all years for that SITE, even the years where ANC is below 150. The code below currently drops some of the values (and years) below 150. It seems to only build a list of unique sites (where ANC > 150 at any point), but doesn't bring over the rest of the data.

vtsss <- mydata[ which(mydata$PROGRAM=='VTSSS' & mydata$ANC >= 150), ] # Pick a subset, in this case, VTSSS

unique_vtsss <- unique(vtsss$SITE)

vtsss2 <- mydata[ which(mydata[unique_vtsss]), ]

I get the following error:

Error in `[.data.frame`(mydata, unique_vtsss) : 
  undefined columns selected

Here's where I subset the data, but it still removes some years with ANC less than 150.

vtsss <- subset(mydata, PROGRAM == 'VTSSS' & ANC >= 150, 
select=c(PROGRAM, SITE, YEAR, ANC))

Answer 1

I think it should work if you replace your last line of code with this:

vtsss2 <- mydata[ mydata$SITE %in% unique_vtsss, ]
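The error in the question comes from how single-bracket indexing works on a data frame: a character vector given as the only index is read as column names, so the site names are looked up as columns. The following is a minimal sketch of the difference between that and a row-wise %in% test. It uses invented toy data rather than the asker's CSV, and the names unique_sites, col_attempt, keep and result are introduced here for illustration only.

```r
# Toy data; the values are invented for illustration only.
mydata <- data.frame(
  SITE = c("A", "A", "B", "B"),
  YEAR = c(2018, 2019, 2018, 2019),
  ANC  = c(1, 2, 160, 140),
  stringsAsFactors = FALSE
)

# Sites with at least one year at or above the threshold.
unique_sites <- unique(mydata$SITE[which(mydata$ANC >= 150)])

# A lone character index selects COLUMNS by name, so this fails:
# there is no column called "B". tryCatch() captures the error message
# instead of stopping the script.
col_attempt <- tryCatch(
  mydata[unique_sites],
  error = function(e) conditionMessage(e)
)

# %in% builds a logical vector with one entry per row, which is what
# row subsetting needs. Note the comma: rows first, then all columns.
keep <- mydata$SITE %in% unique_sites
result <- mydata[keep, ]
print(result)
```

Both years of site B come back here, including the 2019 row where ANC is below 150.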

Answer 2

I created a small example data set that resembles your csv, and I think the code that follows does what you are asking:

PROGRAM <- c('VTSSS', 'VTSSS', 'VTSSS', 'VTSSS', 'VTSSS', 'VTSSS','VTSSS','VTSSS','other') 
SITE <- c("A", "A", "A", "B", "B", "B", "C", "C", "C") 
YEAR <- c(2018, 2019, 2020, 2018, 2019, 2020, 2018, 2019, 2020) 
ANC <- c(1, 1, 1, 160, 160, 160, 1, 160, 160)
mydata <- data.frame(PROGRAM, SITE, YEAR, ANC)

vtsss <- mydata[ which(mydata$PROGRAM =='VTSSS'), ]
vtsss2 <- vtsss[ which(vtsss$ANC >= 150), ]
vtsss2 <- subset(vtsss2, !duplicated(vtsss2$SITE))
vtsss3 <- vtsss[ which(vtsss$SITE %in% vtsss2$SITE), ]
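The steps above can be condensed a little, since unique() already removes repeated site names. The following is a sketch under the same toy data, not a replacement for the answer's code. The name site_hits is introduced here for illustration.

```r
# Same toy data as in the answer, written more compactly.
mydata <- data.frame(
  PROGRAM = c(rep("VTSSS", 8), "other"),
  SITE    = rep(c("A", "B", "C"), each = 3),
  YEAR    = rep(2018:2020, times = 3),
  ANC     = c(1, 1, 1, 160, 160, 160, 1, 160, 160),
  stringsAsFactors = FALSE
)

# Step 1: restrict to the program of interest.
vtsss <- mydata[which(mydata$PROGRAM == "VTSSS"), ]

# Step 2: names of sites with at least one VTSSS record at or above 150.
site_hits <- unique(vtsss$SITE[which(vtsss$ANC >= 150)])

# Step 3: keep every VTSSS row belonging to those sites, whatever its ANC.
vtsss3 <- vtsss[vtsss$SITE %in% site_hits, ]
print(vtsss3)
```

On this data the result has five rows: all three years of site B, plus 2018 and 2019 of site C. The 2018 row of site C is kept even though its ANC is 1.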

Answer 3

Maybe we need a group_by filter:

library(dplyr)
mydata %>%
   group_by(SITE) %>%
   filter(any(ANC >= 150 & !is.na(ANC) &  PROGRAM %in% "VTSSS"))
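For readers who prefer to stay in base R, the same grouped idea can be sketched with ave(), which applies a function within each group and spreads the result back over the group's rows. This is an assumed equivalent written for illustration, not part of the original answer. It reuses the toy data from Answer 2, and the names hit, site_has_hit and kept are introduced here.

```r
# Toy data from Answer 2.
mydata <- data.frame(
  PROGRAM = c(rep("VTSSS", 8), "other"),
  SITE    = rep(c("A", "B", "C"), each = 3),
  YEAR    = rep(2018:2020, times = 3),
  ANC     = c(1, 1, 1, 160, 160, 160, 1, 160, 160),
  stringsAsFactors = FALSE
)

# TRUE for rows that are VTSSS records with a non-missing ANC of at least 150.
hit <- !is.na(mydata$ANC) & mydata$ANC >= 150 & mydata$PROGRAM %in% "VTSSS"

# For each SITE, compute any(hit) and copy that value onto every row of the site.
site_has_hit <- as.logical(ave(hit, mydata$SITE, FUN = any))

# Keep all rows of sites that have at least one qualifying record.
kept <- mydata[site_has_hit, ]
print(kept)
```

One design point worth noting: like the grouped filter, this keeps every row of a qualifying site, so the 2020 row of site C with PROGRAM 'other' is included. If only VTSSS rows are wanted, subset on PROGRAM first and then apply the grouped test.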
