Filter rows based on multiple conditions using dplyr

Question

df <- data.frame(loc.id = rep(1:2,each = 10), threshold = rep(1:10,times = 2))

For each loc.id, I want to select the first row where threshold >= 2 and the first row where threshold >= 4. I did this:

df %>% group_by(loc.id) %>% dplyr::filter(row_number() == which.max(threshold >= 2),row_number() == which.max(threshold >= 4))

I was expecting a data frame like this:

      loc.id threshold
        1       2
        1       4
        2       2
        2       4

But it gives me an empty data frame.

Answer 1

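filter() combines comma-separated conditions with AND, so a row would have to sit at both which.max positions at once, which is why the result is empty. Based on the conditions, we can instead slice the rows by concatenating the two which.max indices and taking unique (if a group only has thresholds of 4 or more, both conditions return the same index).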

df %>%
    group_by(loc.id) %>%
    filter(any(threshold >= 2)) %>% # additional check
    #slice(unique(c(which.max(threshold > 2), which.max(threshold > 4))))
    # based on the expected output
    slice(unique(c(which.max(threshold >= 2), which.max(threshold >= 4))))
# A tibble: 4 x 2
# Groups:   loc.id [2]
#  loc.id threshold
#   <int>     <int>
#1      1         2
#2      1         4
#3      2         2
#4      2         4

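Note that there can be groups with no threshold value greater than or equal to 2. The filter(any(threshold >= 2)) step keeps only the groups that have at least one such value.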
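
The reason for that extra check can be seen in a small sketch. The data frame df2 below is hypothetical (it is not from the question) and includes a group with no threshold of 2 or more. which.max() on an all-FALSE vector returns 1, so without the guard that group would contribute its first row.

```r
library(dplyr)

# Hypothetical data: loc.id 3 has no threshold value of 2 or more
df2 <- data.frame(loc.id = c(1, 1, 1, 3, 3),
                  threshold = c(1, 2, 5, 0, 1))

# which.max() returns the position of the first maximum;
# for an all-FALSE vector that is position 1, not an empty result
first_of_all_false <- which.max(c(FALSE, FALSE))

# Without the guard, loc.id 3 contributes its first row (threshold 0)
unguarded <- df2 %>%
  group_by(loc.id) %>%
  slice(unique(c(which.max(threshold >= 2), which.max(threshold >= 4)))) %>%
  ungroup()

# With the guard, groups lacking any threshold >= 2 are dropped first
guarded <- df2 %>%
  group_by(loc.id) %>%
  filter(any(threshold >= 2)) %>%
  slice(unique(c(which.max(threshold >= 2), which.max(threshold >= 4)))) %>%
  ungroup()
```

The guard only covers the >= 2 condition. A group that has values of 2 or more but none of 4 or more would still pick up its first row through which.max(threshold >= 4), so a similar check may be needed for that case.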

Answer 2

If this is not what you want, assign the df below to a name and use it to filter your dataset.

df %>% 
  distinct() %>% 
  filter(threshold ==2 | threshold==4)
#>   loc.id threshold
#> 1      1         2
#> 2      1         4
#> 3      2         2
#> 4      2         4
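
One caveat with the equality filter, sketched here under an assumption: df3 is a hypothetical data frame (not from the question) in which the values 2 and 4 never occur exactly. The equality filter then matches nothing, while the which.max approach from Answer 1 still finds the first row at or above each cutoff.

```r
library(dplyr)

# Hypothetical data: thresholds skip over the exact values 2 and 4
df3 <- data.frame(loc.id = rep(1:2, each = 4),
                  threshold = rep(c(1, 3, 5, 7), times = 2))

# Exact matching finds no rows because 2 and 4 are absent
exact <- df3 %>%
  distinct() %>%
  filter(threshold == 2 | threshold == 4)

# First row at or above each cutoff, per group
first_at_or_above <- df3 %>%
  group_by(loc.id) %>%
  filter(any(threshold >= 2)) %>%
  slice(unique(c(which.max(threshold >= 2), which.max(threshold >= 4)))) %>%
  ungroup()
```

On the data in the question both approaches give the same four rows, because every group contains the values 2 and 4.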