How to use anti_join with different levels of two variables?
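I have been trying for hours but cannot figure this out. I have a data frame, df1, containing subject and condition, and I want to exclude from it the observations that have a certain value (lower than 3 in the variable "value" of df2). I can't get it to work because the rows I need to remove from df1 are combinations of different levels of two variables.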
This is df1:
df1 <- structure(list(subject = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
condition = c("A", "A", "A", "B", "B", "B", "C", "C","C", "A", "A",
"A", "B", "B", "B", "C", "C", "C", "A", "A", "A","B", "B", "B", "C", "C", "C")),
row.names = c(NA, -27L), class = c("tbl_df", "tbl", "data.frame"))
This is df2:
df2 <- structure(list(subject = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L,4L, 4L, 4L, 5L, 5L, 5L),
condition = c("A", "B", "C", "A", "B","C", "A", "B", "C", "A", "B", "C", "A", "B", "C"),
value = c(10L, 8L, 7L, 3L, 8L, 5L, 3L, 3L, 9L, 8L, 7L, 8L, 10L, 6L, 2L)),
row.names = c(NA,-15L), class = c("tbl_df", "tbl", "data.frame"))
I would like to delete from df1 all combinations of subject and condition that have a value lower than 3, so this would be the final df:
df3 <- structure(list(subject = c(2L, 3L, 3L, 5L),
condition = c("A","A", "B", "C")),
row.names = c(NA, -4L),
class = c("tbl_df","tbl", "data.frame"))
So far I have been doing it like this, but that is no longer feasible because I have hundreds of rows:
df3 <- df1 %>% filter(!(subject==2 & condition=="A" |
subject==3 & (condition=="A" | condition=="B") |
subject==5 & condition=="C"))
Your example result for df3 conflicts with the code you used to derive it, so here is a dplyr solution for each interpretation of the df3 you want.
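Note: both results are only possible if you want to
...exclude the observations that have a certain value (lower than [or equal to] 3 in the variable "value" from df2).
So I have implemented these solutions using the inequality <= 3 rather than < 3.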
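The difference between the two inequalities can be checked directly on df2. The following is a minimal sketch that rebuilds df2 as a plain data.frame (an assumption made for brevity; the values are the ones from the question) and counts the rows each condition selects. The names strict and inclusive are illustrative only.

```r
library(dplyr)

# Same values as df2 in the question, rebuilt compactly.
df2 <- data.frame(
  subject   = rep(1:5, each = 3),
  condition = rep(c("A", "B", "C"), times = 5),
  value     = c(10L, 8L, 7L, 3L, 8L, 5L, 3L, 3L, 9L, 8L, 7L, 8L, 10L, 6L, 2L),
  stringsAsFactors = FALSE
)

strict    <- df2 %>% filter(value < 3)   # strictly below 3
inclusive <- df2 %>% filter(value <= 3)  # 3 or below

nrow(strict)     # 1 row: subject 5, condition C
nrow(inclusive)  # 4 rows: (2, A), (3, A), (3, B), (5, C)
```

Only the inclusive version reproduces the four subject/condition pairs listed in the question.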
First interpretation
To obtain this version of df3
# A tibble: 4 x 2
subject condition
<int> <chr>
1 2 A
2 3 A
3 3 B
4 5 C
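which you provided as your example result here
I would like to delete from df1 all combinations of subject and condition that have a value lower than 3, so this would be the final df: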
df3 <- structure(list(subject = c(2L, 3L, 3L, 5L), condition = c("A","A", "B", "C")), row.names = c(NA, -4L), class = c("tbl_df","tbl", "data.frame"))
simply use filter() on df2:
library(dplyr)
# ...
# Code to generate 'df1' and 'df2'.
# ...
df3 <- df2 %>% filter(value <= 3)
df3
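Note that filter() on its own keeps the value column, whereas the example result has only subject and condition. If an exact two-column match is wanted, one possible sketch (assuming the same df2 values, rebuilt here as a plain data.frame) adds a select() step:

```r
library(dplyr)

# Same values as df2 in the question, rebuilt compactly.
df2 <- data.frame(
  subject   = rep(1:5, each = 3),
  condition = rep(c("A", "B", "C"), times = 5),
  value     = c(10L, 8L, 7L, 3L, 8L, 5L, 3L, 3L, 9L, 8L, 7L, 8L, 10L, 6L, 2L),
  stringsAsFactors = FALSE
)

# Keep the rows with value <= 3, then drop the value column.
df3 <- df2 %>%
  filter(value <= 3) %>%
  select(subject, condition)

df3
```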
Second interpretation
However, it seems to me that you actually want the following version of df3
# A tibble: 18 x 2
subject condition
<int> <chr>
1 1 A
2 1 A
3 1 A
4 1 B
5 1 B
6 1 B
7 1 C
8 1 C
9 1 C
10 2 B
11 2 B
12 2 B
13 2 C
14 2 C
15 2 C
16 3 C
17 3 C
18 3 C
which you derived here:
df3 <- df1 %>% filter(!(subject==2 & condition=="A" |
subject==3 & (condition=="A" |condition=="B") |
subject==5 & condition=="C"))
In that case, you should anti_join() your df1 to a filter()ed version of df2:
library(dplyr)
# ...
# Code to generate 'df1' and 'df2'.
# ...
df3 <- df1 %>%
anti_join(df2 %>% filter(value <= 3), by = c("subject", "condition"))
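As a sanity check, the anti_join() result can be compared with the hand-written filter from the question. This is a sketch under the assumption that df1 and df2 hold exactly the values shown in the question (rebuilt here as plain data.frames); via_join and via_manual are illustrative names.

```r
library(dplyr)

# Same values as df1 and df2 in the question, rebuilt compactly.
df1 <- data.frame(
  subject   = rep(1:3, each = 9),
  condition = rep(rep(c("A", "B", "C"), each = 3), times = 3),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  subject   = rep(1:5, each = 3),
  condition = rep(c("A", "B", "C"), times = 5),
  value     = c(10L, 8L, 7L, 3L, 8L, 5L, 3L, 3L, 9L, 8L, 7L, 8L, 10L, 6L, 2L),
  stringsAsFactors = FALSE
)

# Data-driven version: drop every (subject, condition) pair with value <= 3.
via_join <- df1 %>%
  anti_join(df2 %>% filter(value <= 3), by = c("subject", "condition"))

# Hand-written version from the question.
via_manual <- df1 %>%
  filter(!(subject == 2 & condition == "A" |
           subject == 3 & (condition == "A" | condition == "B") |
           subject == 5 & condition == "C"))

nrow(via_join)  # 18
identical(via_join$subject, via_manual$subject)
identical(via_join$condition, via_manual$condition)
```

The pair (5, C) appears in the filtered df2 but not in df1, so it simply has nothing to remove; anti_join() only drops rows of df1 that have a match.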