I have the following dataframe:
df
name direction to
<chr> <fct> <chr>
1 A -> B
2 A -> X
3 B -> X
4 B -> Y
5 C -> B
6 C -> Y
7 S -> T
8 T -> C
9 W -> Y
10 X -> W
11 Y NA NA
Step 1. I first want to subset the dataframe to keep only the rows that have X or Y in either the name or the to column.
df %>% dplyr::select(name,direction,to) %>% filter(name %in% c('X','Y') | to %in% c('X','Y'))
name direction to
<chr> <fct> <chr>
1 A -> X
2 B -> X
3 B -> Y
4 C -> Y
5 W -> Y
6 X -> W
7 Y NA NA
Step 2. From there, I want to pull in any other rows of the original df whose name matches one of the unique name values left after Step 1. For example, the unique values in name after Step 1 are A, B, C, W, X, Y. I want all observations from the original, unfiltered df where the name column holds any of these values. In this example, observations 1 (A->B) and 5 (C->B) from the original dataframe would be added to the subset.
Expected output:
name direction to
<chr> <fct> <chr>
1 A -> X
2 A -> B
3 B -> X
4 B -> Y
5 C -> B
6 C -> Y
7 W -> Y
8 X -> W
9 Y NA NA
Let me know if this doesn't make sense.
I think this should work:
library(dplyr)
dfTmp <- df %>% dplyr::select(name, direction, to) %>% filter(name %in% c('X','Y') | to %in% c('X','Y'))
# keep every row of the original df whose name appears in the Step 1 subset
df[df$name %in% dfTmp$name, ]
We can use if_any to loop over the 'name' and 'to' columns and return a logical vector, use that vector to subset 'name', and then build the final logical vector with %in%:
library(dplyr)
df %>%
  filter(name %in% name[if_any(c(name, to), ~ . %in% c('X', 'Y'))]) %>%
  as_tibble()
-output
# A tibble: 9 × 3
name direction to
<chr> <chr> <chr>
1 A -> B
2 A -> X
3 B -> X
4 B -> Y
5 C -> B
6 C -> Y
7 W -> Y
8 X -> W
9 Y <NA> <NA>
Usually, if_any is used in filter to return the rows where at least one of the selected columns meets the condition. Here we loop over 'name' and 'to' and check, for each row, whether the column holds 'X' or 'Y'; if either column does, the row is flagged. if_any returns a logical vector, so we use it to subset ([) the 'name' elements and then create the final logical vector with %in% on the original 'name' column.
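To make the nesting easier to follow, here is a rough base R sketch that spells out the same three stages as separate objects. The names hit, keep_names and result are only illustrative, and the data frame is rebuilt inline so the snippet runs on its own; hit should correspond to the logical vector that if_any yields for these two columns.

```r
# Same data as in the question, rebuilt so the snippet is self-contained
df <- data.frame(
  name = c("A", "A", "B", "B", "C", "C", "S", "T", "W", "X", "Y"),
  direction = c(rep("->", 10), NA),
  to = c("B", "X", "X", "Y", "B", "Y", "T", "C", "Y", "W", NA)
)

# Stage 1: TRUE for rows where 'name' or 'to' is X or Y
# (%in% returns FALSE for NA, so the NA in 'to' is harmless)
hit <- df$name %in% c("X", "Y") | df$to %in% c("X", "Y")

# Stage 2: the 'name' values found on those rows
keep_names <- unique(df$name[hit])

# Stage 3: every row of the original df whose 'name' is in that set
result <- df[df$name %in% keep_names, ]
print(result)
```

With the sample data, keep_names comes out as A, B, C, W, X, Y, so the S and T rows are dropped and the A->B and C->B rows are kept even though they contain neither X nor Y.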
df <- structure(list(name = c("A", "A", "B", "B", "C", "C", "S", "T",
"W", "X", "Y"), direction = c("->", "->", "->", "->", "->", "->",
"->", "->", "->", "->", NA), to = c("B", "X", "X", "Y", "B",
"Y", "T", "C", "Y", "W", NA)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))