Similar to: In R, select row of a matrix that match a vector
Dataframe:
structure(list(id = c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19",
"20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30",
"31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41",
"42", "43", "44", "45", "46", "47", "48", "49", "50"), X1 = c(8L,
18L, 6L, 10L, 2L, 12L, 20L, 19L, 17L, 6L, 20L, 3L, 14L, 20L,
11L, 17L, 19L, 3L, 12L, 17L, 20L, 14L, 11L, 1L, 9L, 1L, 2L, 4L,
7L, 18L, 7L, 12L, 18L, 6L, 6L, 6L, 20L, 11L, 17L, 15L, 6L, 2L,
17L, 14L, 15L, 10L, 4L, 4L, 7L, 3L), X2 = c(1L, 3L, 10L, 14L,
12L, 9L, 0L, 1L, 7L, 14L, 3L, 10L, 5L, 15L, 1L, 14L, 17L, 9L,
16L, 6L, 10L, 6L, 1L, 11L, 8L, 1L, 0L, 3L, 14L, 4L, 16L, 5L,
15L, 11L, 10L, 0L, 16L, 16L, 15L, 20L, 5L, 1L, 9L, 2L, 16L, 12L,
4L, 2L, 15L, 11L), X3 = c(16L, 2L, 10L, 19L, 5L, 16L, 13L, 14L,
10L, 15L, 18L, 17L, 0L, 2L, 7L, 5L, 19L, 3L, 2L, 20L, 19L, 14L,
18L, 13L, 5L, 15L, 13L, 6L, 9L, 17L, 9L, 17L, 15L, 1L, 20L, 17L,
19L, 13L, 15L, 4L, 9L, 0L, 13L, 9L, 11L, 2L, 0L, 5L, 5L, 16L),
X4 = c(14L, 16L, 6L, 2L, 2L, 10L, 13L, 5L, 9L, 16L, 15L,
3L, 11L, 8L, 2L, 17L, 1L, 1L, 5L, 18L, 0L, 14L, 18L, 19L,
6L, 17L, 15L, 11L, 19L, 13L, 2L, 12L, 8L, 4L, 17L, 14L, 9L,
18L, 10L, 19L, 14L, 14L, 15L, 15L, 7L, 16L, 2L, 19L, 12L,
13L), X5 = c(8L, 7L, 18L, 20L, 9L, 12L, 4L, 5L, 18L, 14L,
10L, 3L, 8L, 9L, 15L, 13L, 2L, 3L, 18L, 7L, 16L, 17L, 20L,
15L, 9L, 17L, 9L, 17L, 14L, 10L, 4L, 5L, 0L, 2L, 13L, 20L,
16L, 12L, 14L, 20L, 1L, 9L, 8L, 14L, 19L, 12L, 2L, 0L, 1L,
5L), X6 = c(10L, 2L, 11L, 19L, 2L, 11L, 7L, 12L, 16L, 17L,
2L, 9L, 20L, 0L, 19L, 1L, 15L, 15L, 6L, 8L, 1L, 15L, 11L,
17L, 16L, 8L, 16L, 20L, 15L, 9L, 7L, 15L, 12L, 14L, 20L,
4L, 12L, 6L, 2L, 5L, 13L, 17L, 2L, 2L, 2L, 17L, 0L, 19L,
19L, 14L), X7 = c(13L, 19L, 12L, 14L, 17L, 14L, 18L, 12L,
7L, 1L, 10L, 14L, 20L, 11L, 20L, 12L, 15L, 2L, 11L, 20L,
1L, 3L, 10L, 11L, 12L, 13L, 15L, 18L, 8L, 13L, 14L, 8L, 6L,
11L, 8L, 10L, 3L, 10L, 4L, 5L, 15L, 11L, 12L, 16L, 11L, 8L,
3L, 8L, 9L, 1L), X8 = c(7L, 17L, 7L, 17L, 17L, 6L, 18L, 11L,
14L, 17L, 1L, 4L, 18L, 9L, 15L, 20L, 12L, 8L, 5L, 20L, 6L,
15L, 8L, 3L, 12L, 1L, 14L, 12L, 6L, 0L, 8L, 13L, 20L, 0L,
20L, 20L, 13L, 9L, 0L, 17L, 1L, 2L, 15L, 10L, 2L, 1L, 20L,
11L, 15L, 11L), X9 = c(17L, 6L, 16L, 13L, 15L, 3L, 12L, 15L,
7L, 15L, 1L, 1L, 17L, 17L, 13L, 4L, 11L, 10L, 19L, 6L, 11L,
3L, 3L, 3L, 9L, 10L, 12L, 4L, 5L, 17L, 8L, 12L, 16L, 12L,
20L, 3L, 5L, 6L, 16L, 8L, 20L, 0L, 15L, 9L, 2L, 6L, 19L,
7L, 11L, 7L), X10 = c(15L, 11L, 4L, 1L, 10L, 18L, 16L, 2L,
1L, 0L, 9L, 19L, 1L, 11L, 0L, 0L, 14L, 15L, 8L, 12L, 12L,
20L, 13L, 13L, 3L, 13L, 8L, 4L, 19L, 3L, 0L, 15L, 18L, 15L,
19L, 13L, 15L, 18L, 8L, 9L, 17L, 2L, 1L, 18L, 5L, 19L, 10L,
16L, 5L, 12L)), class = "data.frame", row.names = c(NA, -50L
))
I want to select the rows where any one of x1
- x10
equals any one of a vector x <-c(0:5)
. In reality x1
- x10
has many NA
s and the vector contains entries that are made up of letters and numbers eg X1425
, 52546
, HPO1567
etc.
I know about the %in%
sign but if I wanted this to apply to multiple columns and for it to return the row if only a single vector matches only a single row what would be the best way of doing that?
My expected outcome is the whole row as it appears in the original table where any column matches any of the search vector how every many times eg multiple matches in the same row.
Ideally in base R.
Many thanks
You can try this, and then filter by Index
vec <-c(0:5)
Data$Index <- apply(Data[,-1],1,function(x) sum(x %in% vec))
id X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 Index
1 1 8 1 16 14 8 10 13 7 17 15 1
2 2 18 3 2 16 7 2 19 17 6 11 3
3 3 6 10 10 6 18 11 12 7 16 4 1
4 4 10 14 19 2 20 19 14 17 13 1 2
5 5 2 12 5 2 9 2 17 17 15 10 4
6 6 12 9 16 10 12 11 14 6 3 18 1
7 7 20 0 13 13 4 7 18 18 12 16 2
8 8 19 1 14 5 5 12 12 11 15 2 4
9 9 17 7 10 9 18 16 7 14 7 1 1
10 10 6 14 15 16 14 17 1 17 15 0 2
11 11 20 3 18 15 10 2 10 1 1 9 4
12 12 3 10 17 3 3 9 14 4 1 19 5
13 13 14 5 0 11 8 20 20 18 17 1 3
14 14 20 15 2 8 9 0 11 9 17 11 2
15 15 11 1 7 2 15 19 20 15 13 0 3
16 16 17 14 5 17 13 1 12 20 4 0 4
17 17 19 17 19 1 2 15 15 12 11 14 2
18 18 3 9 3 1 3 15 2 8 10 15 5
19 19 12 16 2 5 18 6 11 5 19 8 3
20 20 17 6 20 18 7 8 20 20 6 12 0
21 21 20 10 19 0 16 1 1 6 11 12 3
22 22 14 6 14 14 17 15 3 15 3 20 2
23 23 11 1 18 18 20 11 10 8 3 13 2
24 24 1 11 13 19 15 17 11 3 3 13 3
25 25 9 8 5 6 9 16 12 12 9 3 2
26 26 1 1 15 17 17 8 13 1 10 13 3
27 27 2 0 13 15 9 16 15 14 12 8 2
28 28 4 3 6 11 17 20 18 12 4 4 4
29 29 7 14 9 19 14 15 8 6 5 19 1
30 30 18 4 17 13 10 9 13 0 17 3 3
31 31 7 16 9 2 4 7 14 8 8 0 3
32 32 12 5 17 12 5 15 8 13 12 15 2
33 33 18 15 15 8 0 12 6 20 16 18 1
34 34 6 11 1 4 2 14 11 0 12 15 4
35 35 6 10 20 17 13 20 8 20 20 19 0
36 36 6 0 17 14 20 4 10 20 3 13 3
37 37 20 16 19 9 16 12 3 13 5 15 2
38 38 11 16 13 18 12 6 10 9 6 18 0
39 39 17 15 15 10 14 2 4 0 16 8 3
40 40 15 20 4 19 20 5 5 17 8 9 3
41 41 6 5 9 14 1 13 15 1 20 17 3
42 42 2 1 0 14 9 17 11 2 0 2 6
43 43 17 9 13 15 8 2 12 15 15 1 2
44 44 14 2 9 15 14 2 16 10 9 18 2
45 45 15 16 11 7 19 2 11 2 2 5 4
46 46 10 12 2 16 12 17 8 1 6 19 2
47 47 4 4 0 2 2 0 3 20 19 10 7
48 48 4 2 5 19 0 19 8 11 7 16 4
49 49 7 15 5 12 1 19 9 15 11 5 3
50 50 3 11 16 13 5 14 1 11 7 12 3
Data
is your dput()
data. I hope this helps.
I would suggest a slightly different approach from @Duck using any
instead of sum
, so you don't need to use a filter afterwards:
vec= c(2,20)
rows = apply(data1[,-1],1,function(x) any(x %in% vec))
data1[rows,]
id X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
2 2 18 3 2 16 7 2 19 17 6 11
4 4 10 14 19 2 20 19 14 17 13 1
5 5 2 12 5 2 9 2 17 17 15 10
7 7 20 0 13 13 4 7 18 18 12 16
8 8 19 1 14 5 5 12 12 11 15 2
11 11 20 3 18 15 10 2 10 1 1 9
13 13 14 5 0 11 8 20 20 18 17 1
14 14 20 15 2 8 9 0 11 9 17 11
15 15 11 1 7 2 15 19 20 15 13 0
16 16 17 14 5 17 13 1 12 20 4 0
17 17 19 17 19 1 2 15 15 12 11 14
18 18 3 9 3 1 3 15 2 8 10 15
19 19 12 16 2 5 18 6 11 5 19 8
20 20 17 6 20 18 7 8 20 20 6 12
21 21 20 10 19 0 16 1 1 6 11 12
22 22 14 6 14 14 17 15 3 15 3 20
23 23 11 1 18 18 20 11 10 8 3 13
27 27 2 0 13 15 9 16 15 14 12 8
28 28 4 3 6 11 17 20 18 12 4 4
31 31 7 16 9 2 4 7 14 8 8 0
33 33 18 15 15 8 0 12 6 20 16 18
34 34 6 11 1 4 2 14 11 0 12 15
35 35 6 10 20 17 13 20 8 20 20 19
36 36 6 0 17 14 20 4 10 20 3 13
37 37 20 16 19 9 16 12 3 13 5 15
39 39 17 15 15 10 14 2 4 0 16 8
40 40 15 20 4 19 20 5 5 17 8 9
41 41 6 5 9 14 1 13 15 1 20 17
42 42 2 1 0 14 9 17 11 2 0 2
43 43 17 9 13 15 8 2 12 15 15 1
44 44 14 2 9 15 14 2 16 10 9 18
45 45 15 16 11 7 19 2 11 2 2 5
46 46 10 12 2 16 12 17 8 1 6 19
47 47 4 4 0 2 2 0 3 20 19 10
48 48 4 2 5 19 0 19 8 11 7 16
Here's a (fun) regex solution:
It works by first paste0
ing all rows into strings and then, using grepl
, matching the numbers from 0-5
in these rows to finally subset the dataframe df
on those rows for which
matches are returned:
df[which(grepl("(?<=^| )[0-5](?= |$)", apply(df[-1], 1, paste0, collapse = " "), perl = T)),]
id X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 1 8 1 16 14 8 10 13 7 17 15
2 2 18 3 2 16 7 2 19 17 6 11
3 3 6 10 10 6 18 11 12 7 16 4
4 4 10 14 19 2 20 19 14 17 13 1
5 5 2 12 5 2 9 2 17 17 15 10
6 6 12 9 16 10 12 11 14 6 3 18
7 7 20 0 13 13 4 7 18 18 12 16
8 8 19 1 14 5 5 12 12 11 15 2
9 9 17 7 10 9 18 16 7 14 7 1
10 10 6 14 15 16 14 17 1 17 15 0
11 11 20 3 18 15 10 2 10 1 1 9
12 12 3 10 17 3 3 9 14 4 1 19
13 13 14 5 0 11 8 20 20 18 17 1
14 14 20 15 2 8 9 0 11 9 17 11
15 15 11 1 7 2 15 19 20 15 13 0
16 16 17 14 5 17 13 1 12 20 4 0
17 17 19 17 19 1 2 15 15 12 11 14
18 18 3 9 3 1 3 15 2 8 10 15
19 19 12 16 2 5 18 6 11 5 19 8
21 21 20 10 19 0 16 1 1 6 11 12
22 22 14 6 14 14 17 15 3 15 3 20
23 23 11 1 18 18 20 11 10 8 3 13
24 24 1 11 13 19 15 17 11 3 3 13
25 25 9 8 5 6 9 16 12 12 9 3
26 26 1 1 15 17 17 8 13 1 10 13
27 27 2 0 13 15 9 16 15 14 12 8
28 28 4 3 6 11 17 20 18 12 4 4
29 29 7 14 9 19 14 15 8 6 5 19
30 30 18 4 17 13 10 9 13 0 17 3
31 31 7 16 9 2 4 7 14 8 8 0
32 32 12 5 17 12 5 15 8 13 12 15
33 33 18 15 15 8 0 12 6 20 16 18
34 34 6 11 1 4 2 14 11 0 12 15
36 36 6 0 17 14 20 4 10 20 3 13
37 37 20 16 19 9 16 12 3 13 5 15
39 39 17 15 15 10 14 2 4 0 16 8
40 40 15 20 4 19 20 5 5 17 8 9
41 41 6 5 9 14 1 13 15 1 20 17
42 42 2 1 0 14 9 17 11 2 0 2
43 43 17 9 13 15 8 2 12 15 15 1
44 44 14 2 9 15 14 2 16 10 9 18
45 45 15 16 11 7 19 2 11 2 2 5
46 46 10 12 2 16 12 17 8 1 6 19
47 47 4 4 0 2 2 0 3 20 19 10
48 48 4 2 5 19 0 19 8 11 7 16
49 49 7 15 5 12 1 19 9 15 11 5
50 50 3 11 16 13 5 14 1 11 7 12
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.