
R grep search patterns in multiple columns

I have a data frame as follows:

Col1    Col2    Col3
A       B       C
D       E       F
G       H       I

I am trying to keep the rows that match 'B' in 'Col2' OR 'F' in 'Col3', in order to get:

Col1    Col2    Col3
A       B       C
D       E       F

I tried:

data[(grep("B",data$Col2) || grep("F",data$Col3)), ]

but it returns the entire data frame.

NOTE: It works when calling each of the two greps on its own.

Or use a single grepl after pasting the columns together:

df1[with(df1, grepl("B|F", paste(Col2, Col3))),]
#  Col1 Col2 Col3
#1    A    B    C
#2    D    E    F
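One caveat with the pasted approach, shown here as a small sketch: the pattern "B|F" is matched against both columns at once, so a "B" in Col3 (or an "F" in Col2) also counts as a hit. The data frame df2 below is hypothetical and is not part of the original question; it only differs from df1 in its last Col3 value.

```r
# Hypothetical data (an assumption for illustration): "B" also appears in Col3
df2 <- data.frame(Col1 = c("A", "D", "G"),
                  Col2 = c("B", "E", "H"),
                  Col3 = c("C", "F", "B"),
                  stringsAsFactors = FALSE)

# Pasted search: "B" or "F" in either column counts as a match
pasted <- with(df2, grepl("B|F", paste(Col2, Col3)))

# Per-column search: "B" only in Col2, "F" only in Col3
per_column <- with(df2, grepl("B", Col2) | grepl("F", Col3))

df2[pasted, ]      # keeps all three rows
df2[per_column, ]  # keeps only the first two rows
```

If the column a match occurs in matters, the per-column form is the safer of the two.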

Using == for exact matches:

with(df1, df1[ Col2 == 'B' | Col3 == 'F',])
#   Col1 Col2 Col3
# 1    A    B    C
# 2    D    E    F

Using grepl

with(df1, df1[ grepl( 'B', Col2) | grepl( 'F', Col3), ])
#   Col1 Col2 Col3
# 1    A    B    C
# 2    D    E    F
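As for why the attempt in the question returns every row, here is a likely explanation, sketched under the assumption that data has the same contents as df1: grep() returns the integer positions of the matches rather than one logical value per row, and || works on single values instead of element-wise. The variable names below (idx_b, idx_f, combined_scalar, keep) are only illustrative.

```r
# Same contents as df1, rebuilt with data.frame() so the sketch is self-contained
df1 <- data.frame(Col1 = c("A", "D", "G"),
                  Col2 = c("B", "E", "H"),
                  Col3 = c("C", "F", "I"),
                  stringsAsFactors = FALSE)

idx_b <- grep("B", df1$Col2)  # position of the match in Col2: 1
idx_f <- grep("F", df1$Col3)  # position of the match in Col3: 2

# `||` coerces the non-zero index 1 to a single TRUE (and short-circuits)
combined_scalar <- idx_b || idx_f

# A single TRUE used as a row index is recycled, so every row is selected
whole <- df1[combined_scalar, ]

# grepl() gives one logical per row, and `|` combines them element-wise
keep <- grepl("B", df1$Col2) | grepl("F", df1$Col3)
result <- df1[keep, ]
```

So the fix involves two changes at once: grepl() in place of grep(), and | in place of ||.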

Data:

df1 <- structure(list(Col1 = c("A", "D", "G"),
                      Col2 = c("B", "E", "H"),
                      Col3 = c("C", "F", "I")),
                 .Names = c("Col1", "Col2", "Col3"),
                 row.names = c(NA, -3L), class = "data.frame")
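If more than two columns are involved, the same grepl idea can be extended without writing each condition by hand. The following is a minimal sketch in base R, not something taken from the answers above; the patterns vector (column name mapped to pattern) is a hypothetical helper introduced only for illustration.

```r
# Same contents as the df1 above, rebuilt so the sketch is self-contained
df1 <- data.frame(Col1 = c("A", "D", "G"),
                  Col2 = c("B", "E", "H"),
                  Col3 = c("C", "F", "I"),
                  stringsAsFactors = FALSE)

# Hypothetical mapping: name = column to search, value = pattern to look for
patterns <- c(Col2 = "B", Col3 = "F")

# One logical vector per column/pattern pair
hits <- Map(function(col, pat) grepl(pat, df1[[col]]),
            names(patterns), patterns)

# Combine the vectors element-wise with OR, then subset the rows
keep <- Reduce(`|`, hits)
result <- df1[keep, ]
```

Replacing `|` with `&` in the Reduce() call would keep only the rows in which every pattern matches its column.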

The data.table package makes this type of operation trivial due to its compact and readable syntax. Here is how you would perform the above using data.table:

> df1 <- structure(list(Col1 = c("A", "D", "G"), Col2 = c("B", "E", "H"
+ ), Col3 = c("C", "F", "I")), .Names = c("Col1", "Col2", "Col3"
+ ), row.names = c(NA, -3L), class = "data.frame")

> library(data.table)
> DT <- data.table(df1)
> DT
   Col1 Col2 Col3
1:    A    B    C
2:    D    E    F
3:    G    H    I

> DT[Col2 == 'B' | Col3 == 'F']
   Col1 Col2 Col3
1:    A    B    C
2:    D    E    F

data.table evaluates the expression inside [ ] within the scope of the table, much like base R's with(), so columns can be referred to directly by name. Note that lookups can be much faster if you set keys on the data, but that is a separate topic.
