I have data which looks like this
data <- data.frame(
ID_num = c("BGR9876", "BNG3421", "GTH4567", "YOP9824", "Child 1", "2JAZZ", "TYH7654"),
date_created = "19/07/1983"
)
I would like to filter the dataframe so that I only keep the rows where ID_num follows the pattern ABC1234. I am new to using regular expressions in grep, and I am getting this wrong. This is what I am trying
data_clean <- data %>%
filter(grep("[A-Z]{3}[1:9]{4}", ID_num))
This gives me the following error: Error in filter_impl(.data, quo) : Argument 2 filter condition does not evaluate to a logical vector
This is my desired output
data_clean <- data.frame(
ID_num = c("BGR9876", "BNG3421", "GTH4567", "YOP9824", "TYH7654"),
date_created = "19/07/1983"
)
Thanks
The 1:9 should be 1-9, and the function should be grepl, along with ^ to specify the start of the string and $ for the end of the string:
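As a small illustration of why 1:9 fails inside brackets, the sketch below compares the two character classes on a made-up test vector (x is a hypothetical example, not part of the question's data). Inside a bracket expression, 1:9 simply lists the characters 1, : and 9, whereas 1-9 is a range.

```r
# Hypothetical test vector: one element per character of interest
x <- c("1", ":", "9", "5")

# "[1:9]" lists three literal characters: "1", ":" and "9"
colon_class <- grepl("[1:9]", x)

# "[1-9]" is a range covering every digit from 1 to 9
range_class <- grepl("[1-9]", x)

print(colon_class)  # TRUE TRUE TRUE FALSE
print(range_class)  # TRUE FALSE TRUE TRUE
```

So "5" is missed by the colon version, and a literal colon is accepted by it.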
library(dplyr)
data %>%
filter(grepl("^[A-Z]{3}[1-9]{4}$", ID_num))
# ID_num date_created
#1 BGR9876 19/07/1983
#2 BNG3421 19/07/1983
#3 GTH4567 19/07/1983
#4 YOP9824 19/07/1983
#5 TYH7654 19/07/1983
filter expects a logical vector; grep returns a numeric index, while grepl returns a logical vector.
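To make the difference concrete, here is a minimal sketch on a shortened, hypothetical vector of IDs (ids is an assumption for illustration, taken from three of the values in the question):

```r
# Hypothetical subset of the IDs from the question
ids <- c("BGR9876", "Child 1", "TYH7654")
pattern <- "^[A-Z]{3}[1-9]{4}$"

# grep returns the positions of the matching elements
idx <- grep(pattern, ids)

# grepl returns one TRUE/FALSE per element
lgl <- grepl(pattern, ids)

print(idx)  # 1 3
print(lgl)  # TRUE FALSE TRUE
```

The logical vector has the same length as the input, which is what filter needs; the index vector only has as many entries as there are matches.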
Or, if we want to use grep, use slice, which expects a numeric index:
data %>%
slice(grep("^[A-Z]{3}[1-9]{4}$", ID_num))
A similar option in the tidyverse would be to use str_detect:
library(stringr)
data %>%
filter(str_detect(ID_num, "^[A-Z]{3}[1-9]{4}$"))
In base R, we can do:
subset(data, grepl("^[A-Z]{3}[1-9]{4}$", ID_num))
Or with Extract (the [ operator):
data[grepl("^[A-Z]{3}[1-9]{4}$", data$ID_num),]
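Note that the anchored pattern matches only strings that consist entirely of 3 upper case letters followed by 4 digits from 1 to 9 (use [0-9] instead if a 0 can occur in the ID). Without ^ and $, a longer string that merely contains the pattern also matches: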
grepl("[A-Z]{3}[1-9]{4}", "ABGR9876923")
#[1] TRUE
grepl("^[A-Z]{3}[1-9]{4}$", "ABGR9876923")
#[1] FALSE
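One caveat worth checking against your own data: [1-9] does not include 0. The sketch below uses a hypothetical ID containing a zero (id_with_zero is not in the question's data) to show the difference between [1-9] and [0-9].

```r
# Hypothetical ID with a zero among its digits
id_with_zero <- "ABC1204"

# [1-9] rejects the ID because of the 0
no_zero <- grepl("^[A-Z]{3}[1-9]{4}$", id_with_zero)

# [0-9] accepts any decimal digit
with_zero <- grepl("^[A-Z]{3}[0-9]{4}$", id_with_zero)

print(no_zero)    # FALSE
print(with_zero)  # TRUE
```

If zeros are valid in your IDs, [0-9] (or \\d, as in the next answer) is probably the safer choice.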
We can use grepl with the pattern:
data[grepl("[A-Z]{3}\\d{4}", data$ID_num), ]
# ID_num date_created
#1 BGR9876 19/07/1983
#2 BNG3421 19/07/1983
#3 GTH4567 19/07/1983
#4 YOP9824 19/07/1983
#7 TYH7654 19/07/1983
Or in filter
library(dplyr)
data %>% filter(grepl("[A-Z]{3}\\d{4}", ID_num))