Using R Regex to identify two characters followed by a dash and two numbers

Very obnoxious regex question incoming. I have a column that I am trying to split into two based on a condition: I'd like a new column to be created when there are two characters followed by a dash and two numbers (e.g., CA-01).

My code is:

mydf %>% extract(col = pilot_id, regex = "[az]{2}.d{2}", into = 'facility_test')

Where the column I'd like to identify the pattern in is pilot_id, and the new column I'd like to make is facility_test.

We need a capture group in the regex passed to extract: the part wrapped in parentheses is what ends up in the new column.

library(dplyr)
library(tidyr)
mydf %>%
  extract(col = pilot_id,  regex = ".*-([A-Z]{2}-\\d{2})\\s.*", 
     into = 'facility_test')

# A tibble: 1 x 1
#    facility_test
#  <chr>        
#1 FL-03       
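
The pattern above also spells out the text around the code (`.*-` before it and `\\s.*` after it), which ties it to this particular string layout. As a minimal sketch, assuming the facility code is always two uppercase letters, a hyphen, and two digits, the capture group alone may be enough, since extract() appears to take the first match in each string. The `remove = FALSE` argument is an optional addition here, not part of the answer; it keeps pilot_id next to the new column.

```r
library(dplyr)
library(tidyr)

# Same one-row example data as in the answer
mydf <- tibble(pilot_id = "TGT Track -FL-03 (Hilsborough County) 3/3/2021")

# The parentheses form the capture group that extract() copies into facility_test
result <- mydf %>%
  extract(col = pilot_id,
          into = "facility_test",
          regex = "([A-Z]{2}-\\d{2})",
          remove = FALSE)  # assumption: we want to keep the original column

result
```

If the codes can also be lowercase, the character class would need to be widened (for example `[A-Za-z]`); that is a guess about the data, not something the question states.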

Data

mydf <- tibble(pilot_id = "TGT Track -FL-03 (Hilsborough County) 3/3/2021")
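
To check the pattern on its own before wiring it into a pipeline, a base R sketch like the following may help. The second string is a made-up value with no facility code, included only to show that non-matching rows end up as NA; the variable names are illustrative.

```r
ids <- c("TGT Track -FL-03 (Hilsborough County) 3/3/2021",
         "no facility code here")  # hypothetical non-matching value

pattern <- "[A-Z]{2}-\\d{2}"

# regexpr() returns the position of the first match, or -1 when there is none
hits <- regexpr(pattern, ids, perl = TRUE)

# Start with NA everywhere, then fill in only the elements that matched
facility <- rep(NA_character_, length(ids))
facility[hits > 0] <- regmatches(ids, hits)

facility
```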
