
R regex to match beginning and end of string, ignoring middle

In R, how can I write a regex that matches the beginning and the end of a string while ignoring everything in between?

Specifically, how can I grep from the following vector the strings that begin with "./xl/worksheets" and end with ".xml"?

myfiles <- c("./_rels/.rels", "./xl/_rels/workbook.xml.rels", 
"./xl/workbook.xml", "./xl/worksheets/sheet4.xml", 
"./xl/worksheets/_rels/sheet1.xml.rels", "./xl/worksheets/sheet2.xml", 
"./xl/printerSettings/printerSettings11.bin")

Each condition works on its own:

grep("^\\./xl/worksheets", myfiles) # returns 4 5 6
grep("\\.xml$", myfiles) # returns 3 4 6

And of course I can combine the two tests:

which(grepl("^\\./xl/worksheets", myfiles) &
  grepl("\\.xml$", myfiles)) # returns 4 6

But I can't figure out how to put a wildcard between the two patterns so that a single regex does the job.

Add the match-all pattern .* between the start and end patterns. The . matches any character and * repeats it zero or more times, so together they absorb everything in the middle:

grep("^\\./xl/worksheets.*\\.xml$", myfiles) 
# [1] 4 6
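
As a small sketch building on the pattern above: passing value = TRUE to grep returns the matching strings rather than their positions. If the prefix and suffix are fixed strings, as they are here, base R's startsWith() and endsWith() (available from R 3.3.0) avoid regex escaping altogether. The variable names pattern, idx, hits and idx_fixed are only illustrative and do not appear in the original question.

```r
myfiles <- c("./_rels/.rels", "./xl/_rels/workbook.xml.rels",
             "./xl/workbook.xml", "./xl/worksheets/sheet4.xml",
             "./xl/worksheets/_rels/sheet1.xml.rels", "./xl/worksheets/sheet2.xml",
             "./xl/printerSettings/printerSettings11.bin")

# Anchored prefix, anything in between, anchored suffix
pattern <- "^\\./xl/worksheets.*\\.xml$"

# Positions of the matching elements
idx <- grep(pattern, myfiles)

# The matching strings themselves
hits <- grep(pattern, myfiles, value = TRUE)

# Regex-free alternative for fixed prefixes and suffixes
idx_fixed <- which(startsWith(myfiles, "./xl/worksheets") &
                   endsWith(myfiles, ".xml"))

print(idx)
print(hits)
print(idx_fixed)
```

All three approaches select the same two worksheet files. The regex version is the more flexible one if the prefix or suffix ever needs to become a pattern instead of a literal string.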
