R - error in separating text from a string using regex and ifelse condition
Regex or condition for text in R
Say I have these two texts:
1) "Project:ABC is located near CBA, being too far from city "
2) "P r o j e c t : PQR is located near RQP, highlights some greenary"
I want to extract the text between the word "Project" and the "," so that my output is "ABC is located near CBA" for text 1 and "PQR is located near RQP" for text 2. For this I used a regex:
x="Project:ABC is located near CBA, being too far from city "
sub(".*Project: *(.*?) *, .*", "\\1", x)
Output:
ABC is located near CBA
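But for text 2) it does not give the correct output. How can I include an OR condition so that both of my cases are handled? Any suggestions would be helpful. Thanks.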
You can use a regular expression with lookahead and lookbehind assertions. Here is a small example using the stringr package:
Vec <- c("Project:ABC is located near CBA, being too far from city",
"P r o j e c t : PQR is located near RQP, highlights some greenary")
library(stringr)
str_extract(Vec, "(?<=:).*(?=,)")
#> [1] "ABC is located near CBA" " PQR is located near RQP"
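If your input is more complex, you should tighten the regular expression, because it may not be strict enough: since .* is greedy, it currently captures everything between the first : and the last , in the string.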
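The greediness caveat can be illustrated with a minimal base R sketch. The input string with two commas and the helper name first_match are assumptions made up for this illustration; regmatches() and regexpr() with perl = TRUE stand in for stringr's str_extract().

```r
# Hypothetical input with two commas after the colon.
x <- "Project:ABC is located near CBA, being too far, from city"

# first_match is an illustrative helper name: it returns the first
# match of a Perl-compatible pattern in a single string.
first_match <- function(pattern, s) {
  regmatches(s, regexpr(pattern, s, perl = TRUE))
}

# Greedy .* runs on to the last comma in the string.
greedy <- first_match("(?<=:).*(?=,)", x)

# [^,]* cannot cross a comma, so it stops at the first one.
strict <- first_match("(?<=:)[^,]*(?=,)", x)

print(greedy)
print(strict)
```

Replacing .* with [^,]* is one possible way to tighten the pattern; a lazy quantifier (.*?) would probably serve as well.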
Make your regex more flexible: [^:]+:\\s*([^,]+),.*
> sub("[^:]+:\\s*([^,]+),.*", "\\1", "P r o j e c t : PQR is located near RQP, highlights some greenary")
[1] "PQR is located near RQP"
and
> sub("[^:]+:\\s*([^,]+),.*", "\\1", "Project:ABC is located near CBA, being too far from city ")
[1] "ABC is located near CBA"
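One thing to keep in mind with sub() is that it returns the input unchanged when the pattern does not match. The sketch below applies the same pattern idea to a whole vector and returns NA for non-matching elements instead. The function name extract_after_colon and the third input string are assumptions for illustration only.

```r
# extract_after_colon is a hypothetical helper name, not part of base R.
extract_after_colon <- function(x) {
  # regexec() returns the full match plus the capture group per element.
  m <- regmatches(x, regexec("^[^:]+:\\s*([^,]+),", x))
  # Element 2 is the capture group; no match yields character(0).
  vapply(m, function(p) if (length(p) >= 2) p[2] else NA_character_,
         character(1))
}

inputs <- c(
  "Project:ABC is located near CBA, being too far from city ",
  "P r o j e c t : PQR is located near RQP, highlights some greenary",
  "no delimiters here"  # made-up input without ':' or ','
)

result <- extract_after_colon(inputs)
print(result)
```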
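Base R
One option is gsub: match any characters (.*) up to the :, followed by zero or more whitespace characters (\\s*), or (|) a , followed by any other characters, and replace the match with an empty string ("").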
gsub(".*:\\s*|,.*", "", Vec)
#[1] "ABC is located near CBA" "PQR is located near RQP"
If we need to match the word Project (allowing optional spaces between its letters) followed by a colon:
Vec <- c("Project:ABC is located near CBA, being too far from city",
"P r o j e c t : PQR is located near RQP, highlights some greenary",
"Project: Ganga gnd A3 And 3.., Plot Bearing / CTS / Survey / Final Plot No.: Sr No"
)
pat <- paste0(gsub("", "\\\\s*", "Project"), ":\\s*|\\s*,.*")
gsub(pat, "", Vec)
#[1] "ABC is located near CBA" "PQR is located near RQP" "Ganga gnd A3 And 3.."
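To see why anchoring on the word Project matters for the third string, here is a small sketch that compares the two patterns. It builds the letter-by-letter prefix with strsplit() and paste() rather than gsub("", ...), which is a swapped-in construction chosen for readability; the variable names are illustrative.

```r
vec <- c(
  "Project:ABC is located near CBA, being too far from city",
  "P r o j e c t : PQR is located near RQP, highlights some greenary",
  "Project: Ganga gnd A3 And 3.., Plot Bearing / CTS / Survey / Final Plot No.: Sr No"
)

# The simple pattern: greedy .* consumes everything up to the LAST colon,
# so the third string loses the part we actually want.
simple <- gsub(".*:\\s*|,.*", "", vec[3])

# Allow optional whitespace between the letters of "Project".
prefix <- paste(strsplit("Project", "")[[1]], collapse = "\\s*")
pat <- paste0(prefix, "\\s*:\\s*|\\s*,.*")

# Remove the "Project:" prefix and everything from the first comma on.
anchored <- gsub(pat, "", vec)

print(simple)
print(anchored)
```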
If the word Project is not important:
> text
[1] "Project:ABC is located near CBA, being too far from city "
> substr(text,grep(":",strsplit(text,'')[[1]]),grep(",",strsplit(text,'')[[1]]))
[1] ":ABC is located near CBA,"
> substr(text,grep(":",strsplit(text,'')[[1]])+1,grep(",",strsplit(text,'')[[1]])-1)
[1] "ABC is located near CBA"
> text <- "P r o j e c t : PQR is located near RQP, highlights some greenary"
> substr(text,grep(":",strsplit(text,'')[[1]])+1,grep(",",strsplit(text,'')[[1]])-1)
[1] " PQR is located near RQP"
This should work fine!
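The substr approach above relies on each string containing exactly one : and one , because grep() on the split characters returns every matching position. A possible vectorized variant, sketched here under the assumption that only the first : and the first , matter, uses regexpr() with fixed = TRUE. The function name between_first is hypothetical.

```r
# between_first is an illustrative helper, not an existing R function.
# It assumes the first `open` character comes before the first `close`.
between_first <- function(x, open = ":", close = ",") {
  # regexpr() gives the position of the first match, or -1 if none.
  start <- as.integer(regexpr(open, x, fixed = TRUE))
  end <- as.integer(regexpr(close, x, fixed = TRUE))
  out <- trimws(substr(x, start + 1L, end - 1L))
  # Mark elements that lack either delimiter as missing.
  out[start < 0L | end < 0L] <- NA_character_
  out
}

texts <- c(
  "Project:ABC is located near CBA, being too far from city ",
  "P r o j e c t : PQR is located near RQP, highlights some greenary",
  "Project: Ganga gnd A3 And 3.., Plot Bearing / CTS / Survey / Final Plot No.: Sr No"
)

extracted <- between_first(texts)
print(extracted)
```

trimws() also removes the leading space that the original version leaves in front of "PQR".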