R: extract substring from end of pattern until first occurrence of a character
I've been struggling for hours to get this match-and-replace with gsub in R to work, still without success. I'm trying to match the pattern "Reason:" in a string and extract everything after it, up to the first occurrence of a dot (.). For instance:
Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.
would return "Not interested".
Here's a solution:
s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
sub(".*Reason: (.*?)\\..*", "\\1", s)
# [1] "Not interested"
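One caveat with this approach, shown as a small illustration rather than as part of the original answer (the variable names and the shortened input are made up for the example): when the pattern does not match at all, sub() returns the input string unchanged instead of NA or an empty string.

```r
# Hypothetical inputs for illustration
pattern <- ".*Reason: (.*?)\\..*"
with_reason    <- "Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
without_reason <- "no match example"

res_match    <- sub(pattern, "\\1", with_reason)
res_no_match <- sub(pattern, "\\1", without_reason)

print(res_match)     # "Not interested"
print(res_no_match)  # "no match example" (input returned unchanged)
```

So if some strings may lack "Reason:", the result of sub() alone cannot distinguish "no match" from a genuine extraction.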
Update (to address comments):
If you also have strings that do not match the pattern, I recommend using regexpr instead of sub:
s2 <- c("no match example",
"Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.")
match <- regexpr("(?<=Reason: ).*?(?=\\.)", s2, perl = TRUE)
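# regmatches() returns only the matched elements, so fill them in by position
# (ifelse() would recycle them and can misalign results on longer vectors)
result <- rep(NA_character_, length(s2))
result[match != -1] <- regmatches(s2, match)
result
# [1] NA               "Not interested"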
For your second example, you can use the following regex:
s3 <- "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267 on date 02/01/2014, from reason PAERR."
# a)
sub(".*type (.*?) for.*", "\\1", s3)
# [1] "Proof of Payment"
# b)
match <- regexpr("(?<=type ).*?(?= for)", s3, perl = TRUE)
ifelse(match == -1, NA, regmatches(s3, match))
# [1] "Proof of Payment"
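Both regexpr patterns above have the same shape: a lookbehind for the left marker, a lazy match, and a lookahead for the right marker. A minimal sketch of a reusable wrapper follows; extract_between() is a hypothetical helper name, not a base R function, and it assumes the markers are literal text (quoted with PCRE's \Q...\E) that does not itself contain "\E".

```r
# extract_between() is a hypothetical helper, not part of base R.
# left/right are treated as literal text via PCRE \Q...\E quoting
# (assumption: neither marker contains the sequence "\E").
extract_between <- function(x, left, right) {
  pattern <- paste0("(?<=\\Q", left, "\\E).*?(?=\\Q", right, "\\E)")
  m <- regexpr(pattern, x, perl = TRUE)
  hit <- !is.na(m) & m > -1L           # elements that actually matched
  out <- rep(NA_character_, length(x)) # NA for everything else
  out[hit] <- regmatches(x, m)         # matches come back in input order
  out
}

reasons <- extract_between(
  c("no match example",
    "Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."),
  "Reason: ", "."
)
ptype <- extract_between(
  "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267",
  "type ", " for"
)
print(reasons)  # NA "Not interested"
print(ptype)    # "Proof of Payment"
```

Assigning the matches by position keeps the output aligned with the input even when several elements match and others do not.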
There are lots of different ways to do this (as you can see from the other answers). I personally like to use stringr functions.
library(stringr)
rec <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
# column 2 of the match matrix holds the first capture group
str_match(rec, "Reason: ([a-zA-Z0-9 ]+)\\.")[, 2]
## [1] "Not interested"
This will work:
x <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
library(qdap)
genXtract(x, "Reason:", ".")
## Reason: : .
## " Not interested"
With regexpr and regmatches:
str <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
m <- regexpr("(?<=Reason: )[^.]+", str, perl = TRUE)
regmatches(str, m)
# [1] "Not interested"
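The negated class [^.]+ behaves slightly differently from a lazy match followed by a lookahead for the dot: it does not require a dot to follow. The sketch below illustrates this on a made-up input with no trailing dot (the variable names are hypothetical).

```r
# Hypothetical input: the reason runs to the end of the string, no trailing dot
no_dot <- "Disposition: DECLINED. Reason: Not interested"

m_class <- regexpr("(?<=Reason: )[^.]+", no_dot, perl = TRUE)
m_lazy  <- regexpr("(?<=Reason: ).*?(?=\\.)", no_dot, perl = TRUE)

res_class <- regmatches(no_dot, m_class)
res_lazy  <- regmatches(no_dot, m_lazy)

print(res_class)  # "Not interested"
print(res_lazy)   # character(0): no dot follows, so the lookahead fails
```

Which behavior is preferable depends on whether a missing terminating dot should count as a valid record.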