R: extract a substring from the end of a pattern up to the first occurrence of a character

Question

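I have been struggling with this match, and my attempts with gsub in R still have not worked. I am trying to match the pattern "Reason:" in a string and grab everything after it, up to the first occurrence of a dot (.). For example: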

Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.

should return "Not interested".

Answer 1

Here is one solution:

s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

sub(".*Reason: (.*?)\\..*", "\\1", s)
# [1] "Not interested"
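The lazy quantifier `(.*?)` is what stops the capture at the first dot. The sketch below is an illustration rather than part of the original answer: it compares the lazy and greedy forms, and shows that sub() returns its input unchanged when the pattern does not match. It passes perl = TRUE only so that the quantifier semantics are the familiar Perl ones; the names lazy, greedy and no_match are made up for this example.

```r
s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

# Lazy capture: stops at the first dot after "Reason: "
lazy <- sub(".*Reason: (.*?)\\..*", "\\1", s, perl = TRUE)
lazy
# [1] "Not interested"

# Greedy capture: runs on to the last dot in the string
greedy <- sub(".*Reason: (.*)\\..*", "\\1", s, perl = TRUE)
greedy
# [1] "Not interested. ChannelID: CARE"

# No match: sub() hands back the input untouched, not NA
no_match <- sub(".*Reason: (.*?)\\..*", "\\1", "no match example", perl = TRUE)
no_match
# [1] "no match example"
```

The last case is why sub() alone is awkward when some strings lack the pattern: an unmatched record cannot be told apart from an extracted value.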

Update (in response to a comment):

If you also have strings that do not match the pattern, I suggest using regexpr instead of sub:

s2 <- c("no match example",
        "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.")

match <- regexpr("(?<=Reason: ).*?(?=\\.)", s2, perl = TRUE)
ifelse(match == -1, NA, regmatches(s2, match))
# [1] NA               "Not interested"
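One caveat with the ifelse() idiom: regmatches() returns only the elements that matched, so its result is shorter than the input and ifelse() recycles it. With a single matching string that happens to work, but with several matches the values can land in the wrong positions. A minimal sketch of the problem and one possible fix, using made-up strings (recs, naive and out are illustrative names):

```r
recs <- c("Reason: A. x", "no match example", "Reason: B. y")
m <- regexpr("(?<=Reason: ).*?(?=\\.)", recs, perl = TRUE)

# regmatches() yields c("A", "B"); ifelse() recycles it to c("A", "B", "A"),
# so the third element receives "A" instead of "B"
naive <- ifelse(m == -1, NA, regmatches(recs, m))
naive
# [1] "A" NA  "A"

# Safer: start from an all-NA vector and fill in only the matched positions
out <- rep(NA_character_, length(recs))
out[m != -1] <- regmatches(recs, m)
out
# [1] "A" NA  "B"
```

Indexing by `m != -1` keeps each extracted value aligned with the string it came from.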

For your second example, you can use the following regular expressions:

s3 <- "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267 on date 02/01/2014, from reason PAERR."

# a)
sub(".*type (.*?) for.*", "\\1", s3)
# [1] "Proof of Payment"

# b)
match <- regexpr("(?<=type ).*?(?= for)", s3, perl = TRUE)
ifelse(match == -1, NA, regmatches(s3, match))
# [1] "Proof of Payment"
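Both examples follow the same shape: a literal left delimiter, a lazy match, and a literal right delimiter. As a sketch only, they could be wrapped in one helper. The function extract_between() below is hypothetical, not something from base R or from this answer. It assumes both delimiters are literal text and quotes them with PCRE's `\Q...\E`, so it would break if a delimiter itself contained `\E`.

```r
# Hypothetical helper: text between literal `left` and the first following `right`
extract_between <- function(x, left, right) {
  # \Q...\E makes PCRE treat the delimiters literally (e.g. "." is a plain dot)
  pattern <- paste0("(?<=\\Q", left, "\\E).*?(?=\\Q", right, "\\E)")
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))  # NA where nothing matched
  out[m != -1] <- regmatches(x, m)
  out
}

s  <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
s3 <- "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267 on date 02/01/2014, from reason PAERR."

extract_between(s, "Reason: ", ".")
# [1] "Not interested"
extract_between(s3, "type ", " for")
# [1] "Proof of Payment"
extract_between("no match example", "Reason: ", ".")
# [1] NA
```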

Answer 2

There are many different ways to do this (as the other answers show). I personally like using the stringr functions.

library(stringr)

rec <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
# Column 2 of the match matrix holds the first capture group
str_match(rec, "Reason: ([a-zA-Z0-9 ]+)\\.")[, 2]
## [1] "Not interested"
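For readers who would rather not load a package, base R's regexec() gives similar access to capture groups. The sketch below is an assumption-laden alternative, not part of the original answer: it swaps in the character class `[^.]+` (anything but a dot) for the alphanumeric class, and the names recs, groups and reason are illustrative.

```r
recs <- c("no match example",
          "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.")

# regexec() + regmatches() returns, per string, the full match followed by
# each capture group, or character(0) when the string does not match
groups <- regmatches(recs, regexec("Reason: ([^.]+)\\.", recs))

# Take the first capture group, or NA when there was no match
reason <- vapply(groups,
                 function(g) if (length(g) >= 2) g[2] else NA_character_,
                 character(1))
reason
# [1] NA               "Not interested"
```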

Answer 3

This will work:

x <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

library(qdap)
genXtract(x, "Reason:", ".")

##     Reason:  :  . 
## " Not interested"

Answer 4

Using regexpr and regmatches:

str <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
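# Lookbehind anchors just after "Reason: "; [^.]+ takes everything up to the next dot
m <- regexpr("(?<=Reason: )[^.]+", str, perl = TRUE)
regmatches(str, m)
# [1] "Not interested"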
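The negated class `[^.]+` and the lazy `.*?(?=\\.)` used in Answer 1 agree on the example, but they differ when the reason is empty, as in the "Offer: ." field of the sample record. A small sketch of that edge case, using a made-up record (empty_rec, m_plus and m_lazy are illustrative names):

```r
empty_rec <- "Disposition: DECLINED. Reason: . ChannelID: CARE."

# [^.]+ needs at least one non-dot character, so an empty reason is a non-match
m_plus <- regexpr("(?<=Reason: )[^.]+", empty_rec, perl = TRUE)
regmatches(empty_rec, m_plus)
# character(0)

# The lazy form can match zero characters, so it yields an empty string
m_lazy <- regexpr("(?<=Reason: ).*?(?=\\.)", empty_rec, perl = TRUE)
regmatches(empty_rec, m_lazy)
# [1] ""
```

Which behaviour is preferable depends on whether an empty field should count as missing or as an empty value.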

4 answers

Answer 1: score 6, accepted, 2014-03-15 20:02:28
Answer 2: score 2, 2014-03-15 20:09:11
Answer 3: score 0, 2014-03-15 20:03:49
Answer 4: score 0, 2014-03-15 20:32:04