R: extract substring from end of pattern until first occurrence of character

I've been struggling for hours to get this match-and-replace to work with gsub in R, still without success. I'm trying to match the pattern "Reason:" in a string and extract everything AFTER this pattern up to the first occurrence of a dot ( . ). For instance:

Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.

would return "Not interested".

Here's a solution:

s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

sub(".*Reason: (.*?)\\..*", "\\1", s)
# [1] "Not interested"
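
One thing worth knowing about this approach: sub() returns its input unchanged when the pattern does not match, so a string without "Reason:" comes back whole rather than as NA. A minimal sketch, where the second input string is made up for illustration:

```r
s <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

# Hypothetical second element that does not contain "Reason:"
inputs <- c(s, "no match example")

# Same pattern as above; the non-matching element is returned as-is
result <- sub(".*Reason: (.*?)\\..*", "\\1", inputs)
result
# [1] "Not interested"   "no match example"
```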

Update (to address comments):

If you also have strings that do not match the pattern, I recommend using regexpr instead of sub:

s2 <- c("no match example",
        "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE.")

match <- regexpr("(?<=Reason: ).*?(?=\\.)", s2, perl = TRUE)
ifelse(match == -1, NA, regmatches(s2, match))
# [1] NA               "Not interested"
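
A caveat on the ifelse() idiom: regmatches() returns only the matched substrings, so its result is shorter than the input whenever some element fails to match, and ifelse() then recycles it. With a single matching element that happens to work, but with several matches separated by non-matches the values can land in the wrong positions. A sketch of a position-safe variant, assuming made-up input strings and a hypothetical result vector named out:

```r
# Made-up inputs: match, no match, match
x <- c("Reason: first. Tail.",
       "no match example",
       "Note. Reason: second. Tail.")

m <- regexpr("(?<=Reason: ).*?(?=\\.)", x, perl = TRUE)

# Start with NA everywhere, then fill in only the positions that matched
out <- rep(NA_character_, length(x))
out[m != -1] <- regmatches(x, m)
out
# [1] "first"  NA       "second"
```

Assigning by logical index keeps each extracted value aligned with the string it came from.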

For your second example, you can use the following regex:

s3 <- "Delete Payment Arrangement of type Proof of Payment for BAN : 907295267 on date 02/01/2014, from reason PAERR."

# a)
sub(".*type (.*?) for.*", "\\1", s3)
# [1] "Proof of Payment"

# b)
match <- regexpr("(?<=type ).*?(?= for)", s3, perl = TRUE)
ifelse(match == -1, NA, regmatches(s3, match))
# [1] "Proof of Payment"

There are lots of different ways to do this (as you can see from the other answers). I personally like to use stringr functions.

library(stringr)

rec <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
# str_match() returns a matrix: column 1 is the full match, column 2 the first capture group
str_match(rec, "Reason: ([a-zA-Z0-9 ]+)\\.")[, 2]
## [1] "Not interested"

This will work:

x <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."

library(qdap)
genXtract(x, "Reason:", ".")

##     Reason:  :  . 
## " Not interested" 

With regexpr and regmatches:

str <- "Offer Disposition. MSISDN: 7183067962. Offer: . Disposition: DECLINED. Reason: Not interested. ChannelID: CARE."
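# Lookbehind anchors the match right after "Reason: "; [^.]+ stops before the first dot
m <- regexpr("(?<=Reason: )[^.]+", str, perl = TRUE)
regmatches(str, m)
# [1] "Not interested"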
