
Removing a pattern with gsub in R

I have the string "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM" saved in supp_matches, and supp_matches1 contains the string "Project Change Request (PCR) - ".

supp_matches2 <- gsub("^.*[supp_matches1]","",supp_matches)
supp_matches2
# [1] " (PCR) - HONDA DIGITAL PLATEFORM"

That is not the result I expected. It should be:

supp_matches2
# [1] "HONDA DIGITAL PLATEFORM"

Why doesn't it return the expected result?

As I said in my comment, in your expression gsub("^.*[supp_matches1]", "", supp_matches) you are not actually using the object supp_matches1, only the letters of its name: inside square brackets they form a character class that matches any single one of those characters.
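To make the character-class behaviour concrete, here is a minimal sketch using the string from the question. The names res and res2 are only illustrative.

```r
supp_matches <- "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"

# "[supp_matches1]" is a character class: any ONE of s, u, p, _, m, a, t, c, h, e, 1.
# The greedy "^.*" runs up to the LAST such character in the string,
# which is the final lowercase "t" of "Request".
res <- sub("^.*[supp_matches1]", "", supp_matches)
res
# [1] " (PCR) - HONDA DIGITAL PLATEFORM"

# The same class written without the duplicate letters gives the same result,
# which shows that the object supp_matches1 is never consulted.
res2 <- sub("^.*[sup_matche1]", "", supp_matches)
identical(res, res2)
# [1] TRUE
```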

You could do something like gsub(paste0("^.*", supp_matches1), "", supp_matches) to really use the string contained in supp_matches1, except that, as mentioned by @rawr, the string contains parentheses, which are regex metacharacters, so you would need to escape them.
The correct expression to get what you want would then be sub("Project Change Request \\(PCR\\) - ", "", supp_matches)
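The effect of the parentheses can be checked directly. The sketch below assumes the two strings from the question; unescaped and escaped are illustrative names.

```r
supp_matches  <- "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"
supp_matches1 <- "Project Change Request (PCR) - "

# Unescaped: "(PCR)" is read as a capture group matching the text "PCR",
# so the pattern cannot match the literal parentheses and nothing is removed.
unescaped <- sub(paste0("^.*", supp_matches1), "", supp_matches)
unescaped
# [1] "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"

# Escaped: "\\(" and "\\)" in R source become \( and \) in the regex,
# which match literal parentheses.
escaped <- sub("^Project Change Request \\(PCR\\) - ", "", supp_matches)
escaped
# [1] "HONDA DIGITAL PLATEFORM"
```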

To get what you want, you can use the fixed argument of gsub (or sub), which tells the function to match the pattern as a literal string (so nothing needs to be escaped, but no regular-expression syntax is available either).

So what you are looking for is:

gsub(supp_matches1, "", supp_matches, fixed=TRUE) # or just with `sub` in this case
#[1] "HONDA DIGITAL PLATEFORM"
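As a rough illustration of the trade-off, the sketch below shows that with fixed=TRUE a regex anchor such as "^" is taken literally, and how sub and gsub differ when the prefix occurs more than once. The variables anchored, twice, first_only and all_gone are made up for this example.

```r
supp_matches  <- "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"
supp_matches1 <- "Project Change Request (PCR) - "

# With fixed = TRUE, "^" is an ordinary character, so this pattern matches nothing
# and the input comes back unchanged.
anchored <- sub(paste0("^", supp_matches1), "", supp_matches, fixed = TRUE)
identical(anchored, supp_matches)
# [1] TRUE

# A string in which the prefix appears twice
twice <- paste0(supp_matches1, supp_matches)

# sub removes only the first occurrence, gsub removes every occurrence
first_only <- sub(supp_matches1, "", twice, fixed = TRUE)
all_gone   <- gsub(supp_matches1, "", twice, fixed = TRUE)
first_only
# [1] "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"
all_gone
# [1] "HONDA DIGITAL PLATEFORM"
```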

@cathG already provided an answer using fixed=TRUE. If you want to do everything with a regex, you can try this:

> w1 <- "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"
> w2 <- "Project Change Request (PCR) - "
> sub(paste0("^", gsub("(\\W)", "\\\\\\1", w2)), "", w1)
[1] "HONDA DIGITAL PLATEFORM"

This simply escapes every non-word character in the variable before it is used as the pattern (the first argument) of sub.
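If the escaping is needed in several places, it could be wrapped in a small helper. The function name escape_regex below is hypothetical (it is not part of base R); it is only a sketch of the same gsub call applied to the strings from the question.

```r
# Hypothetical helper: put a backslash in front of every non-word character
escape_regex <- function(x) gsub("(\\W)", "\\\\\\1", x)

# Parentheses are escaped; letters, digits and "_" are left alone
escape_regex("(PCR)") == "\\(PCR\\)"
# [1] TRUE

w1 <- "Project Change Request (PCR) - HONDA DIGITAL PLATEFORM"
w2 <- "Project Change Request (PCR) - "

# Anchor the escaped prefix at the start of the string and remove it
result <- sub(paste0("^", escape_regex(w2)), "", w1)
result
# [1] "HONDA DIGITAL PLATEFORM"
```

Unlike fixed=TRUE, this keeps regex features such as the "^" anchor available alongside the literal text.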
