Different value to substitute depending on character / numerical values surrounding: gsub R

I have a sample vector as follows:

vec1 <- c('3/4 in. of water', 'Indoor/Outdoor applications')

Now, I would like to replace '/' with ' by ' if the characters surrounding '/' are digits, and with ' ' if the characters surrounding '/' are letters.

I know the regex to match is one of:

gsub('\\d+\\/\\d+', 'by', vec1)
gsub('\\w+\\/\\w+', 'by', vec1)

However, they give the following results:

"by in. of water"
"by in. of water" "by applications"

I would like the result to be as follows:

'3 by 4 in. of water',  'Indoor Outdoor applications'

I would appreciate any input on how to get these results.

Thanks!

You can use PCRE regex patterns for that. The (?<=\\d)/(?=\\d) pattern matches forward slashes that are enclosed by digits. The /(?!\\d)|(?<!\\d)/ pattern matches either a slash that has no digit on the right, or a slash that has no digit on the left.

Here is a solution with gsub:

> vec1 <- c('3/4 in. of water', 'Indoor/Outdoor applications') 
> gsub("/(?!\\d)|(?<!\\d)/", " ", gsub("(?<=\\d)/(?=\\d)", " by ", vec1, perl=T), perl=T)
[1] "3 by 4 in. of water"         "Indoor Outdoor applications"
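To see what each pattern does on its own, the nested call can be split into two steps. This is only a sketch: the vector mixed and the names step1 and result are made up here for illustration, and the inputs '3/b' and 'a/4' are hypothetical cases that do not appear in the question.

```r
# Hypothetical inputs covering all four digit/letter combinations around '/'
mixed <- c('3/4', 'a/b', '3/b', 'a/4')

# Step 1: a slash with a digit on BOTH sides becomes " by ".
# Lookarounds are zero-width, so the digits themselves are not consumed.
step1 <- gsub("(?<=\\d)/(?=\\d)", " by ", mixed, perl = TRUE)

# Step 2: any remaining slash that lacks a digit on at least one side
# becomes a single space.
result <- gsub("/(?!\\d)|(?<!\\d)/", " ", step1, perl = TRUE)

print(step1)
print(result)
```

Note that a slash with a digit on only one side (as in '3/b' or 'a/4') falls under the second pattern and is replaced by a space.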

You can use mgsub from the qdap package and define the pattern and replacement character vectors.

See sample code:

> library(qdap)
> vec1 <- c('3/4 in. of water', 'Indoor/Outdoor applications') 
> repl <- c(' by ', ' ') 
> patt <- c('(?<=\\d)/(?=\\d)', '/(?!\\d)|(?<!\\d)/')
> mgsub(patt, repl, vec1, fixed=FALSE, perl=T)
## [1] "3 by 4 in. of water" "Indoor Outdoor applications"
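If installing qdap is not an option, roughly the same effect can be had in base R by looping over the pattern and replacement vectors. This is a sketch under the assumption that the patterns should be applied one after another in the order given; it is not a reimplementation of mgsub, and the variable result is introduced here for illustration.

```r
vec1 <- c('3/4 in. of water', 'Indoor/Outdoor applications')
patt <- c('(?<=\\d)/(?=\\d)', '/(?!\\d)|(?<!\\d)/')
repl <- c(' by ', ' ')

# Apply each pattern/replacement pair in turn to the running result
result <- vec1
for (i in seq_along(patt)) {
  result <- gsub(patt[i], repl[i], result, perl = TRUE)
}

print(result)
```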

gsub('(\\d+)\\/(\\d+)', '\\1 by \\2', vec1)

gsub('(\\w+)\\/(\\w+)', '\\1 \\2', vec1)

The parentheses around the \\d+ and \\w+ mean, "capture whatever is in these parentheses so we can use it later".

The first set of parentheses can be referred to later as \1, the second set as \2, and so on (and since the backslash must itself be escaped inside an R string, these are written \\1 and \\2).

When we go on to say what we want to replace our match with, we can refer to those "captured" portions of the match, as in the replacement strings of the two calls at the top.
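To get the desired output for both elements, the two capture-group substitutions can be chained. The sketch below assumes the digit rule should run first; the names digits_done, result and wrong_order are introduced here for illustration. Order matters because \\w also matches digits, so running the word rule first would turn '3/4' into '3 4'.

```r
vec1 <- c('3/4 in. of water', 'Indoor/Outdoor applications')

# Digit rule first: '3/4' becomes '3 by 4', leaving no slash for the word rule
digits_done <- gsub('(\\d+)/(\\d+)', '\\1 by \\2', vec1)

# Word rule second: handles the remaining 'Indoor/Outdoor'
result <- gsub('(\\w+)/(\\w+)', '\\1 \\2', digits_done)

# For comparison: the word rule applied directly also rewrites the fraction
wrong_order <- gsub('(\\w+)/(\\w+)', '\\1 \\2', vec1)

print(result)
print(wrong_order)
```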
