
Gsub regex replacement

I am trying to do a gsub replacement in R. I would like to find pairs of terms, one from each of two lists, that are separated by a single whitespace character, and replace that whitespace with an underscore. My pattern matches correctly, but I am not experienced enough in regex to understand the gsub documentation on the replacement argument. Can somebody help me write the gsub call?

Right now I have:

gsub("(a|b|c)\\s+(x|y|z)", "(a|b|c)_(x|y|z)", "a x")

(Note: the string contains several places that match this pattern, if that matters.)

I want to go from:
a x -> a_x
b z -> b_z
hello world b x how are a z you -> hello world b_x how are a_z you
... and so on.

Instead it does:
a x -> (a|b|c)_(x|y|z)
b z -> (a|b|c)_(x|y|z)
... and so on.

If anyone wants to add a little theory, that would be appreciated, but I'm working on a deadline, so a direct answer alongside it would be ideal.

Thanks.

You have to use \\1 and \\2 in the replacement string to refer back to whatever the first and second () groups matched. The replacement argument is not a pattern, so writing (a|b|c) there inserts those characters literally.

vec <- "hello world b x how are a z you"

gsub("(a|b|c)\\s+(x|y|z)","\\1_\\2", vec)
# [1] "hello world b_x how are a_z you"
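
Since the question mentions two lists of terms, here is a minimal sketch of building the same pattern from vectors instead of typing the alternation by hand. The names first_terms and second_terms are hypothetical, and the sketch assumes the terms are plain alphanumeric strings with no regex metacharacters. It also adds \\b word boundaries, which the answer above does not use, so that a term is only matched when it stands as a whole word; perl = TRUE is chosen here only to make the \\b behaviour explicit.

```r
# Hypothetical term lists (illustrative names, not from the question).
first_terms  <- c("a", "b", "c")
second_terms <- c("x", "y", "z")

# Build "\\b(a|b|c)\\s+(x|y|z)\\b" from the two vectors.
# Assumes the terms contain no regex metacharacters.
pattern <- paste0(
  "\\b(", paste(first_terms, collapse = "|"), ")",
  "\\s+",
  "(", paste(second_terms, collapse = "|"), ")\\b"
)

vec <- c("a x", "b z", "hello world b x how are a z you", "data x")

# \\1 and \\2 re-insert the text captured by the first and second groups.
result <- gsub(pattern, "\\1_\\2", vec, perl = TRUE)
print(result)
```

Without the boundaries, the last element "data x" would become "data_x", because the trailing "a" of "data" followed by " x" satisfies the original pattern. With them it is left unchanged, which may or may not be what you want for your data.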
