
Replacing patterns in a string

I have several strings in this format. The separator is a dash (-) and each "thing" in between is a marker.

string <- "FA-I2-I2-I2-EX-I2-I3-FA-I1-I2-TR-I1-I2-FA-I3-I1-FAFANR-I3-I2-TR-I1-I2-I1-I2-FA-I2-I1-I3-FAQU-I1-I2-I2-I2-NR-I2-I2-NR-I1-I2-I1-NR-I3-QU-I2-I3-QUNR-I2-I1-NRQUQU-I2-I1-EX"

I want to identify cases where markers containing the letter "I" (i.e. the markers I1, I2, and I3) occur in a row. Then I want to replace each such run with a version that has no separators. For example, the very beginning of the string should be converted as follows:

FA-I2I2I2-EX

So basically all I want to do is to remove all the dashes between markers containing "I".

Here's a somewhat convoluted solution:

string1 <- gsub(string, pattern = "I1", replacement = "ZI1Z")
string2 <- gsub(string1, pattern = "I2", replacement = "ZI2Z")
string3 <- gsub(string2, pattern = "I3", replacement = "ZI3Z")
string4 <- gsub(string3, pattern = "Z-Z", replacement = "")
string5 <- gsub(string4, pattern = "Z", replacement = "")

which gives:

"FA-I2I2I2-EX-I2I3-FA-I1I2-TR-I1I2-FA-I3I1-FAFANR-I3I2-TR-I1I2I1I2-FA-I2I1I3-FAQU-I1I2I2I2-NR-I2I2-NR-I1I2I1-NR-I3-QU-I2I3-QUNR-I2I1-NRQUQU-I2I1-EX"

Is there a more elegant way of accomplishing this?

So basically all I want to do is to remove all the dashes between markers containing "I".

You can use lookaround assertions if your case is as simple as it sounds.

gsub('(?<=I\\d)-(?=I\\d)', '', string, perl = TRUE)
# [1] "FA-I2I2I2-EX-I2I3-FA-I1I2-TR-I1I2-FA-I3I1-FAFANR-I3I2-TR-I1I2I1I2-FA-I2I1I3-FAQU-I1I2I2I2-NR-I2I2-NR-I1I2I1-NR-I3-QU-I2I3-QUNR-I2I1-NRQUQU-I2I1-EX"
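
To see why lookarounds matter here, the following is a minimal sketch on a shortened version of the string. The names collapse_i_runs and collapse_consuming are hypothetical helpers introduced only for illustration, and the sketch assumes every "I" marker is exactly the letter I followed by one digit, as in the question. A plain capture-group pattern consumes both neighbouring markers, so in a run of three the second dash is never seen; the lookaround version matches only the dash itself.

```r
# Hypothetical helper: remove a dash only when it sits between two I<digit> markers.
collapse_i_runs <- function(x) {
  # (?<=I\\d) : the dash must be preceded by "I" and one digit
  # (?=I\\d)  : the dash must be followed by "I" and one digit
  # Lookarounds consume no characters, so only the dash is replaced.
  gsub("(?<=I\\d)-(?=I\\d)", "", x, perl = TRUE)
}

# Hypothetical contrast: capture groups consume both markers around the dash.
collapse_consuming <- function(x) {
  gsub("(I\\d)-(I\\d)", "\\1\\2", x, perl = TRUE)
}

short <- "FA-I2-I2-I2-EX-I2-I3-FA"

collapse_i_runs(short)
# [1] "FA-I2I2I2-EX-I2I3-FA"

collapse_consuming(short)
# [1] "FA-I2I2-I2-EX-I2I3-FA"   (one dash left inside the run of three)
```

Note that the lookbehind only checks the two characters before the dash, so a marker that merely ends in I plus a digit would also match. None of the markers in the question do, but stricter input may need a pattern anchored on the preceding dash or the start of the string.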
