Multiple overlapping regex matches instead of one
Consider this string:
data <- "1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-I1-1-I2-1-TR-1-I1-1-I2-1-FA-1-I3-1-I1-1-FA-1-FA-1-NR-1-I3-1-I2-1-TR-1-I1-1-I2-1-I1-1-I2-1-FA-1-I2-1-I1-1-I3-1-FA-1-QU-1-I1-1-I2-1-I2-1-I2-1-NR-1-I2-1-I2-1-NR-1-I1-1-I2-1-I1-1-NR-1-I3-1-QU-1-I2-1-I3-1-QU-1-NR-1-I2-1-I1-1-NR-1-QU-1-QU-1-I2-1-I1-1-EX"
and this regex:
"(I3).{1,}(I3)"
Because .{1,} is greedy, this matches everything between the first I3 and the last I3. How should I modify the regex to match each separate section that begins and ends with I3? For example:
I3-1-FA-1-I1-1-I2-1-TR-1-I1-1-I2-1-FA-1-I3
I3-1-I1-1-FA-1-FA-1-NR-1-I3
I3-1-I2-1-TR-1-I1-1-I2-1-I1-1-I2-1-FA-1-I2-1-I1-1-I3
I3-1-FA-1-QU-1-I1-1-I2-1-I2-1-I2-1-NR-1-I2-1-I2-1-NR-1-I1-1-I2-1-I1-1-NR-1-I3
I3-1-QU-1-I2-1-I3
You can use strsplit with gsub like this:
data <- "1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-I1-1-I2-1-TR-1-I1-1-I2-1-FA-1-I3-1-I1-1-FA-1-FA-1-NR-1-I3-1-I2-1-TR-1-I1-1-I2-1-I1-1-I2-1-FA-1-I2-1-I1-1-I3-1-FA-1-QU-1-I1-1-I2-1-I2-1-I2-1-NR-1-I2-1-I2-1-NR-1-I1-1-I2-1-I1-1-NR-1-I3-1-QU-1-I2-1-I3-1-QU-1-NR-1-I2-1-I1-1-NR-1-QU-1-QU-1-I2-1-I1-1-EX"
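# Drop the text before each I3, keep the section up to the next I3,
# then append a closing I3 and the separator §
data <- gsub(".*?(I3.*?)(?=I3)", "\\1I3§", data, perl = TRUE)
# Remove the leftover text after the last §, then split on §
strsplit(gsub("[^§]*$", "", data), "§")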
The .*?(I3.*?)(?=I3) regex (with the "\\1I3§" replacement) removes all text before each I3...I3 section, appends a backup I3 so that every entry in the output is fully enclosed by I3, and adds a fake separator symbol § (you may use any character that does not occur in your data). Because the lookahead (?=I3) does not consume the closing I3, the next match starts at that same I3, which is how adjacent sections can share it. The second gsub then removes the unnecessary trailing part of the string, and strsplit does the final part: it splits on § and returns the expected results.
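The steps can be traced on a shorter string to see what each call produces. This is only an illustrative sketch: the string x is a made-up, shortened input, and the ASCII separator | is used in place of § (an assumption that | never occurs in the data), which is why strsplit is called with fixed = TRUE.

```r
# Hypothetical shortened input with three I3 tokens (two I3...I3 sections)
x <- "1-FA-1-I3-1-FA-1-I3-1-I1-1-I3-1-EX"

# Step 1: keep each section, append a closing I3 and the separator
step1 <- gsub(".*?(I3.*?)(?=I3)", "\\1I3|", x, perl = TRUE)
print(step1)
# "I3-1-FA-1-I3|I3-1-I1-1-I3|I3-1-EX"

# Step 2: remove whatever follows the last separator
step2 <- gsub("[^|]*$", "", step1)
print(step2)
# "I3-1-FA-1-I3|I3-1-I1-1-I3|"

# Step 3: split on the literal separator
parts <- strsplit(step2, "|", fixed = TRUE)[[1]]
print(parts)
# "I3-1-FA-1-I3" "I3-1-I1-1-I3"
```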
See IDEONE demo
Output for the full data string:
[1] "I3-1-FA-1-I1-1-I2-1-TR-1-I1-1-I2-1-FA-1-I3"
[2] "I3-1-I1-1-FA-1-FA-1-NR-1-I3"
[3] "I3-1-I2-1-TR-1-I1-1-I2-1-I1-1-I2-1-FA-1-I2-1-I1-1-I3"
[4] "I3-1-FA-1-QU-1-I1-1-I2-1-I2-1-I2-1-NR-1-I2-1-I2-1-NR-1-I1-1-I2-1-I1-1-NR-1-I3"
[5] "I3-1-QU-1-I2-1-I3"
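As an alternative that is not part of the answer above, the overlapping sections can also be read directly from a capture group placed inside a lookahead, without a separator symbol. The sketch below assumes a made-up shortened input x2 and relies on the capture.start and capture.length attributes that gregexpr returns when perl = TRUE. It does not handle the case where nothing matches.

```r
# Hypothetical shortened input with three I3 tokens
x2 <- "1-FA-1-I3-1-FA-1-I3-1-I1-1-I3-1-EX"

# The lookahead itself matches zero characters, so the search can restart
# at the I3 that closed the previous section; the inner group captures
# the text from one I3 to the next.
m <- gregexpr("(?=(I3.*?I3))", x2, perl = TRUE)[[1]]

# Start position and length of capture group 1 for every match
starts <- attr(m, "capture.start")[, 1]
lens <- attr(m, "capture.length")[, 1]

sections <- substring(x2, starts, starts + lens - 1)
print(sections)
# "I3-1-FA-1-I3" "I3-1-I1-1-I3"
```

Applied to the full data string, the same calls should give the five sections listed above, since every I3 except the last one starts a section.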