Regex: Replacing all spaces between two characters
Consider the following string:
This is an example: this is another one, and this is yet another, and other, and so on.
I want to replace all space characters between : and , so that it looks like this:
This is an example:_this_is_another_one, and this is yet another, and other, and so on.
What I've tried so far:
(?<=:)\\s+(?=[^,]*,) (only matches the first space)
:\\s+(?=[^:,]*,) (same as above)
\\s+(?=[^:,]*,) (matches too much: This is an example:_this_is_another_one,_and_this_is_yet_another,_and_other, and so on)
Update: there is a simple way to replace anything in between arbitrary strings in R using stringr::str_replace_all with an anonymous function as the replacement argument:
Generic stringr approach
library(stringr)
# left - left boundary
# right - right boundary
# x - input
# what - regex pattern to search for inside matches
# repl - replacement text for the in-pattern matches
ReplacePatternBetweenTwoStrings <- function(left, right, x, what, repl) {
  # Escape regex metacharacters so the boundaries are matched literally
  left <- gsub("([][{}()+*^${|\\\\?.])", "\\\\\\1", left)
  right <- gsub("([][{}()+*^${|\\\\?.])", "\\\\\\1", right)
  str_replace_all(x,
    paste0("(?s)(?<=", left, ").*?(?=", right, ")"),
    function(z) gsub(what, repl, z)
  )
}
x <- "This is an example: this is another one, and this is yet another, and other, and so on."
ReplacePatternBetweenTwoStrings(":", ",", x, "\\s+", "_")
## => [1] "This is an example:_this_is_another_one, and this is yet another, and other, and so on."
See this R demo.
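Because the helper escapes its boundary arguments, it should also accept delimiters that are regex metacharacters. The sketch below repeats the function so that it runs on its own; the input string `y` and the choice of square brackets as boundaries are made up for illustration and are not part of the original question.

```r
library(stringr)

# Same helper as above, repeated so this snippet is self-contained.
ReplacePatternBetweenTwoStrings <- function(left, right, x, what, repl) {
  # Escape regex metacharacters so the boundaries are matched literally
  left  <- gsub("([][{}()+*^${|\\\\?.])", "\\\\\\1", left)
  right <- gsub("([][{}()+*^${|\\\\?.])", "\\\\\\1", right)
  str_replace_all(
    x,
    paste0("(?s)(?<=", left, ").*?(?=", right, ")"),
    function(z) gsub(what, repl, z)
  )
}

# Hypothetical input: "[" and "]" are the boundaries here
y <- "keep this [change this part] keep this [and this] too"
res <- ReplacePatternBetweenTwoStrings("[", "]", y, "\\s+", "_")
print(res)
# Expected: "keep this [change_this_part] keep this [and_this] too"
```

The lookbehind and lookahead keep the delimiters out of the match, so only the text between them is passed to the anonymous function.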
Replacing all whitespaces between the closest : and ,
This is a simple special case of the above: :[^:,]+, matches a :, then one or more characters other than : and , (the delimiter characters), and then a ,. The whitespace is then replaced with underscores inside the matches only:
stringr::str_replace_all(x, ":[^:,]+,", function(z) gsub("\\s+", "_", z))
See the regex demo.
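If stringr is not available, the same idea (match the whole :...., chunk, then edit only inside the match) can be sketched in base R with gregexpr and the regmatches<- replacement function. This is an alternative under the same assumptions about the input, not part of the original answer.

```r
x <- "This is an example: this is another one, and this is yet another, and other, and so on."

# Locate every chunk that starts with ":" and ends at the nearest ","
m <- gregexpr(":[^:,]+,", x)

# Replace whitespace runs with "_" inside each matched chunk only
regmatches(x, m) <- lapply(regmatches(x, m), function(z) gsub("\\s+", "_", z))

print(x)
# Expected: "This is an example:_this_is_another_one, and this is yet another, and other, and so on."
```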
Original answer (scales rather poorly)
You may use the following regex:
(?:\G(?!^)|:)[^,]*?\K\s(?=[^,]*,)
Replace with _. See the regex demo.
Details
(?:\G(?!^)|:) - the end of the previous match (\G(?!^)) or a colon
[^,]*? - any 0+ chars other than a comma, as few as possible
\K - match reset operator discarding the text matched so far
\s - a whitespace
(?=[^,]*,) - a positive lookahead check that makes sure there is a , after zero or more chars other than a comma.
re <- "(?:\\G(?!^)|:)[^,]*?\\K\\s(?=[^,]*,)"
x <- "This is an example: this is another one, and this is yet another, and other, and so on."
gsub(re, "_", x, perl=TRUE)
# => [1] "This is an example:_this_is_another_one, and this is yet another, and other, and so on."
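To see what the \G anchoring buys, it may help to run two of the attempts from the question next to the pattern above. This is only a comparison sketch; the variable names are made up for illustration.

```r
x <- "This is an example: this is another one, and this is yet another, and other, and so on."

# Attempt 1: the lookbehind requires ":" directly before the whitespace,
# so only the first space after the colon can match.
first_only <- gsub("(?<=:)\\s+(?=[^,]*,)", "_", x, perl = TRUE)

# Attempt 3: no left anchor at all, so every whitespace run that is
# followed by a comma (with no ":" or "," in between) matches.
too_many <- gsub("\\s+(?=[^:,]*,)", "_", x, perl = TRUE)

# \G-based pattern: each match must start at the colon or continue from
# the previous match, so matching stops at the first comma.
anchored <- gsub("(?:\\G(?!^)|:)[^,]*?\\K\\s(?=[^,]*,)", "_", x, perl = TRUE)

print(first_only)
print(too_many)
print(anchored)
```

The first call changes one space, the second also rewrites the later comma-separated segments, and only the third limits the replacement to the span between the colon and the first comma.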
Here is a slightly gross answer:
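txt <- "This is an example: this is another one, and this is yet"
# Mark the span between the last colon and the last comma with "$", then split on it
split_str <- unlist(strsplit(gsub("^(.*:)(.*)(,.*)", "\\1$\\2$\\3", txt), split = "$", fixed = TRUE))
# Replace spaces only in the middle piece and glue the pieces back together
paste0(split_str[1], gsub(" ", "_", split_str[2]), split_str[3])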