Regex to replace all occurrences of a single character within specific tokens
I would like to know whether a single set of regex search/replace patterns could be used to replace all occurrences of a specific character inside a string enclosed by two tokens.
For example, is it possible to replace all periods with spaces in the text between TOKEN1 and TOKEN2, as in the example below?
So that:
TOKEN1:Run.Spot.run:TOKEN2
is changed to:
TOKEN1:Run Spot run:TOKEN2
NOTE: The regular expression needs to be capable of replacing any number of periods within any text, not just the specific pattern above.
I ask this question more for my personal knowledge, as it is something I have wanted to do quite a few times in the past with various regex implementations. In this particular case, however, the regex would be in PHP.
I am not interested in PHP workarounds, as I know how to do that. I am trying to expand my knowledge of regex.
Thanks
One way to do this:
$pattern = '~(?:TOKEN1:|\G(?<!^))(?:[^:.]+|:(?!TOKEN2))*\K\.~';
$replacement = ' ';
$subject = 'TOKEN1:Run.Spot.run:TOKEN2';
$result = preg_replace($pattern, $replacement, $subject);
Pattern details:
~ # pattern delimiter
(?: # open a non capturing group
TOKEN1: # TOKEN1:
| # OR
\G(?<!^) # a contiguous match but not at the start of the string
) # close the non capturing group
(?: # open a non capturing group
[^:.]+ # all that is not the first character of :TOKEN2 or the searched character
| # OR
:(?!TOKEN2) # The first character of :TOKEN2 not followed by the other characters
)* # repeat the non capturing group zero or more times
\K # reset the match
\. # the searched character
~ # delimiter
The idea is to use \G to force each match either to start with TOKEN1: or to be contiguous with the previous match.
Notice: the default behavior is like an HTML tag (it stays open until it is closed). If :TOKEN2 is not found, all \. characters after TOKEN1: will be replaced.
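Both behaviors described above can be checked with a short script. This is only a sketch: the helper name replaceDotsInTokens and the sample strings are made up for illustration, and the pattern is the one from the answer.

```php
<?php
// Hypothetical helper wrapping the \G-based pattern from the answer.
function replaceDotsInTokens(string $subject): string
{
    $pattern = '~(?:TOKEN1:|\G(?<!^))(?:[^:.]+|:(?!TOKEN2))*\K\.~';
    return preg_replace($pattern, ' ', $subject);
}

// Periods outside the TOKEN1: ... :TOKEN2 span are left alone.
echo replaceDotsInTokens('a.b TOKEN1:Run.Spot.run:TOKEN2 c.d'), "\n";
// a.b TOKEN1:Run Spot run:TOKEN2 c.d

// Without a closing :TOKEN2, every period after TOKEN1: is replaced.
echo replaceDotsInTokens('a.b TOKEN1:Run.Spot.run c.d'), "\n";
// a.b TOKEN1:Run Spot run c d
```

The second call shows the "open tag" behavior: because nothing stops the chain of contiguous matches, the replacement runs to the end of the string.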
I think the best way is to write something like this:
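$result =
  preg_replace_callback(
    // PHP has no /g modifier; the preg_replace* functions already replace every match
    '/(TOKEN1:)([^:]+)(:TOKEN2)/',
    function ($matches) {
      // $matches[0] is the whole match; groups 1 to 3 are the three parts
      return $matches[1]
        . preg_replace('/[.]/', ' ', $matches[2])
        . $matches[3];
    },
    'TOKEN1:Run.Spot.run:TOKEN2'
  );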
(Disclaimer: not tested.)
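For comparison, here is a variant of the callback idea that can be run as-is. It is a sketch under a few assumptions not in the answer: the function name replaceDotsViaCallback is hypothetical, a lazy .*? replaces [^:]+ so that colons may appear between the tokens, and str_replace does the inner substitution since no regex is needed there.

```php
<?php
// Sketch: match each TOKEN1:...:TOKEN2 span lazily, then swap periods
// for spaces only in the captured middle part.
function replaceDotsViaCallback(string $subject): string
{
    return preg_replace_callback(
        '/(TOKEN1:)(.*?)(:TOKEN2)/s',
        function (array $m): string {
            // $m[1] and $m[3] are the tokens; $m[2] is the text between them.
            return $m[1] . str_replace('.', ' ', $m[2]) . $m[3];
        },
        $subject
    );
}

echo replaceDotsViaCallback('x.y TOKEN1:a.b:c.d:TOKEN2 and TOKEN1:e.f:TOKEN2 z.'), "\n";
// x.y TOKEN1:a b:c d:TOKEN2 and TOKEN1:e f:TOKEN2 z.
```

Because the pattern requires a closing :TOKEN2, a TOKEN1: that is never closed is left untouched here.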
At its simplest, you would need an escaped (\) period (since a period usually matches any character) as your pattern: \., and you would replace it with a space.
This will replace all instances of . with a space.
However, from your comment, you appear to be asking for a regex to replace all periods between word characters:
(?<=\w)\.(?=\w)
You would need a positive (zero-width, non-capturing) lookbehind for a word character: (?<=\w), your escaped period (\.), and a positive (zero-width, non-capturing) lookahead for a word character: (?=\w). Replacing this with a space would have the result you want.
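As a small illustration of the word-character version in PHP (the sample string is made up), only periods with a word character on both sides are replaced:

```php
<?php
// Replace only periods that sit directly between two word characters.
$result = preg_replace('/(?<=\w)\.(?=\w)/', ' ', 'Run.Spot.run. End.');
echo $result, "\n";
// Run Spot run. End.
```

The period after "run" is followed by a space and the final period by the end of the string, so neither satisfies the lookahead.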
If you want to replace periods only between tokens, you could prepend a positive lookbehind: (?<=TOKEN1:.+) and append a positive lookahead: (?=.+TOKEN2), so the complete regex would be:
(?<=TOKEN1:.+)(?<=\w)\.(?=\w)(?=.+TOKEN2)
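You may need to refine this if a period can occur immediately after the opening token and/or immediately before the closing token and you don't want to replace them.
Note that (?<=TOKEN1:.+) is an unbounded variable-length lookbehind. Engines such as .NET accept it, but PHP's PCRE does not, so this pattern will not compile with preg_replace.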
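To see how PHP reacts to the combined pattern, a small check like the following can be run. It is a sketch that assumes PHP's bundled PCRE, where a lookbehind containing an unbounded .+ is rejected at compile time; the variable names are illustrative.

```php
<?php
$pattern = '/(?<=TOKEN1:.+)(?<=\w)\.(?=\w)(?=.+TOKEN2)/';
$subject = 'TOKEN1:Run.Spot.run:TOKEN2';

// @ silences the compile warning; preg_replace() returns null on error.
$result = @preg_replace($pattern, ' ', $subject);

if ($result === null) {
    echo "pattern rejected by PCRE\n";
} else {
    echo $result, "\n";
}
```

In an engine that supports variable-length lookbehind the same pattern can work as described. In PHP, a \G-based pattern or a callback is the usual way to get the same effect.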