Word replacement using regex if the target word is not a part of another word
I am working on a regex that replaces a word only when it stands alone and is not part of another word. Take the word "thing" as an example. In "something", the substring "thing" should be ignored, but if the word "thing" is preceded by a special character such as a dot or a bracket, I want it captured. I also want the word captured if it is followed by a bracket, dot, comma, or any other non-alphanumeric character.
In the string
Something is a thing, and one more thingy and (thing and more thing
In the sentence above, the 3 words to be marked for replacement are the standalone occurrences of "thing": the one after "a", the one after "(", and the last word. I used the following regex:
\bthing\b
I tried the above sentence on regex101.com, and with this regex only the first word got highlighted. I understand that my regex would not capture "(thing", but I thought it would capture the last word in the sentence, so that there would be at least 2 occurrences.
Can someone please help me modify my regex to capture all 3 occurrences in the sentence above?
You were likely using the JavaScript regex flavor, which returns after the first match is found unless the global flag is set. If you add the modifier g in the second box on regex101.com, it will find all matches.
This site is better for C# regex testing: http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
If you code this in C# and use the Matches method, it should return every match rather than only the first:
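using System;
using System.Text.RegularExpressions;
// \b matches at the boundary between a word character (\w) and a non-word character.
Regex regex = new Regex(@"\bthing\b");
foreach (Match match in regex.Matches(
    "Something is a thing, and one more thingy and (thing and more thing"))
{
    // Prints "thing" three times: after "a ", after "(", and at the end.
    Console.WriteLine(match.Value);
}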
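Since the end goal in the question is replacement rather than just matching, here is a minimal sketch of how that could look with Regex.Replace. The class name ReplaceDemo, the helper ReplaceStandalone, and the replacement text "item" are made up for illustration and are not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Hypothetical helper (not from the original post): replaces every
    // standalone "thing" with the given replacement text.
    public static string ReplaceStandalone(string input, string replacement) =>
        Regex.Replace(input, @"\bthing\b", replacement);

    public static void Main()
    {
        string input = "Something is a thing, and one more thingy and (thing and more thing";

        // Regex.Match stops at the first hit, much like a JavaScript regex without "g".
        Console.WriteLine(Regex.Match(input, @"\bthing\b").Index);   // 15

        // Regex.Replace rewrites every match, not just the first one.
        Console.WriteLine(ReplaceStandalone(input, "item"));
        // Something is a item, and one more thingy and (item and more item
    }
}
```

"Something" and "thingy" are left alone because there is no word boundary on both sides of "thing" there, while "(thing" is replaced because "(" is a non-word character.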
A shorthand for the alphanumeric class [0-9A-Za-z] is [^\W_] (written [^\\W_] inside a regular C# string literal).
Using a lookbehind and a lookahead, you'd get
(?<![^\\W_])thing(?![^\\W_])
Expanded (free-spacing mode)
(?<! [^\W_] )   # Not alphanum behind
thing           # 'thing'
(?! [^\W_] )    # Not alphanum ahead
It matches the three highlighted words (the "thing" after "a", the one after "(", and the last word):
Something is a thing, and one more thingy and (thing and more thing
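On the sentence from the question both patterns find the same three words, so the practical difference only shows up with underscores: \b treats "_" as a word character, whereas the lookaround version treats it as a separator. The sketch below illustrates this on a made-up input; the class name BoundaryDemo, the helper MatchIndexes, and the sample string are assumptions for illustration only. Note also that \w is Unicode-aware by default in .NET, so [^\W_] there covers more than just [0-9A-Za-z].

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    public const string WordBoundary = @"\bthing\b";
    public const string Lookaround = @"(?<![^\W_])thing(?![^\W_])";

    // Hypothetical helper: start index of every match of pattern in input.
    public static int[] MatchIndexes(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Index).ToArray();

    public static void Main()
    {
        string input = "thing_one, (thing) and some_thing or something";

        // \b sees "_" as a word character, so only "(thing)" qualifies.
        Console.WriteLine(string.Join(", ", MatchIndexes(WordBoundary, input)));  // 12

        // The lookarounds only reject letters and digits, so "_" counts as a separator.
        Console.WriteLine(string.Join(", ", MatchIndexes(Lookaround, input)));    // 0, 12, 28

        Console.WriteLine(Regex.Replace(input, Lookaround, "item"));
        // item_one, (item) and some_item or something
    }
}
```

Which behavior is right depends on whether identifiers like some_thing should count as containing the standalone word; if they should not, plain \bthing\b is the simpler choice.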