Restricting a regular expression to word boundaries

Question

I have some text:

"Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

And I have a regex that is generated from user input:

@".*ip.*"

This matches the whole line, as you would expect, so I wrapped the expression in word boundaries:

@"\b.*ip.*\b"

Because the quantifiers are greedy, this still matches essentially the whole text. So I tried making the repetition lazy:

@"\b.*?ip.*?\b"

This is better, but it matches:

Lorem ipsum
dolor sit amet, consectetuer adipiscing

How can I extend the original @".*ip.*" pattern so that it matches only whole words and captures the following?

ipsum
adipiscing
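
The three attempts above can be reproduced with a short script. This is only a sketch: it uses Python's `re` module rather than the .NET engine implied by the `@"..."` literals, on the assumption that both engines treat these simple patterns the same way for this ASCII input. The variable names are illustrative.

```python
import re

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

# Attempt 1: greedy, no boundaries. One match spanning the whole line.
greedy = re.findall(r".*ip.*", text)

# Attempt 2: greedy with \b. The match still starts at the first word and
# runs to the last word boundary, so only the final "." is dropped.
bounded = re.findall(r"\b.*ip.*\b", text)

# Attempt 3: lazy with \b. Each match starts at the first boundary the
# engine can use, so it drags in the text before each "ip" word.
lazy = re.findall(r"\b.*?ip.*?\b", text)

print(greedy)
print(bounded)
print(lazy)
```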

This regex tester may be useful for answering the question.

Answer 1

Why not just use \w* instead of .*:

@"\w*ip\w*"

This will also match _ and 0-9, since they are included in \w. If you want to exclude them, you can use [a-zA-Z]* explicitly instead of \w.
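
As a quick check of this suggestion, here is a minimal sketch in Python's `re` module (an assumption, since the question uses .NET-style literals; the patterns themselves are unchanged). The `sample` string is made up to show the difference between `\w` and `[a-zA-Z]`.

```python
import re

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

# \w* cannot cross spaces or punctuation, so each match stays inside one word.
words = re.findall(r"\w*ip\w*", text)
print(words)  # ['ipsum', 'adipiscing']

# Hypothetical input containing digits and underscores.
sample = "zip_code ip4 tulip"

# \w includes digits and the underscore.
with_w = re.findall(r"\w*ip\w*", sample)
print(with_w)  # ['zip_code', 'ip4', 'tulip']

# The explicit letter class stops at the first non-letter.
letters_only = re.findall(r"[a-zA-Z]*ip[a-zA-Z]*", sample)
print(letters_only)  # ['zip', 'ip', 'tulip']
```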

Answer 2

You were already close to the solution. Just replace the dot (any character) with the non-whitespace escape sequence \S:

@"\b\S*?ip\S*?\b"
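
A minimal sketch of this pattern, again using Python's `re` module as a stand-in for the .NET engine (an assumption). The second input is invented to show one consequence of combining `\S` with lazy quantifiers: the leading part may cross a hyphen to reach "ip", while the trailing part stops at the first word boundary after it.

```python
import re

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

# \S cannot match a space, so a match can no longer start in an earlier word.
non_space = re.findall(r"\b\S*?ip\S*?\b", text)
print(non_space)  # ['ipsum', 'adipiscing']

# Hypothetical hyphenated input: the prefix is kept, the suffix is cut.
hyphenated = re.findall(r"\b\S*?ip\S*?\b", "anti-slip grip-tape")
print(hyphenated)  # ['anti-slip', 'grip']
```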

Answer 3

I think some words can contain hyphens, so it would be better to use the pattern [\w-]*ip[\w-]*
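
A short sketch of the hyphen-aware pattern, written with Python's `re` module (an assumption; the character class reads the same in .NET). The hyphenated sample string is hypothetical.

```python
import re

# The trailing "-" inside the class is a literal hyphen, not a range.
pattern = re.compile(r"[\w-]*ip[\w-]*")

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."
plain = pattern.findall(text)
print(plain)  # ['ipsum', 'adipiscing']

# Hyphenated words are returned whole.
compound = pattern.findall("anti-slip grip-tape tulip")
print(compound)  # ['anti-slip', 'grip-tape', 'tulip']
```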

3 answers

Answer 1
5, accepted, 2013-02-19 13:52:04

Answer 2
1, 2013-02-19 13:52:56

Answer 3
1, 2013-02-19 14:01:38
