Limit Regex to word boundaries
I have some text:
"Lorem ipsum dolor sit amet, consectetuer adipiscing elit."
And I have a regex that is generated from user input:
@".*ip.*"
This matches the whole line, as you would expect, so I wrapped the expression in word boundaries:
@"\b.*ip.*\b"
Because the quantifiers are greedy by default, this still matches the whole text. So I tried making the repetition lazy:
@"\b.*?ip.*?\b"
This is better, but it matches:
Lorem ipsum
dolor sit amet, consectetuer adipiscing
How can I extend the original @".*ip.*" pattern so that it lazily matches whole words and captures only the following?
ipsum
adipiscing
This regex tester may be useful for answering the question.
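For reference, the three attempts above can be reproduced with a short script. This is a minimal sketch in Python's re module rather than the original environment; the patterns are unchanged, and the variable names are made up for illustration.

```python
import re

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

# Greedy, no boundaries: the whole line is one match.
greedy = re.findall(r".*ip.*", text)

# Greedy with \b on both ends: still one match, which only
# drops the trailing period (the last \b sits after "elit").
bounded = re.findall(r"\b.*ip.*\b", text)

# Lazy with \b: each match still starts wherever the previous one
# ended, so it drags along every word in front of the one with "ip".
lazy = re.findall(r"\b.*?ip.*?\b", text)

print(greedy)
print(bounded)
print(lazy)
```

The lazy version stops early on the right-hand side, but `.` can still cross spaces on the left, which is why whole phrases are returned instead of single words.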
Why not just use \w* instead of .*?
@"\w*ip\w*"
This will also match _ and 0-9, as they are included in \w. If you want to exclude them, you can use [a-zA-Z]* explicitly instead of \w there.
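A quick way to see the difference between the two character classes is sketched below, again using Python's re module as a stand-in. The second sample string is hypothetical and only there to show how digits and underscores are treated.

```python
import re

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

# \w cannot cross spaces or punctuation, so each match stays inside one word.
words = re.findall(r"\w*ip\w*", text)

# Hypothetical sample containing digits and an underscore.
sample = "clip_01 zip9 tulips"

with_word_class = re.findall(r"\w*ip\w*", sample)
letters_only = re.findall(r"[a-zA-Z]*ip[a-zA-Z]*", sample)

print(words)            # ['ipsum', 'adipiscing']
print(with_word_class)  # ['clip_01', 'zip9', 'tulips']
print(letters_only)     # ['clip', 'zip', 'tulips']
```

No explicit \b is needed here: because the quantifiers are greedy and restricted to word characters, each match naturally extends to the edges of the word.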
You were already close to the solution. Just replace the dot (any char) by the non-whitespace escape sequence \S:
@"\b\S*?ip\S*?\b"
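As a rough check of this pattern, here is a small sketch in Python's re module. The second input is a hypothetical string, added to show one side effect: \S also accepts punctuation, so characters such as "/" can end up inside a match.

```python
import re

text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit."

# \S cannot cross whitespace, so a match can no longer span several words.
matches = re.findall(r"\b\S*?ip\S*?\b", text)

# Hypothetical input: the slash is non-whitespace, so it is swallowed
# by the leading \S*? and becomes part of the match.
with_punctuation = re.findall(r"\b\S*?ip\S*?\b", "path/zip file")

print(matches)           # ['ipsum', 'adipiscing']
print(with_punctuation)  # ['path/zip']
```

If punctuation should never be part of a result, a narrower class such as \w is probably the safer choice.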
I think some words can contain hyphens, so it would be better to use the pattern [\w-]*ip[\w-]*
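The effect of allowing hyphens can be sketched as follows, using Python's re module for illustration. The sample string is hypothetical.

```python
import re

# Hypothetical sample with hyphenated words.
sample = "zip-code non-hip"

# \w alone stops at the hyphen, so only part of each word is returned.
plain = re.findall(r"\w*ip\w*", sample)

# Adding "-" to the class keeps hyphenated words in one piece.
hyphenated = re.findall(r"[\w-]*ip[\w-]*", sample)

print(plain)       # ['zip', 'hip']
print(hyphenated)  # ['zip-code', 'non-hip']
```

Placing the hyphen last inside the brackets keeps it a literal character rather than the start of a range.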