I have some text
"Lorem ipsum dolor sit amet, consectetuer adipiscing elit."
I also have a regex that is generated from user input:
@".*ip.*"
This matches the whole line, as you would expect, so I wrap this expression with word boundaries.
@"\b.*ip.*\b"
Because the quantifiers are greedy, this still matches essentially the whole text, so I tried making the repetition lazy:
@"\b.*?ip.*?\b"
This is better, but it matches:
Lorem ipsum
dolor sit amet, consectetuer adipiscing
How can I extend the original @".*ip.*" pattern so that it lazily matches whole words and captures only the following?
ipsum
adipiscing
This regex tester may be useful for answering the question.
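For anyone who wants to reproduce the three attempts above, here is a minimal C# sketch. The class name is only for illustration, and the brackets in the output are there to make leading or trailing whitespace in a match visible.

```csharp
using System;
using System.Text.RegularExpressions;

class Attempts
{
    static void Main()
    {
        string text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit.";

        // The three patterns tried in the question, in order.
        string[] patterns = { @".*ip.*", @"\b.*ip.*\b", @"\b.*?ip.*?\b" };

        foreach (string pattern in patterns)
        {
            Console.WriteLine(pattern);
            foreach (Match m in Regex.Matches(text, pattern))
            {
                // Brackets make surrounding whitespace in the match visible.
                Console.WriteLine($"  [{m.Value}]");
            }
        }
    }
}
```

The underlying issue is that the dot also matches spaces, so even a lazy .*? has to run across several words until it reaches the next "ip".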
Why not just use \w* instead of .*? in the pattern:
@"\w*ip\w*"
Note that this will also match _ and 0-9, as they are included in \w. If you want to exclude them, you can use [a-zA-Z]* explicitly instead of \w* there.
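A minimal C# sketch of both variants, assuming the user-supplied fragment is the literal "ip". The second input string is made up purely to show the difference between \w and [a-zA-Z].

```csharp
using System;
using System.Text.RegularExpressions;

class WordMatches
{
    static void Main()
    {
        string text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit.";

        // \w* only extends the match over adjacent word characters,
        // so each match stays inside a single word.
        foreach (Match m in Regex.Matches(text, @"\w*ip\w*"))
            Console.WriteLine(m.Value);   // ipsum, then adipiscing

        // Made-up input: with ASCII letters only, digits and underscores end the match.
        foreach (Match m in Regex.Matches("zip_code ship2", @"[a-zA-Z]*ip[a-zA-Z]*"))
            Console.WriteLine(m.Value);   // zip, then ship
    }
}
```

One thing to keep in mind: in .NET, \w also covers non-ASCII letters by default, while [a-zA-Z] does not, so the explicit class will cut words with accented characters short.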
You were already close to the solution. Just replace the dot (any character) with the non-whitespace escape sequence \S:
@"\b\S*?ip\S*?\b"
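Here is a small C# sketch of this pattern against the text from the question. The second input, "re-ship", is a made-up example to show that \S, unlike \w, can run across punctuation inside a token.

```csharp
using System;
using System.Text.RegularExpressions;

class NonWhitespaceMatches
{
    static void Main()
    {
        string text = "Lorem ipsum dolor sit amet, consectetuer adipiscing elit.";
        string pattern = @"\b\S*?ip\S*?\b";

        // \S cannot cross a space, so each match stays within one token.
        foreach (Match m in Regex.Matches(text, pattern))
            Console.WriteLine(m.Value);   // ipsum, then adipiscing

        // Made-up input: the leading \S*? is allowed to consume the hyphen.
        Console.WriteLine(Regex.Match("re-ship", pattern).Value);   // re-ship
    }
}
```

Because the trailing \S*? is lazy, it stops at the first word boundary after "ip", so punctuation that follows a word, such as a comma, is not included in the match.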
I think some words can contain hyphens, so it is better to use the pattern [\w-]*ip[\w-]*
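A quick C# sketch of the hyphen-aware pattern, using a made-up input string with hyphenated words:

```csharp
using System;
using System.Text.RegularExpressions;

class HyphenatedMatches
{
    static void Main()
    {
        // Made-up input containing hyphenated words.
        string text = "a tip-top re-ship";

        // The hyphen at the end of the character class is a literal hyphen.
        foreach (Match m in Regex.Matches(text, @"[\w-]*ip[\w-]*"))
            Console.WriteLine(m.Value);   // tip-top, then re-ship
    }
}
```

With plain \w*ip\w* the same input would give only "tip" and "ship", since the hyphen is not a word character.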