Regular expression, omit a few words
How can I write a regular expression to match something like this:
he is capable of
he is not capable of
etc.
General pattern: "he is" + up to a few words + "of"
I know how to solve it without a regular expression, but maybe there is an easier way.
A trivial solution would be to use
\bhe is(?: \w+){1,3} of\b
which allows between one and three "words" between "he is" and "of".
\w+ means "a sequence of letters/digits/underscores", so it doesn't exactly match a word, but you can substitute your own word-matching regex if that one is too unspecific.
The \b word boundary anchors make sure that only the whole words "he" and "of" match, and not "the" or "often".
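To see the limits of this pattern in practice, here is a minimal sketch in Python. The choice of Python's re module is an assumption (the question names no language), and the helper name and sample sentences are made up for illustration.

```python
import re

# Pattern from the answer: "he is", then one to three "words", then "of".
PATTERN = re.compile(r"\bhe is(?: \w+){1,3} of\b")

def matches(text: str) -> bool:
    """Return True if the pattern occurs anywhere in text (hypothetical helper)."""
    return PATTERN.search(text) is not None

samples = [
    "he is capable of",      # one word in between
    "he is not capable of",  # two words in between
    "he is of",              # no word in between
    "he is a b c d of",      # four words in between
    "she is capable of",     # "he" is only part of "she"
    "he is capable often",   # "of" is only part of "often"
]

for sample in samples:
    print(f"{sample!r}: {matches(sample)}")
```

Only the first two samples should match. Changing the {1,3} quantifier adjusts how many words are allowed in between.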
If you really just want to check for "capable of" or "not capable of":
"he is\\s+(not\\s+)?(capable\\s+)?of"
I'd go with this:
\bhe is\b.*\bof\b
I'm using \b a lot to make sure I'm matching whole words. E.g. this won't match "She is capable of", nor "he isa wizard capable of".
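A small Python sketch of how this pattern behaves. Python is assumed here, and the helper name and sample sentences are illustrative only. Note that .* is greedy and puts no limit on what sits between "he is" and "of", so the match runs up to the last "of" on the line.

```python
import re

# Pattern from the answer: "he is", anything at all, then the word "of".
ANYTHING_BETWEEN = re.compile(r"\bhe is\b.*\bof\b")

def find(text: str):
    """Return the matched substring, or None if there is no match (hypothetical helper)."""
    m = ANYTHING_BETWEEN.search(text)
    return m.group(0) if m else None

print(find("he is capable of"))              # matches the whole phrase
print(find("She is capable of"))             # None: "he" is inside "She"
print(find("he isa wizard capable of"))      # None: no word boundary after "is"
print(find("he is fond of tea and of cake")) # greedy: runs to the last "of"
```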
This is a little more complicated:
\bhe is\b( +\w+ *)*\bof\b
Here we have the ( +\w+ *)* in the middle. This makes sure that only words, one after another and separated by spaces, are matched between "he is" and "of".
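The effect of restricting the middle part to space-separated words can be sketched in Python as follows. Python is an assumption, and the helper name and sample sentences are invented for illustration.

```python
import re

# Pattern from the answer: only space-separated words may sit
# between "he is" and "of".
WORDS_BETWEEN = re.compile(r"\bhe is\b( +\w+ *)*\bof\b")

def find(text: str):
    """Return the matched substring, or None if there is no match (hypothetical helper)."""
    m = WORDS_BETWEEN.search(text)
    return m.group(0) if m else None

print(find("he is capable of"))            # one word in between
print(find("he is not at all capable of")) # any number of words is accepted
print(find("he is   capable   of"))        # runs of spaces are accepted
print(find("he is, frankly, capable of"))  # None: the commas are not words
print(find("he is of"))                    # None: no word in between
```

Unlike the {1,3} variant, this pattern puts no upper bound on the number of words in between.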