
Regular expression, omit few words

How can I write a regular expression to match something like this:

he is capable of

he is not capable of

etc.

General pattern: "he is" + up to a few words + "of"

I know how to solve it without a regular expression, but maybe there is an easier way.

A trivial solution would be to use

\bhe is(?: \w+){1,3} of\b

which allows between one and three "words" between "he is" and "of".

\w+ means "a sequence of letters/digits/underscores", so it doesn't exactly match a word, but you can substitute your own word-matching regex if that one is too unspecific.

The \b word boundary anchors ensure that only "he" and "of" are matched, and not "the" or "often".
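
To make the behaviour of this pattern concrete, here is a minimal Python sketch. The names `HE_IS_OF` and `has_he_is_of` are made up for illustration and are not part of the original answer. The thread does not say which regex flavour is being used, so Python's `re` module is an assumption.

```python
import re

# Pattern from the answer: "he is", then one to three space-separated
# \w+ tokens, then " of", with word boundaries at both ends.
HE_IS_OF = re.compile(r"\bhe is(?: \w+){1,3} of\b")


def has_he_is_of(text: str) -> bool:
    """Return True if the pattern occurs anywhere in text."""
    return HE_IS_OF.search(text) is not None


samples = [
    "he is capable of",             # one word in between
    "he is not capable of",         # two words in between
    "he is of",                     # no word in between
    "she is capable of",            # "he" is only part of "she"
    "he is capable often",          # "of" is only part of "often"
    "he is one two three four of",  # four words in between
]
for sample in samples:
    print(f"{sample!r}: {has_he_is_of(sample)}")
```

Only the first two samples match. Changing the `{1,3}` quantifier adjusts how many words are allowed in between.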

If you really just want to check for "capable" or "not capable":

"he is\\s+(not\\s+)?(capable\\s+)?of"

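The quoted pattern with doubled backslashes looks like a string literal from a language such as Java. That is an assumption, since the answer does not name a language. As a rough sketch, the same pattern in Python can be written with a raw string and single backslashes. The names `CAPABLE_OF` and `mentions_capability` are hypothetical.

```python
import re

# Same pattern as above, written as a Python raw string.
# Both "not" and "capable" are optional groups.
CAPABLE_OF = re.compile(r"he is\s+(not\s+)?(capable\s+)?of")


def mentions_capability(text: str) -> bool:
    """Return True if the pattern occurs anywhere in text."""
    return CAPABLE_OF.search(text) is not None


for sample in ["he is capable of", "he is not capable of",
               "he is very capable of", "he is of"]:
    print(f"{sample!r}: {mentions_capability(sample)}")
```

Because both groups are optional, "he is of" also matches. The pattern has no \b anchors, so it also finds "he is capable of" inside "she is capable of".
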
I'd go with this:

\bhe is\b.*\bof\b

I'm using \b a lot to make sure I'm matching whole words. E.g. this won't match "She is capable of", nor "he isa wizard capable of".
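
A short Python sketch of this pattern, assuming Python's `re` module (the answer does not specify a flavour). The names `LOOSE` and `loose_match` are made up for illustration.

```python
import re

# "he is" and "of" as whole words, with anything at all in between.
LOOSE = re.compile(r"\bhe is\b.*\bof\b")


def loose_match(text: str) -> bool:
    """Return True if the pattern occurs anywhere in text."""
    return LOOSE.search(text) is not None


for sample in ["he is capable of",
               "She is capable of",
               "he isa wizard capable of",
               "he is here, and I think of you"]:
    print(f"{sample!r}: {loose_match(sample)}")
```

Note that `.*` puts no limit on what sits between the two parts, so the last sample matches as well. This is looser than the "up to a few words" in the question.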

This is a little more complicated:

\bhe is\b( +\w+ *)*\bof\b

Here we have the ( +\w+ *)* in the middle. This makes sure that only a sequence of words, one after another, is matched between "he is" and "of".
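
The effect of the ( +\w+ *)* group can be checked with a small Python sketch. Python's `re` module is again an assumption, and `WORDS_ONLY` and `words_only_match` are hypothetical names.

```python
import re

# Between "he is" and "of", allow only repetitions of:
# one or more spaces, a \w+ token, optional trailing spaces.
WORDS_ONLY = re.compile(r"\bhe is\b( +\w+ *)*\bof\b")


def words_only_match(text: str) -> bool:
    """Return True if the pattern occurs anywhere in text."""
    return WORDS_ONLY.search(text) is not None


for sample in ["he is capable of",
               "he is not capable of",
               "he is capable, of",   # the comma is not part of a word
               "he isa wizard capable of"]:
    print(f"{sample!r}: {words_only_match(sample)}")
```

Unlike the `.*` version, punctuation between the two parts prevents a match. The number of words is still unbounded because the group is repeated with `*`.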

You can play with the demo here.
