
Regular expression, omit few words

How can I write a regular expression to match something like this:

he is capable of

he is not capable of

etc.

The general pattern is "he is" + up to a few words + "of".

I know how to solve it without a regular expression, but maybe there is an easier way.

A trivial solution would be to use

\bhe is(?: \w+){1,3} of\b

which allows between one and three "words" between "he is" and "of".

\w+ means "a sequence of letters/digits/underscores", so it doesn't exactly match a word, but you can substitute your own word-matching regex if that one is not specific enough.

The \b word boundary anchors make sure that only the whole words "he" and "of" are matched, and not the "he" in "the" or the "of" in "often".
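
As a quick check of the pattern above, here is a minimal Python sketch. The sample sentences and the variable names are my own assumptions, not part of the question; re.search is used so the pattern may match anywhere in the string.

```python
import re

# Between one and three "words" are allowed between "he is" and "of".
pattern = re.compile(r"\bhe is(?: \w+){1,3} of\b")

# Hypothetical sample sentences, chosen only for illustration.
samples = [
    "he is capable of",
    "he is not capable of",
    "he is of",                      # no word in between
    "she is capable of",             # "he" is not a whole word here
    "he is one two three four of",   # four words in between
]

results = {s: bool(pattern.search(s)) for s in samples}

for sentence, matched in results.items():
    print(f"{sentence!r}: {matched}")
```

Only the first two samples match. Changing {1,3} to another range is the simplest way to adjust how many words may be skipped.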

If you really only want to check for "capable" or "not capable":

"he is\\s+(not\\s+)?(capable\\s+)?of"

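A rough Python sketch of this variant follows. It assumes the doubled backslashes in the pattern above are string-literal escapes (as in Java), so the raw string below uses single backslashes; the sample sentences are made up for illustration.

```python
import re

# "not" and "capable" are each optional; nothing else may appear in between.
pattern = re.compile(r"he is\s+(not\s+)?(capable\s+)?of")

# Hypothetical sample sentences.
samples = [
    "he is capable of",
    "he is not capable of",
    "he is of",               # both optional groups skipped
    "he is very capable of",  # "very" is not allowed by the pattern
    "she is capable of",      # no \b, so the "he" inside "she" matches
]

results = {s: bool(pattern.search(s)) for s in samples}

for sentence, matched in results.items():
    print(f"{sentence!r}: {matched}")
```

Note that, as written, this pattern has no word boundaries, so it also matches inside "she is capable of"; adding \b at both ends would likely be wanted in practice.
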
I'd go with this:

\bhe is\b.*\bof\b

I'm using \b a lot to make sure I'm matching whole words. For example, this won't match "She is capable of", nor "he isa wizard capable of".
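
To see what the word boundaries do here, a small Python sketch (sample strings are my own, apart from the two mentioned above) can be run against the pattern. It also shows that .* puts no limit on what sits between "he is" and "of".

```python
import re

# Greedy .* accepts anything (except a newline) between "he is" and "of".
pattern = re.compile(r"\bhe is\b.*\bof\b")

samples = [
    "he is capable of",
    "She is capable of",                     # "he" is part of "She"
    "he isa wizard capable of",              # "is" is part of "isa"
    "he is tired and she thinks a lot of",   # long gap, still matches
]

results = {s: bool(pattern.search(s)) for s in samples}

for sentence, matched in results.items():
    print(f"{sentence!r}: {matched}")
```

The last sample matches even though many words separate "he is" from "of", which may or may not be what "up to a few words" is meant to allow.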

This is a little more complicated:

\bhe is\b( +\w+ *)*\bof\b

Here we have ( +\w+ *)* in the middle. This makes sure that only a run of space-separated words can appear between "he is" and "of".
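
A short Python sketch of this stricter version is below. The sample strings are hypothetical and only meant to show that punctuation between the words now prevents a match, while the number of words is still unlimited.

```python
import re

# Only space-separated "words" (\w+) may appear between "he is" and "of".
pattern = re.compile(r"\bhe is\b( +\w+ *)*\bof\b")

samples = [
    "he is capable of",
    "he is not capable of",
    "he is, frankly, capable of",        # commas are not matched by \w or a space
    "he is one two three four five of",  # no upper limit on the word count
]

results = {s: bool(pattern.search(s)) for s in samples}

for sentence, matched in results.items():
    print(f"{sentence!r}: {matched}")
```

If an upper limit on the number of words is needed, replacing the trailing * of the group with a bounded quantifier such as {0,3} should work.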

