
Regex for matching a set of words

Is there a way to match a set of words in a sentence?

I need to check whether a sentence contains any of the words po, p.o, p.o. or box. It should not catch words such as post or sandbox.

po --> error

post --> success

box --> error

hippo --> success

Thanks in advance

Here is a function that returns whether a sentence contains any of the words "po", "p.o.", or "box" (uppercase variants are matched as well):

function containsPObox(sentence) {
    // match() returns an array of matches, or null when nothing matches
    var matches = sentence.match(/\bp\.?o\b\.?|\b(box)\b/gi);
    return matches !== null;
}

Regarding the regex: /\bp\.?o\b\.?|\b(box)\b/gi

Breaking it down...

\b -> word boundary (the position between a word character and a non-word character, or the start or end of the string)

p -> 'p'

\.? -> optional '.'

o -> 'o'

\b -> word boundary

\.? -> optional '.'

| -> "or"

\b -> word boundary

(box) -> 'box'

\b -> word boundary

/g -> global: find every match in the sentence instead of stopping at the first one

i -> case insensitive
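
To see the breakdown in action, here is a small sketch that runs the same pattern against a few sentences. The helper name findPoBoxWords and the sample sentences are assumptions made for illustration, not part of the answer above.

```javascript
// Same pattern as in the answer; the constant and helper names are hypothetical.
const PO_BOX = /\bp\.?o\b\.?|\b(box)\b/gi;

function findPoBoxWords(sentence) {
    // With the g flag, match() returns every match, or null when there is none.
    return sentence.match(PO_BOX) || [];
}

console.log(findPoBoxWords("Send it to P.O. Box 12")); // [ 'P.O.', 'Box' ]
console.log(findPoBoxWords("post it in the sandbox")); // []
console.log(findPoBoxWords("a hippo"));                // []
```

An empty array means the sentence is fine; anything else lists the offending words.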

This ought to do it:

/\b(p\.?o|box)\b/g
  • the first \b matches the beginning of a word
  • the ( . . . ) sets a capturing group
  • the p\.?o is the first pattern, which matches a "p" and an "o" with an optional period (".") after the "p"
  • the "|" says to match the first pattern or the second pattern
  • the box is the second pattern, which matches just the word "box"
  • the second \b matches the end of a word
  • g makes the pattern "global", so it finds every match in the string instead of stopping at the first one
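
For reference, the g flag is JavaScript's global flag and is unrelated to greedy quantifiers. A quick sketch of the difference, using a made-up sample string:

```javascript
// The sample text is an assumption chosen to contain several matches.
const text = "PO box or p.o box";

// Without g, match() reports only the first match (plus its capture groups).
const first = text.match(/\b(p\.?o|box)\b/i);
console.log(first[0]); // "PO"

// With g, match() returns every match in the string.
const all = text.match(/\b(p\.?o|box)\b/gi);
console.log(all); // [ 'PO', 'box', 'p.o', 'box' ]
```

If you only need a yes/no answer, the flag makes no difference to whether a match exists.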

If you would like it to be case insensitive, include the "i" parameter at the end of the pattern:

/\b(p\.?o|box)\b/gi  <--- right here

Edit: to simplify the pattern, I removed the \.? that came after the "o". Since that "." would be the last character in the pattern, there is no difference between matching "p.o" and "p.o.": if the character after the "o" is a period or a space, the pattern should match; if it is a letter, it shouldn't. The trailing "." is irrelevant to the check.
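
A short check of that claim, as a sketch: the sample strings below are assumptions picked to cover the cases from the question, with and without the trailing period.

```javascript
// No g flag here, so test() keeps no state between calls.
const pattern = /\b(p\.?o|box)\b/i;

const samples = ["po", "p.o", "p.o.", "P.O. Box 7", "post", "sandbox", "hippo"];
const results = samples.map((s) => pattern.test(s));

samples.forEach((s, i) => console.log(s, "->", results[i]));
// The first four print true, the last three print false.
```

"p.o." matches because the \b sits between the "o" and the final period, so the period never needs to be part of the pattern.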

Use \b to catch word boundaries.

The regex fragment:

\b(po|p\.o)\b

will only match if the sentence contains the word po or the word p.o

Like sh54's response, but without the | block.

\b(p\.?o\.?)\b

This will match po, p.o, and p.o. as whole words. Note that it does not match box on its own; add |box inside the group if you need that as well.
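
A quick sketch of how this last pattern behaves. The withBox variant is an assumption added here to show one way to cover "box" too; it is not taken from the answer.

```javascript
const pattern = /\b(p\.?o\.?)\b/i;

console.log(pattern.test("box"));            // false: "box" is not in the pattern
console.log("p.o. box".match(pattern)[0]);   // "p.o"  (the final "." is left out)
console.log("p.o.box".match(pattern)[0]);    // "p.o." (a word character follows the ".")

// Hypothetical variant that also accepts the word "box".
const withBox = /\b(p\.?o\.?|box)\b/i;
console.log(withBox.test("box"));            // true
console.log(withBox.test("sandbox"));        // false
```

The trailing \b can only succeed after the last "." when a word character follows it, which is why "p.o. box" is matched without its final period. The word is still detected either way.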


 