
How to match an optional word between two other words?

Let's say I want to match the words "green apple". I also want to match phrases like "green big apple".

How do I write a regular expression for this?

I wrote r"green [a-z+] apple", but this doesn't work.

You were close, but your + is inside the [] instead of outside, so [a-z+] matches exactly one character: a lowercase letter or a literal +. The middle word may also be absent, so wrap the word and one of the spaces in a group followed by ? to match one word or no word. Replace the ? with * to allow any number of middle words.

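import re

# The group "([a-z]+ )" is one lowercase word plus its trailing space;
# the "?" after it makes the whole group optional.
pattern = r"green ([a-z]+ )?apple"
print(re.match(pattern, "green apple").group(0))
print(re.match(pattern, "green big apple").group(0))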

Output:

green apple
green big apple
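To compare the three variants discussed above side by side, here is a minimal sketch. The helper name matches is hypothetical, and using re.fullmatch (so the whole string has to match) is an assumption made here to keep the comparison strict; the answer itself uses re.match.

```python
import re

# The original attempt: [a-z+] matches exactly one character (a-z or a literal "+").
broken = r"green [a-z+] apple"
# One optional middle word, as in the answer above.
optional_word = r"green ([a-z]+ )?apple"
# Any number of middle words: the "?" is replaced with "*".
any_words = r"green ([a-z]+ )*apple"


def matches(pattern, text):
    """Hypothetical helper: True if the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None


for text in ["green apple", "green big apple", "green big red apple"]:
    print(text, matches(broken, text), matches(optional_word, text), matches(any_words, text))
```

The broken pattern only accepts inputs such as "green a apple", where exactly one character sits between the two spaces.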

Pattern

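This pattern matches both examples. Note that .+ accepts any run of one or more characters between "green" and "apple", so it matches more than a single optional word.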

/(green)(.+)(apple)/

The answer to your question is not simple because you need to decide how you want to handle different scenarios. For example:

  • Do you want to capture everything in between the word "green" and "apple"?
  • Are capture groups relevant, or do you simply want to know if the two words occur in the given sequence where "green" comes before "apple"?
  • Is the sequence of said words even important, and do they have to occur in pairs, i.e. for every occurrence of "green" does there have to be one of "apple"?
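Each of the questions above leads to a slightly different check. The following is a rough sketch of one way to approach each scenario in Python; the \b word boundaries, the lazy .*? quantifier, and all variable names are assumptions made for illustration and are not part of the pattern given in this answer.

```python
import re

text = "banana green small blueberry apple orange strawberry"

# Scenario 1: capture whatever sits between the two words.
# The lazy ".*?" stops at the first "apple" after "green".
between = re.search(r"\bgreen\b(.*?)\bapple\b", text)
captured = between.group(1).strip() if between else None

# Scenario 2: only test that "green" occurs somewhere before "apple".
in_order = re.search(r"\bgreen\b.*\bapple\b", text) is not None

# Scenario 3: compare the number of occurrences of each word.
# This checks counts only; it does not check which "green" pairs with which "apple".
green_count = len(re.findall(r"\bgreen\b", text))
apple_count = len(re.findall(r"\bapple\b", text))
paired = green_count == apple_count

print(captured, in_order, paired)
```

Which of these is appropriate depends on the answers to the questions above; none of them is the single right choice.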

Matching example

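Below are examples of text the pattern matches. In each line the match runs from "green" through "apple"; in the last line that is "green small blueberry apple".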

green apple

green big apple

banana green small blueberry apple orange strawberry
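To see what the three capture groups hold for these lines, the pattern can be tried in Python. This is only a sketch: the surrounding slashes are dropped because Python does not use them as delimiters, the variable names are hypothetical, and the fourth input line is an extra example added here to show that .+ is greedy.

```python
import re

# The pattern from this answer, written as a Python raw string.
pattern = re.compile(r"(green)(.+)(apple)")

lines = [
    "green apple",
    "green big apple",
    "banana green small blueberry apple orange strawberry",
    "green apple and green big apple",  # extra line, added to show greediness
]

# Map each input line to its three capture groups.
captures = {line: pattern.search(line).groups() for line in lines}

for line, groups in captures.items():
    print(repr(line), "->", groups)
```

In the last line the middle group swallows " apple and green big " because the greedy .+ extends to the last "apple" it can reach. For "green apple" the middle group is just the single space.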
