Matching an optional substring in the middle of the string with a regex

I'm trying to create a regex for extracting a title, subtitle and publisher. I was wondering how to make the subtitle optional.

My format is:

Title-(Subtitle)-[Publisher]

Where:

  • Title – is a string I want to capture in the 1st capturing group.
  • (Subtitle) – is an optional string surrounded by parentheses I want to capture in the 2nd capturing group.
  • [Publisher] – is a string surrounded by square brackets I want to capture in the 3rd capturing group.

For example:

Programming.in.Python.3-(A.Complete.Introduction.to.the.Python.Language)-[Addison-Wesley]
Learning.Python-[O'Reilly]
Flask.Web.Development-(Developing.Web.Applications.with.Python)-[O'Reilly]

Right now, I have a regex that matches the first and third lines:

(.*)-\((.*)\)-\[(.*)\]

My problem is that I don't know how to construct a regex that also matches the second line, which has no subtitle in parentheses (title in the 1st group, an empty 2nd group, and the publisher in the 3rd group). Can this be done with a single regex?

Make the subtitle part optional by wrapping it, together with its trailing hyphen, in a non-capturing group (?:...) and putting ? after it:

(.*?)-(?:\((.*?)\)-)?\[(.*?)\]
      ^^^          ^^

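I have also replaced .* with .*? so the quantifiers are lazy (non-greedy). Otherwise the greedy first group would swallow the subtitle as part of the title.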
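
To check the pattern against the three sample lines, here is a minimal sketch in Python. The question does not say which regex engine is in use, so Python's re module is an assumption, and the parse helper is a hypothetical name used only for illustration. Note that in Python an optional group that did not take part in the match comes back as None rather than an empty string, so the sketch converts it.

```python
import re

# Pattern from the answer: the subtitle and its trailing hyphen sit inside
# an optional non-capturing group, and every .* is lazy.
PATTERN = re.compile(r"(.*?)-(?:\((.*?)\)-)?\[(.*?)\]")

def parse(line):
    """Hypothetical helper: return (title, subtitle, publisher) or None.

    The subtitle is returned as '' when the line has none.
    """
    m = PATTERN.match(line)
    if m is None:
        return None
    title, subtitle, publisher = m.groups()
    # Group 2 is None when the optional part was skipped.
    return title, subtitle or "", publisher

samples = [
    "Programming.in.Python.3-(A.Complete.Introduction.to.the.Python.Language)-[Addison-Wesley]",
    "Learning.Python-[O'Reilly]",
    "Flask.Web.Development-(Developing.Web.Applications.with.Python)-[O'Reilly]",
]

for line in samples:
    print(parse(line))
```

For the second sample this prints ('Learning.Python', '', "O'Reilly"). Other engines may report the unmatched group differently (for example as an empty string or undefined), so the None handling may need adjusting.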
