
Regex Optional Capture Groups

The goal is to match a string like the following:

hyundai E&C Hillstate (KOR) - Heungkuk life insurance pink spiders (KOR)

Currently the capture groups only partially work: nothing after the first .* in the pattern is captured.

The current regex expression is:

  (hyundai){0,1}\s*(E&C){0,1}\s*(hillstate){0,1}.*(Heungkuk){0,1}.*(invalid){0,1}.*

Please assume that case-insensitive matching is enabled. With the above, the groups come out as follows:

Group #1 Length: 7 hyundai

Group #2 Length: 3 E&C

Group #3 Length: 9 Hillstate

Group #4 Length: 0

Group #5 Length: 0

Any advice would be greatly appreciated.

Other cases would be:

  1. hyundai E&C Hillstate (KOR) v Heungkuk life insurance pink spiders (KOR)
  2. hyundai E&C Hillstate v Heungkuk life insurance pink spiders
  3. hyundai E&C Hillstate - Heungkuk life insurance pink spiders

The problem is that on my end we have something like hyundai E&C Hillstate v Heungkuk, which is then broken up into pieces.

These pieces are then compared against a string provided by a third party, such as hyundai E&C Hillstate (KOR) - Heungkuk life insurance pink spiders (KOR), and the result is recorded as matched or not matched.

Something like this: (hyundai){0,1}\\s*(E&C){0,1}\\s*(hillstate){0,1}\\s*(\\(KOR\\)){0,1}\\s*\\W\\s*(Heungkuk){0,1}(.*)
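
To see why groups 4 and 5 come back empty in the question, and what anchoring on the delimiter changes, here is a minimal Python sketch. Python is only an assumption for illustration (the question does not name a language). The two patterns are copied from the question and from the suggestion above, written with single backslashes in raw strings; the names LINE, ORIGINAL and ANCHORED are made up for the example. The greedy .* in the original pattern swallows the rest of the line, so the optional groups after it are satisfied by matching nothing.

```python
import re

LINE = "hyundai E&C Hillstate (KOR) - Heungkuk life insurance pink spiders (KOR)"

# Pattern from the question: the first greedy .* consumes everything up to
# the end of the line, so (Heungkuk){0,1} and (invalid){0,1} match zero times.
ORIGINAL = r"(hyundai){0,1}\s*(E&C){0,1}\s*(hillstate){0,1}.*(Heungkuk){0,1}.*(invalid){0,1}.*"

# Suggested pattern: explicit tokens and a \W delimiter replace the first .*,
# and only the trailing remainder is left to a (.*) group.
ANCHORED = (
    r"(hyundai){0,1}\s*(E&C){0,1}\s*(hillstate){0,1}\s*"
    r"(\(KOR\)){0,1}\s*\W\s*(Heungkuk){0,1}(.*)"
)

original_groups = re.match(ORIGINAL, LINE, re.IGNORECASE).groups()
anchored_groups = re.match(ANCHORED, LINE, re.IGNORECASE).groups()

print(original_groups)  # groups 4 and 5 are None (did not participate)
print(anchored_groups)  # Heungkuk is captured, the rest lands in the last group
```

One caveat: \W only matches a non-word delimiter such as -, so with this pattern the Heungkuk group appears to stay empty for the v variants listed in the question.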

It seems that what you are looking for is named capture groups. The syntax would be:

(((?<hy>hyundai)|(?<Korea>\(KOR\))|(?<delimiter>(v|-))|(?<heung>Heungkuk)|(?<invalid>\S+?))(\s+|$))+

Inspecting the capture groups can then tell you whether a word was included in the line, and give you its position as well as the name of the group that captured it.

Note that not all your keywords are included in the above.
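
As a rough illustration of inspecting the groups, here is a sketch in Python, which is an assumption on my part since the question does not name a language. Python spells named groups as (?P<name>...) rather than (?<name>...), and it keeps only the last capture of a repeated group, so instead of wrapping the alternation in (...)+ the sketch applies the per-token part of the pattern with re.finditer and reads Match.lastgroup. The inner (v|-) group is folded into the named group so that lastgroup always reports a name. The function name classify_tokens is hypothetical.

```python
import re

# Per-token version of the pattern above, in Python's (?P<name>...) syntax.
TOKEN = re.compile(
    r"(?:(?P<hy>hyundai)|(?P<Korea>\(KOR\))|(?P<delimiter>v|-)"
    r"|(?P<heung>Heungkuk)|(?P<invalid>\S+?))(?:\s+|$)",
    re.IGNORECASE,
)

def classify_tokens(line):
    """Return (group name, matched text, start offset) for each token."""
    result = []
    for m in TOKEN.finditer(line):
        name = m.lastgroup  # name of the named group that matched this token
        result.append((name, m.group(name), m.start(name)))
    return result

line = "hyundai E&C Hillstate (KOR) - Heungkuk life insurance pink spiders (KOR)"
for name, text, start in classify_tokens(line):
    print(name, repr(text), start)
```

Tokens that are not one of the listed keywords fall through to the invalid group, so a keyword counts as present when its group name shows up in the result.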

You could also consider changing (?<hy>hyundai)|(?<Korea>\(KOR\)) to (?<hy>hyundai( (?<hyCountry>\(KOR\)))?) to ensure that the (KOR) token does not occur independently.
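
A small sketch of that nested variant, again in Python as an assumed language. The inner group is named hyCountry here because Python's re module rejects a hyphen in a group name, and the helper team_and_country is hypothetical. The point is only that the country tag can be captured when it directly follows the team keyword, and is not picked up on its own.

```python
import re

# The country tag is optional, but only allowed right after the team keyword.
TEAM = re.compile(r"(?P<hy>hyundai(?: (?P<hyCountry>\(KOR\)))?)", re.IGNORECASE)

def team_and_country(text):
    """Return (team text, country tag or None), or None if no team is found."""
    m = TEAM.search(text)
    if m is None:
        return None
    return m.group("hy"), m.group("hyCountry")

print(team_and_country("hyundai (KOR) v Heungkuk"))
print(team_and_country("hyundai v Heungkuk"))
print(team_and_country("(KOR) v Heungkuk"))
```

In the sample strings from the question the tag follows Hillstate rather than hyundai, so the keyword inside the group would need to be adapted accordingly.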
